James Gardner


MogileFS on Debian Etch

Posted in XML by thejimmyg on the April 10th, 2008

Caution!

Although I’ve got MogileFS working, there was a bit of trial and error involved. I believe what I’ve documented here is the path I took, but I haven’t double-checked it.

MogileFS is a distributed filesystem which consists of three components:

  • SQL database to store filesystem data
  • Storage nodes for holding the actual files
  • Tracker nodes to keep track of the files

http://www.danga.com/mogilefs/usage.bml

You can run the storage daemon and the tracker daemon on the same machine, and can even host the database there too. The whole point of MogileFS is to distribute files across machines, so having just one server wouldn’t make a lot of sense in production, but that is what we’ll do in this post to set up a test environment.
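To make the division of labour between the three components concrete, here is a toy in-memory sketch. It is not how MogileFS is implemented: the class name ToyMogile, the copies parameter and the "first N devices" placement rule are all made up for illustration. It only shows the idea that an index records where copies live, so a file survives the loss of one device.

```python
class ToyMogile:
    """Illustrative model only; names and placement policy are hypothetical."""

    def __init__(self, device_ids):
        # "Database" role: (domain, key) -> list of device ids holding a copy.
        self.index = {}
        # "Storage node" role: one dict per device, (domain, key) -> bytes.
        self.devices = {devid: {} for devid in device_ids}

    def store(self, domain, key, data, copies=2):
        """"Tracker" role: pick devices, write the data, record the locations."""
        chosen = sorted(self.devices)[:copies]  # simplistic placement rule
        for devid in chosen:
            self.devices[devid][(domain, key)] = data
        self.index[(domain, key)] = chosen
        return chosen

    def fetch(self, domain, key):
        """"Tracker" role: look up the locations, read from the first one still present."""
        for devid in self.index[(domain, key)]:
            if (domain, key) in self.devices.get(devid, {}):
                return self.devices[devid][(domain, key)]
        raise KeyError((domain, key))


fs = ToyMogile(device_ids=[1, 2, 3])
fs.store("testdomain", "/etc/motd", b"hello")
del fs.devices[1]  # simulate losing one storage device
print(fs.fetch("testdomain", "/etc/motd"))  # still readable from device 2
```

With only one device there is nowhere to put a second copy, which is why a single-server setup is only useful for testing.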

Install MySQL (otherwise the tests later won’t work):

apt-get install mysql-server-5.0

Install the build tools:

apt-get install build-essential

Get the latest code:

apt-get install subversion
svn co http://code.sixapart.com/svn/mogilefs/trunk mogilefsd

You’ll need some Perl modules from CPAN. You can configure CPAN like this:

perl -MCPAN -e shell

Use the defaults for all the questions until you have to choose a mirror, then select the ones closest to your server.

At the cpan> prompt type this:

install Danga::Socket

You might get this warning. Press enter to choose yes and Sys::Syscall will be built for you:

---- Unsatisfied dependencies detected during [M/MS/MSERGEANT/Danga-Socket-1.59.tar.gz] -----
    Sys::Syscall
Shall I follow them and prepend them to the queue
of modules we are processing right now? [yes]

Next install DBI and Net::Netmask:

install DBI
install Net::Netmask

Now you’ll need Gearman::Client and once again you’ll need to choose to have its dependencies installed:

install Gearman::Client

Finally, install Gearman::Server, Gearman::Client::Async, Perlbal, Mysql and IO::AIO, accepting the dependencies and answering all the questions with the defaults:

install Gearman::Server
install Gearman::Client::Async
install Perlbal
install Mysql
install IO::AIO

When you are done, quit:

cpan> quit

Now try to compile the server components:

cd mogilefsd/server
perl Makefile.PL
make
make test
make install

If you get any messages about missing dependencies when running perl Makefile.PL you’ll need to install them via CPAN as you did with the other modules above.

Now install the Perl client API (although we’ll use the Python version eventually):

cd ../api/perl/MogileFS-Client
perl Makefile.PL
make
make test
make install

cd ../MogileFS-Client-FilePaths
perl Makefile.PL
make
make install

Now install the utils:

cd ../utils
perl Makefile.PL
make
make test
make install

Next you can create the database:

# mysql
mysql> CREATE DATABASE mogilefs;
mysql> GRANT ALL ON mogilefs.* TO 'mogile'@'%' IDENTIFIED BY 'sekrit';
mysql> FLUSH PRIVILEGES;
mysql> quit

Now run the database setup script to add tables to the database:

mogdbsetup --dbhost=localhost --dbname=mogilefs --dbuser=mogile --dbpass=sekrit

Create a folder for the configuration:

mkdir /etc/mogilefs

Then create a file /etc/mogilefs/mogilefsd.conf and add the following information:

db_dsn DBI:mysql:mogilefs:localhost
db_user mogile
db_pass sekrit
conf_port 6001
listener_jobs 5

db_user and db_pass should match the user and password you configured when setting up your database.
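If you want to sanity-check the file before starting the tracker, a small parser is enough. This is only a sketch: the function names are my own, it assumes the whitespace-separated "key value" layout shown above, and treating lines starting with # as comments is an assumption rather than something confirmed against mogilefsd.

```python
REQUIRED_KEYS = ("db_dsn", "db_user", "db_pass")


def parse_tracker_conf(text):
    """Parse whitespace-separated 'key value' lines into a dict."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Assumption: blank lines and '#' comment lines are ignored.
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        key = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        settings[key] = value
    return settings


def missing_keys(settings, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not settings.get(key)]


SAMPLE = """\
db_dsn DBI:mysql:mogilefs:localhost
db_user mogile
db_pass sekrit
conf_port 6001
listener_jobs 5
"""

settings = parse_tracker_conf(SAMPLE)
print(settings["db_dsn"], missing_keys(settings))
```

In practice you would read the text from /etc/mogilefs/mogilefsd.conf instead of the SAMPLE string and refuse to continue if missing_keys returns anything.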

Create a user for the MogileFS tracker daemon because it won’t run as root:

adduser mogile

You can now start the trackers:

# su mogile
$ mogilefsd -c /etc/mogilefs/mogilefsd.conf --daemon
$ exit

You can confirm that the trackers are running with the following command:

# ps aux | grep mogilefsd

If you don’t get a list of running processes the trackers are not running.
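Looking at the process list tells you the daemon exists, but not that it is accepting connections. A rough complementary check is to try a TCP connection to the tracker port. The helper name port_is_open is my own, and the port 6001 is simply the conf_port value from the configuration above.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


# 6001 is the conf_port used in mogilefsd.conf in this post.
print("tracker reachable:", port_is_open("localhost", 6001))
```

This only proves that something is listening on the port, not that it is a healthy tracker.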

Now that the MySQL database is set up and the tracker daemon is running, we need to set up the storage server. Create a folder for the data:

mkdir /var/mogdata

Then create a config file /etc/mogilefs/mogstored.conf:

httplisten=0.0.0.0:7500
mgmtlisten=0.0.0.0:7501
docroot=/var/mogdata

Now register the storage host with the tracker:

# mogadm --trackers=localhost:6001 host add mogilestorage --ip=127.0.0.1 --port=7500 --status=alive

You can confirm that your host(s) were added with the following command:

# mogadm --trackers=localhost:6001 host list
mogilestorage [1]: alive
  IP:       127.0.0.1:7500

Now add a device:

mogadm --trackers=localhost:6001 device add mogilestorage 1

You can list the devices like this:

mogile@vm1:/root$ mogadm --trackers=localhost:6001 device list
mogilestorage [1]: alive
                   used(G) free(G) total(G)
  dev1: alive      0.000   0.000   0.000

You’ll then need to create a folder for that device:

mkdir /var/mogdata/dev1

Make sure it is owned by mogile:

chown mogile:mogile /var/mogdata/dev1
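The folder name follows from the device id: device 1 lives in dev1 under the docroot. If you add several devices, a few lines of Python can create the folders for you. This is a sketch with helper names of my own, and it does not change ownership, so the chown step is still needed.

```python
import os
import tempfile


def device_dir(docroot, devid):
    """Return the folder for a device id, e.g. <docroot>/dev1 for device 1."""
    return os.path.join(docroot, "dev%d" % devid)


def ensure_device_dirs(docroot, devids):
    """Create one folder per device id and return the paths in order."""
    paths = []
    for devid in devids:
        path = device_dir(docroot, devid)
        os.makedirs(path, exist_ok=True)  # no error if it already exists
        paths.append(path)
    return paths


# Demonstrated in a throwaway directory rather than /var/mogdata.
with tempfile.TemporaryDirectory() as docroot:
    created = ensure_device_dirs(docroot, [1, 2])
    print([os.path.basename(p) for p in created])
```

On the real server you would pass /var/mogdata as the docroot and run it as root before the chown.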

Now you can start the storage daemon:

mogstored --daemon

The following command checks all MogileFS components using the tracker on localhost, port 6001:

mogadm --trackers=localhost:6001 check

You can check against multiple trackers by passing them as a comma-separated list, for example --trackers=192.168.42.1:6001,192.168.42.2:6001.
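If you script around mogadm you will end up building and splitting that comma-separated tracker list yourself. A minimal sketch follows; the function names are hypothetical, and falling back to port 6001 when none is given is my own convention based on the conf_port used in this post, not something mogadm does.

```python
def parse_trackers(spec, default_port=6001):
    """Split 'host:port,host:port' into a list of (host, port) tuples."""
    trackers = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate stray commas
        host, sep, port = item.partition(":")
        # default_port is an assumption for entries written without a port.
        trackers.append((host, int(port) if sep else default_port))
    return trackers


def format_trackers(trackers):
    """Join (host, port) tuples back into the --trackers argument format."""
    return ",".join("%s:%d" % (host, port) for host, port in trackers)


pairs = parse_trackers("192.168.42.1:6001, 192.168.42.2:6001")
print(pairs)
print("--trackers=" + format_trackers(pairs))
```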

Now that it is all set up you can try it with some real data. Each file you store has to be in a domain and have a particular class, so first we need to set up a domain and a class within that domain:

mogadm --trackers=localhost:6001 domain add testdomain

Add a class to the domain:

mogadm --trackers=localhost:6001 class add testdomain testclass

For the Python client:

apt-get install python python-pycurl

Then download it:

wget http://www.albany.edu/~ja6447/mogilefs.py

Create test.py:

from mogilefs import Admin, Client

TRACKERS = ['localhost:6001']

def test():
    # Ask the tracker which devices it knows about.
    a = Admin(trackers=TRACKERS)
    print(a.get_devices())

    # Store /etc/motd under the key '/etc/motd' and read it back.
    good = open("/etc/motd").read()
    c = Client(domain='testdomain', trackers=TRACKERS, root='/var/mogdata')
    c.delete('/etc/motd')  # clear any copy left by a previous run
    c.send_file('/etc/motd', '/etc/motd')
    assert c.get_file_data('/etc/motd') == good

    # Rename the key ten times, checking the data survives each rename.
    c.delete('/etc/motd_0')
    c.rename("/etc/motd", "/etc/motd_0")
    for x in range(10):
        c.delete('/etc/motd_%d' % (x+1))
        c.rename("/etc/motd_%d" % x, "/etc/motd_%d" % (x+1))
        data = c.get_file_data('/etc/motd_%d' % (x+1))
        assert data == good

if __name__ == '__main__':
    test()

Then test it with:

python test.py
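If the test fails it can be hard to tell whether the script or the MogileFS setup is at fault. One way to separate the two is to run the same sequence of calls against an in-memory stand-in. The FakeClient below is entirely hypothetical: it only mimics the four method names test.py calls, with the argument order guessed from those calls, and says nothing about how mogilefs.py behaves internally.

```python
import os
import tempfile


class FakeClient:
    """In-memory stand-in for the client; not part of mogilefs.py."""

    def __init__(self):
        self.files = {}  # key -> bytes

    def delete(self, key):
        # Deleting a missing key is not an error, as test.py relies on.
        return self.files.pop(key, None) is not None

    def send_file(self, key, source):
        with open(source, "rb") as handle:
            self.files[key] = handle.read()

    def get_file_data(self, key):
        return self.files[key]

    def rename(self, old_key, new_key):
        self.files[new_key] = self.files.pop(old_key)


def exercise(client, source, key, hops=10):
    """Store a file, then rename key_0 -> key_1 -> ... checking the data each time."""
    with open(source, "rb") as handle:
        good = handle.read()
    client.delete(key)
    client.send_file(key, source)
    assert client.get_file_data(key) == good
    client.delete(key + "_0")
    client.rename(key, key + "_0")
    for x in range(hops):
        client.delete("%s_%d" % (key, x + 1))
        client.rename("%s_%d" % (key, x), "%s_%d" % (key, x + 1))
        assert client.get_file_data("%s_%d" % (key, x + 1)) == good
    return "%s_%d" % (key, hops)


# Use a temporary file so the sketch does not depend on /etc/motd existing.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"message of the day\n")
client = FakeClient()
final_key = exercise(client, tmp.name, "/etc/motd")
os.unlink(tmp.name)
print(final_key, sorted(client.files))
```

If this passes but test.py does not, the problem is more likely in the tracker, storage daemon or database than in the test logic.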

You can also manipulate files from the command line with the mogtool utility, which is installed alongside mogadm.


Good Ingredients, AWS and App Engine

Posted in Python, Web, Debian, EC2, Hosting by thejimmyg on the April 8th, 2008

So, today’s been an interesting day. On Sunday I finally decided against building my future projects on Amazon Web Services, on the basis that I’m more likely to be able to build a faster-responding and more tailored solution to each of my needs with other tools. I decided to investigate MogileFS as an alternative to Amazon S3, and at the same time to start work on a new set of Python tools named "good ingredients": very low-level components using a more generic version of the WSGI spec which also allows services (something I’ve been thinking about recently). I mentioned some of my ideas to some of the Rails community, including one of the core committers, but didn’t get a lot of interest. I’m not too surprised, because I haven’t had a lot of interest from the Pylons community either, and the Rails community is far more in favour of abstractions than the Pylons community; abstractions are precisely what good ingredients were not going to be about.

Anyway, the plan was to put together existing tools that provide the sort of very scalable infrastructure which would otherwise require something like AWS, and to expose a useful low-level web framework stack with the good ingredients.

Then, after all this planning, Google go and release their App Engine, which offers both a scalable infrastructure and a new set of Python modules which seem fairly low-level and support WSGI. Still, the infrastructure isn’t customisable and the modules don’t use anything akin to the modular WSGI services API I am planning, so I think it makes sense for me to carry on with my ideas anyway. Interesting times though.