HOWTO PARALLELIZE MIGRATE
=========================
Peter Beerli
beerli@csit.fsu.edu
[updated August 04 2003]

Contents:
This text describes how you can improve the performance of migrate
when you have more than one locus and more than one computer at your
fingertips. You can parallelize migrate runs
(I)  using a virtual parallel architecture with a message-passing
     interface (MPI), or
(II) by hand.
The by-hand version works but is cumbersome. The MPI version runs fine
on clusters of MacOSX workstations, dedicated clusters of Linux
machines, and AIX parallel machines (Regatta; SP3, SP4).

I. Message passing interface
====================================
(1) Secure as many computers for the analysis as you have loci or
    parameters in your dataset. Make sure that all computers can talk
    to each other. Currently my program will only work if they run a
    flavor of UNIX (e.g. LINUX or MACOSX). Of course, you need an
    account on all the machines.

    - Download LAM from http://www.lam-mpi.org [I use version 7.0].
    - Install LAM on all machines (if this is too complicated for you,
      ask a sysadmin or other guru to help):
          ./configure --without-fc --without-romio --with-rsh=ssh
      [the --with-rsh=ssh is optional; if you are behind a firewall
      you might not need this]
    - Prepare a file lamhosts according to the specs in the LAM
      distribution; the master node needs to be the first machine
      mentioned. My lamhosts looks like this:
          ciguri cpu=2
          zork cpu=1
          nagual cpu=32
    - Make sure that you can access all machines using ssh (I use
      openssh) without the need to specify a password; see
      man ssh-keygen and man ssh. If you have firewalls installed on
      your individual systems, you need to allow the individual
      machines to open/request "random" ports on the other machines.
    - Change into the migrate-1.7/src/ directory, run configure, and
      then use "make mpis-thread" to compile with the pthread library,
      or use "make mpis". You may want to rename the binary because it
      will NOT WORK as a standard, single-process program.

(2) If your machines have no cross-mounted file system, you need to
    make sure that the program is in the same path on EVERY machine,
    e.g. /home/beerli/migrate-test/migrate-n. There are ways around
    this limitation; consult the LAM manual.
    ssh to your slowest machine [the master is not doing much except
    scheduling the workers], then try:
    - recon -v lamhosts
      [if this fails, try to resolve the problems; most often they are
      difficulties connecting]
    - lamboot -v lamhosts
      [if the above command worked, this normally works, too. If you
      run a firewall and forgot about it, this command will fail]

(3) Compile migrate; you need to follow the instructions in README.
    Essentially you need to do
          ./configure
          make mpis
    The configure command sets up the Makefile etc. "make mpis"
    compiles for parallel machines with slow architectures (this
    includes Gigabit switches); even on dedicated parallel machines,
    using mpis seems to be the right choice.

(4) Try to run the following command from the src directory, after you
    have copied an infile into the directory.
    - mpirun -np 7 migrate-n
      [watch in awe that 6 loci can get analyzed at once. The log is
      not very comprehensive because all 7 processes write to the same
      console. There are 7 processes because one master node does only
      scheduling and maximization, while 6 worker nodes do the actual
      tree rearrangements and the likelihood calculations. The number
      you specify has nothing to do with the physical computers; LAM
      can run several nodes on a single CPU.]
    - lamhalt -v
      will stop your parallel machine [on MACOSX you may need to rm
      the temporary directory by hand].
    - Send comments on how it worked for you, and improvements for my
      [currently] too short and confusing guide.

II. BY HAND
=========================================================
(1) Secure as many computers for the analysis as you have loci in your
    dataset.
(2) On one machine prepare a directory with
    - migrate-n
    Run the program once and adjust the run parameters using the menu.
    Use the sumfile option in the (I)nput menu and then save the
    parmfile with the (W)rite parmfile option. Then (Q)uit.
    Edit the created parmfile and check that you can find
          write-sumfile=YES
    then change
          menu=YES
    to
          menu=NO
(3) Copy this directory onto each machine and name the directories e.g.
    locus1 locus2 ..... If you use Appleshare, be careful that you
    also have directories for each locus.
(4) Prepare the infiles, one for each locus. Copy the infiles into the
    directories.
(5) Start migrate-n on all machines.
(6) Once all the migrate-n runs have finished, copy all sumfiles onto
    a single machine; it would be helpful if this is your fastest
    machine with lots of RAM. Be careful not to overwrite individual
    files (they all have the same name, "sumfile").
(7) Concatenate the sumfiles.
(8) The combined sumfile needs hand editing, or you can use the PERL
    script concat-sumfile. If you cannot run the PERL script or want
    to do it by hand, see the example below.
(9) Make a safe copy of the fixed combined sumfile.
(10) Run migrate-n and use option (D)atatype and there (g)enealogy,
    and change other menu items if you want.
(11) Voila, a multilocus outfile in a fraction of the time the program
    needs to run on a single machine.
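
The directory bookkeeping in the by-hand procedure above (one directory
per locus, one infile in each, then collecting the identically named
sumfiles without overwriting them) could be scripted roughly as follows.
This is only a sketch in Python and not part of migrate: the function
names make_locus_dirs and collect_sumfiles are hypothetical, and it
assumes that all directories are reachable from one file system, that
migrate-n reads a file called "infile", and that each run directory ends
up with a file called "sumfile".

```python
import shutil
from pathlib import Path


def make_locus_dirs(template_dir, infiles, dest_root):
    """Copy the prepared directory once per locus and add one infile each.

    template_dir: directory holding migrate-n and the edited parmfile.
    infiles:      one input file per locus, in locus order.
    Returns the created directories (locus1, locus2, ...).
    """
    created = []
    for i, infile in enumerate(infiles, start=1):
        target = Path(dest_root) / f"locus{i}"
        shutil.copytree(template_dir, target)    # fails if target exists
        shutil.copy(infile, target / "infile")   # assumed input file name
        created.append(target)
    return created


def collect_sumfiles(locus_dirs, dest_dir):
    """Copy each run's sumfile to dest_dir under a distinct name.

    All runs write a file named "sumfile", so the copies are renamed
    sumfile.1, sumfile.2, ... to avoid overwriting one another.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    collected = []
    for i, run_dir in enumerate(locus_dirs, start=1):
        out = dest / f"sumfile.{i}"
        shutil.copy(Path(run_dir) / "sumfile", out)
        collected.append(out)
    return collected
```

For example, make_locus_dirs("template", ["a.infile", "b.infile"], "runs")
would create runs/locus1 and runs/locus2 (the file names here are
placeholders). After the runs have finished, collect_sumfiles can gather
the results on one machine as sumfile.1, sumfile.2, and so on.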
=========================================================================
What to edit in a sumfile

(1) The heading of a sumfile needs the first two comment lines:
          # begin genealogy-summary file of migrate 0.9.8 ------
          #
(2) The third line needs editing. The first number is the number of
    loci; for single-locus data it is 1. Change it to the number of
    loci:
          1 3 9 0 1    [before]
          4 3 9 0 1    [after, 4 loci]
(3) Search for #######; you will find lines like the following:
          0 0 ####### locus 0, replicate 0 ################
    The file starts counting with 0, so this line reads locus 1 and
    replicate 1. Leave the first occurrence as it is.
    Go to the end of the file and remove
          # end genealogy-summary file of migrate 0.9.8 ------
(4) Prepare the next sumfile.x for appending to the master sumfile:
    - Remove everything above
          0 0 ######## locus 0, replicate 0 .....
    - Change the number to
          1 0 ######## locus 1, .....
      If you use replicates, you need to change the replicates
      accordingly.
    - Remove the last line [except for the very last sumfile].
(5) Concatenate the above sumfile fragment to the master sumfile.
(6) Go to (4) until done.

======================================================================
Example of a sumfile

# begin genealogy-summary file of migrate 0.9.8 ------
#
1 3 9 0 1
0 0 ####### locus 0, replicate 0 ################
<<<<<<<<<<<<<
$Id: HOWTO-PARALLEL 175 2005-12-02 14:51:23Z beerli $
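
The editing steps listed under "What to edit in a sumfile" can be
sketched in Python as follows. This is a minimal illustration and not
the concat-sumfile PERL script: the function name merge_sumfiles is
hypothetical, and the sketch assumes that each input is a complete
single-locus sumfile, that the loci count is the first number on the
third line, and that replicate numbers stay unchanged while only the
locus index is renumbered.

```python
import re

# Marker line as shown in the text, e.g.
#   0 0 ####### locus 0, replicate 0 ################
MARKER = re.compile(r"^(\s*)(\d+)(\s+\d+\s+#+\s*locus\s+)(\d+)(,.*)$")
END_PREFIX = "# end genealogy-summary file"


def merge_sumfiles(texts):
    """Merge single-locus sumfile texts, given in locus order."""
    n_loci = len(texts)
    merged = []
    end_line = None
    for locus, text in enumerate(texts):
        lines = text.splitlines()
        # Drop the "# end ..." line; it is written back once at the end.
        if lines and lines[-1].startswith(END_PREFIX):
            end_line = lines.pop()
        # Index of the first "locus N, replicate M" marker line
        # (raises StopIteration if the file has no marker).
        first = next(i for i, l in enumerate(lines) if MARKER.match(l))
        if locus == 0:
            head = lines[:first]
            # Third line: replace the first number with the loci count.
            head[2] = re.sub(r"\d+", str(n_loci), head[2], count=1)
            merged.extend(head)
        # For later files everything above the first marker is dropped.
        for line in lines[first:]:
            m = MARKER.match(line)
            if m:
                # Renumber the leading index and the "locus N" text.
                line = f"{m.group(1)}{locus}{m.group(3)}{locus}{m.group(5)}"
            merged.append(line)
    if end_line is not None:
        merged.append(end_line)
    return "\n".join(merged) + "\n"
```

A call such as merge_sumfiles([open(f).read() for f in ["sumfile.1",
"sumfile.2"]]) would return the combined text. Compare the result with
the hand-editing rules before feeding it to migrate-n, and keep the
original sumfiles.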