Difference between revisions of "Running comet on a cluster"

From CometWiki

Revision as of 09:59, 13 April 2009

We highly recommend running 'comet' on a cluster of machines rather than on a single one. The program takes some time to run: an hour or two for early events like symmetry breaking, and overnight if you want to look at later behavior such as motility and pulsatile motion. Running the program concurrently on at least 5 machines (I used 9) lets you test the effect of changing one parameter through a range of values. You can set it going, come back the following day and have the whole thing laid out for you. I've found this very useful, and have included a set of scripts to automate the process, including automatically generating the montage of images seen in the robustness section so you can quickly scan the effect of varying the parameter.


I recommend putting a default cometparams.ini into a main directory for the data, e.g. ~/runs. In that directory, run the varyset script (included with the source):

varyset <parameter> <startval> <endval> <number of steps>

This will create a subdirectory within runs that contains subdirectories numbered 1, 2, 3, etc., each containing a version of the cometparams.ini file with the <parameter> value varying in linear steps between <startval> and <endval>. It will also append the information needed to run the individual comet jobs to ~/joblist.
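To make "linear steps" concrete, here is a minimal Python sketch of how the swept values could be computed. It is not the varyset script itself: the function name linear_steps is hypothetical, and it assumes that <number of steps> means the number of values (and hence numbered subdirectories) with both <startval> and <endval> included. Check the script in the source if the exact endpoints matter to you.

```python
def linear_steps(start, end, count):
    """Return `count` evenly spaced values from start to end, inclusive.

    Assumption: `count` is the number of values generated, not the
    number of intervals between them.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    if count == 1:
        return [start]
    step = (end - start) / (count - 1)
    return [start + i * step for i in range(count)]


# Hypothetical sweep: five runs with a parameter going from 0.0 to 2.0.
# Each value would go into the cometparams.ini of one numbered subdirectory.
for run_number, value in enumerate(linear_steps(0.0, 2.0, 5), start=1):
    print(run_number, value)
```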

If you have access to a cluster with a working job control system, you might want to use that. I had trouble with the job control system on the cluster I was using, and ended up writing my own:

On the head node, I have the

startnewjobs

script running as a cron job. This checks whether the worker nodes are idle (5-minute load average below a certain threshold) and, if they are, starts the next job.
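The idle test can be sketched in a few lines of Python. This is only an illustration of the idea, not the startnewjobs script: it checks the machine it runs on rather than remote worker nodes, the function name is_idle is hypothetical, and the threshold of 0.5 is an arbitrary example value.

```python
import os


def is_idle(threshold, load_averages=None):
    """Return True if the 5-minute load average is below `threshold`.

    `load_averages` is a (1 min, 5 min, 15 min) tuple; if omitted, the
    values for the local machine are read with os.getloadavg(), which
    is available on Unix-like systems only.
    """
    if load_averages is None:
        load_averages = os.getloadavg()
    five_minute = load_averages[1]
    return five_minute < threshold


# Hypothetical threshold of 0.5; pick a value that suits your nodes.
if is_idle(threshold=0.5):
    print("machine looks idle; the next job could be started")
else:
    print("machine is busy; try again on the next cron run")
```

Using the 5-minute figure rather than the 1-minute one avoids starting a job during a brief lull in a run that is still in progress.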


If you don't have access to a cluster, there is a single-computer version of this:

startjobsloop

which will check ~/joblist for new jobs and run them sequentially.
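The sequential behavior can be sketched as follows. This is not the startjobsloop script: the format of ~/joblist is not described here, so the sketch assumes one complete shell command per line, and the function names pop_next_job and run_pending_jobs are hypothetical.

```python
import subprocess
from pathlib import Path


def pop_next_job(joblist):
    """Remove and return the first non-empty line of `joblist`, or None."""
    path = Path(joblist)
    if not path.exists():
        return None
    lines = [line for line in path.read_text().splitlines() if line.strip()]
    if not lines:
        return None
    # Write the remaining jobs back so the queue shrinks as jobs start.
    path.write_text("".join(line + "\n" for line in lines[1:]))
    return lines[0]


def run_pending_jobs(joblist):
    """Run queued jobs one at a time; return their exit codes in order."""
    exit_codes = []
    while True:
        command = pop_next_job(joblist)
        if command is None:
            break
        # Assumption: each line is a complete shell command.
        exit_codes.append(subprocess.run(command, shell=True).returncode)
    return exit_codes
```

Calling run_pending_jobs(Path.home() / "joblist") would work through the queue once. Because the file is re-read before each job, entries added while a job is running are picked up, although the sketch does no locking, so a job appended at the exact moment the file is rewritten could be lost. A loop that keeps watching for new jobs would wrap the call in a sleep-and-retry cycle.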


The script

makematrix

pulls together an image matrix (as seen in the robustness section) to summarize the effect of varying the parameter. The directory name, time, computer and main section of the cometparams.ini file are converted into an image and included on the left-hand side of the summary, to keep track of the details of the run.