This tutorial will guide you through installing FireWorks on a worker node, testing communication with the queue, and submitting some placeholder jobs.
To install FireWorks on the worker, please follow the instructions listed at Installation on a machine.
After installation, the script rocket_launcher_run.py should have been added to your system path. The rocket launcher creates directories on your file system to contain your runs, and also submits jobs to the queue.
You should now be able to run the rocket launcher command as an executable. Type this command into your command prompt (from any directory) to ensure that the script is found:
rocket_launcher_run.py -h
This command should print out more detailed help about the rocket launcher. Take a minute to read it over; it might not all be clear, but we’ll step through some of the rocket launcher features next.
We now want to test interaction of our worker computer with the queue.
Important
If your queuing system does not have built-in support, do not despair! You can use pbs_adapter.py as a reference for coding your own QueueAdapter. Otherwise, you can contact us for help (see Contributing).
Important
The specific format of the JobParameters file might vary with the QueueAdapter, as each queuing system might require slightly different parameters. Refer to your QueueAdapter’s documentation to see how to correctly structure a JobParameters file. For example, the PBSAdapterNERSC defined in pbs_adapter.py accepts many more parameters than those listed in the test JobParameters file.
Important
Ensure that the ‘exe’ parameter in the JobParameters file reads: “echo ‘howdy, your job launched successfully!’ >> howdy.txt”
Important
Make sure the qa_name parameter in the JobParameters file indicates the name of your desired queue adapter.
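To make these notes concrete, a minimal JobParameters file might look roughly like the sketch below. Only the exe and qa_name parameters are drawn from this tutorial; the qa_name value shown is an assumption, and any other keys depend entirely on your QueueAdapter, so consult its documentation rather than treating this as a template.

# illustrative JobParameters sketch -- the exact keys depend on your QueueAdapter
qa_name: pbs_nersc  # assumed value; set this to the name of your desired queue adapter
exe: "echo 'howdy, your job launched successfully!' >> howdy.txt"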
Try submitting a job using the command:
rocket_launcher_run.py <JOB_PARAMETERS_FILE>
where <JOB_PARAMETERS_FILE> is the path to your JobParameters file, e.g. job_params_pbs_nersc.yaml.
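For example, assuming the parameter file is named job_params_pbs_nersc.yaml and sits in your current directory, a run and a quick sanity check might look like the following (the name of the launch directory that the rocket launcher creates will vary):

rocket_launcher_run.py job_params_pbs_nersc.yaml
# once the queued job has run, the launch directory should contain howdy.txt:
cat howdy.txt
# howdy, your job launched successfully!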
If you finished this part of the tutorial successfully, congratulations! You’ve successfully set up a worker node to run FireWorks. You can now continue to test launching jobs in a “rapid-fire” mode.
While launching a single job is nice, it is often more useful to maintain a certain number of jobs in the queue at all times. The rocket launcher provides a “rapid-fire” mode that automates this. This mode requires another part of the QueueAdapter to be working properly, namely the code that determines how many jobs the current user has in the queue.
To test rapid-fire mode, try the following:
Tip
You don’t always have to copy over the JobParameters file. If you’d like, you can keep a single JobParameters file in some known location and just provide the full path to that file when running the rocket_launcher_run.py executable.
Try submitting several jobs using the command:
rocket_launcher_run.py --rapidfire -q 3 <JOB_PARAMETERS_FILE>
where <JOB_PARAMETERS_FILE> is the path to your JobParameters file, e.g. job_params_pbs_nersc.yaml.
This command should have submitted 3 jobs to the queue at once, all within a directory whose name begins with ‘block_’.
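You can verify this with your queuing system’s own tools; for example, on a PBS system something like the following (illustrative only, and scheduler-dependent) lists your queued jobs and the block directory that was created:

qstat -u $USER   # list the current user's jobs (PBS; use your scheduler's equivalent)
ls -d block_*    # the block directory created by rapid-fire mode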
You can maintain a certain number of jobs in the queue indefinitely by setting the number of loops (the -n parameter) to a value much greater than 1:
rocket_launcher_run.py --rapidfire -q 3 -n 100 <JOB_PARAMETERS_FILE>
The command above should maintain 3 jobs in the queue over 100 loops of the rocket launcher. The rocket launcher sleeps for a user-adjustable interval after each loop.
Tip
Read the documentation of the rocket launcher for additional details.
If you’ve completed this tutorial, you’ve successfully set up a worker node that can communicate with the queueing system and submit either a single job or maintain multiple jobs in the queue.
However, so far the jobs have not been very dynamic. The same executable (the one specified in the JobParameters file) has been run for every single job. This is not very useful.
In the next part of the tutorial, we’ll set up a central workflow server and add some jobs to it. Then, we’ll come back to the workers and walk through how to dynamically run the jobs specified by the workflow server.