This tutorial will guide you through FireWorks installation on worker nodes, test communication with the queue, and submit some placeholder jobs.
To install FireWorks on the worker, please follow the instructions listed at Installation on a machine.
After installation, the script rocket_launcher_run.py should have been added to your system path. The rocket launcher creates directories on your file system to contain your runs, and also submits jobs to the queue.
You should now be able to run the rocket launcher command as an executable. Type this command into your command prompt (from any directory) to ensure that the script is found:
rocket_launcher_run.py -h
This command should print out more detailed help about the rocket launcher. Take a minute to read it over; it might not all be clear, but we’ll step through some of the rocket launcher features next.
We now want to test interaction of our worker computer with the queue.
Important
If your queuing system does not have built-in support, do not despair! You might try to use PBSAdapterNERSC as a reference for coding your own PBS adapter. You can also look at the docs for QueueAdapterBase to get an idea for what needs to be implemented. In most cases, adding a new QueueAdapterBase implementation should not be difficult. If you encounter trouble, you can contact us for help (see Contributing).
Important
The specific format of the JobParameters file might vary with the QueueAdapterBase, as each queuing system might require slightly different parameters. Refer to the documentation of your QueueAdapterBase implementation to see how to correctly structure a JobParameters file. For example, the PBSAdapterNERSC accepts many more parameters than those listed in the testing JobParameters file.
Tip
For more about the JobParameters file, look at the JobParameters class.
Important
Ensure that the ‘exe’ parameter in the JobParameters file reads: “echo ‘howdy, your job launched successfully!’ >> howdy.txt”
Important
Make sure the qa_name parameter in the JobParameters file indicates the name of your desired queue adapter.
Try submitting a job using the command:
rocket_launcher_run.py <JOB_PARAMETERS_FILE>
where the <JOB_PARAMETERS_FILE> points to your JobParameters file, e.g. job_params_pbs_nersc.yaml.
If you finished this part of the tutorial successfully, congratulations! You’ve successfully set up a worker node to run FireWorks. You can now continue to test launching jobs in a “rapid-fire” mode.
While launching a single job is nice, a more useful functionality is to maintain a certain number of jobs in the queue. The rocket launcher provides a “rapid-fire” mode that automatically provides this functionality. This mode requires another part of the QueueAdapterBase to be functioning properly, namely the part of the code that determines how many jobs are currently in the queue by the current user.
To test rapid-fire mode, try the following:
Tip
You don’t always have to copy over the JobParameters file. If you’d like, you can keep a single JobParameters file in some known location and just provide the full path to that file when running the rocket_launcher_run.py executable.
Try submitting several jobs using the command:
rocket_launcher_run.py --rapidfire -q 3 <JOB_PARAMETERS_FILE>
where the <JOB_PARAMETERS_FILE> points to your JobParameters file, e.g. job_params_pbs_nersc.yaml.
This method should have submitted 3 jobs to the queue at once, all inside of a directory beginning with the tag ‘block_‘.
You can maintain a certain number of jobs in the queue indefinitely by specifying the number of loops parameter to be much greater than 1 (e.g., the example below sets 100 loops):
rocket_launcher_run.py --rapidfire -q 3 -n 100 <JOB_PARAMETERS_FILE>
The script above should maintain 3 jobs in the queue for 100 loops of the rocket launcher. The rocket launcher will sleep for a user-adjustable time after each loop.
Tip
the documentation of the rocket launcher contains additional details, as well as the built-in help file obtained by running the rocket launcher with the -h option.
If you’ve completed this tutorial, you’ve successfully set up a worker node that can communicate with the queueing system and submit either a single job or maintain multiple jobs in the queue.
However, so far the jobs have not been very dynamic. The same executable (the one specified in the JobParameters file) has been run for every single job. This is not very useful.
In the next part of the tutorial, we’ll set up a central workflow server and add some jobs to it. Then, we’ll come back to the workers and walk through how to dynamically run the jobs specified by the workflow server.