Logging In

Code Block
languagebash
ssh <username>@login.hpc.uams.edu

...

You are now on the HPC login node. From here you can stage your data and jobs to be submitted to the computational nodes in the cluster. You can view the current load of the overall system from the login node with the showq command.
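
For example, a quick look at the overall state of the cluster might look like the commands below. The showq wrapper is assumed to be provided on this system; sinfo is the standard Slurm command and is shown as an alternative.

Code Block
languagebash
# Summary of running and queued jobs across the whole cluster (site-provided wrapper)
showq
# Standard Slurm view of partitions and node states
sinfo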

Submit a Simple Job

While the login node is a relatively powerful server, it should not be used to do any actual work, as that could impede others' ability to use the system. We use Slurm to manage jobs and resources on the cluster. The srun and sbatch programs will be your primary interface for submitting jobs to the cluster. In its simplest form, you can give srun a single command and it will schedule and run it as a job on a compute node. Here we will schedule a single command, lscpu, to run using all of the defaults.

Code Block
languagebash
srun lscpu

The output from this job will print directly to your terminal. This can be useful for very simple commands or testing; however, normally you will submit more complex jobs as a batch file.
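
If you need more than the defaults for one of these quick jobs, srun accepts the same resource options as sbatch. The flags below are standard Slurm options and the values are only an illustration; adjust them (and add a partition if this cluster requires one) to fit your work.

Code Block
languagebash
# Ask for one task with 4 CPUs and 4 GB of memory for the lscpu command
srun --ntasks=1 --cpus-per-task=4 --mem=4G lscpu
# Start an interactive shell on a compute node for testing
srun --pty bash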

Submit a Scripted Job

The sbatch program takes many arguments to control where the job will be scheduled, and it is given a script containing the commands to be run instead of a single command on the command line. We will now create a script which contains both the scheduler arguments and the actual commands to be run.

...

Code Block
languagebash
#!/bin/bash
#SBATCH --mail-user=<YOUR_EMAIL>@uams.edu      #<---- Email address to notify
#SBATCH --mail-type=ALL                        #<---- Job states that trigger an email (begin, end, fail, etc.)
#SBATCH --job-name=CPUinfo                     #<---- Name of this job
                                               #<---- Commands below this point will be run on the assigned node
echo "Hello HPC"
lscpu
echo "Goodbye HPC"

Once this script is created it can be run by passing it to the sbatch program. After this job has finished there will be a file named slurm-######.out in your home directory which will contain the output.

Code Block
languagebash
sbatch cpuinfo.script
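
Once the job has finished, the output file can be read with any pager, for example as below; the numbers in the file name will match the job ID that sbatch printed when the job was submitted.

Code Block
languagebash
less slurm-######.out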

When submitting a script you can also pass arguments on the command line to sbatch. Here we submit the lscpu script again, except this time we ask for a node with a Xeon processor. Compare the outputs of the two jobs, or experiment with the different constraints that can be requested.

Code Block
languagebash
sbatch --constraint=xeon cpuinfo.script
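
Which constraint names are valid depends on how the nodes were configured. One way to list the features each node advertises, using standard sinfo format options, is sketched below.

Code Block
languagebash
# List node names alongside the features/constraints they advertise
sinfo -o "%20N %f"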

Monitoring Jobs

Jobs so far have been quick to run; often, though, you will want to monitor longer-running jobs. Remember that the showq command will display the state of the entire cluster. There are other programs which can help you monitor your own jobs and their state.

Code Block
languagebash
squeue

This program will display your currently queued and running jobs on the cluster. The ST column contains the current status of each job, and the NODELIST column shows which host a running job was assigned to. Knowing the host will allow you to peek in a few ways at what the node is currently doing.
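
A few squeue variations can narrow this view down; these use standard Slurm options and should work as-is.

Code Block
languagebash
# Show only your own jobs
squeue -u $USER
# Long format for a single job, including its node list and time limits
squeue -l -j <jobid>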

...

Code Block
languagebash
pdsh -w <nodename> free -h
pdsh -w <nodename> uptime
pdsh -w <nodename> top -b -n1
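
Slurm also keeps its own view of each node, which can be checked without pdsh; scontrol is a standard Slurm command and its output includes the node's state, CPU load, and memory usage.

Code Block
languagebash
scontrol show node <nodename>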

Installing Software

The HPC has some software packages already installed; however, they will need to be activated using Lmod. You can browse available modules, or search for them and see their descriptions, with these commands.

...

One of the most useful modules is EasyBuild, a build and installation framework designed for HPC systems. Many scientific tool sets can be installed using it, and once they are, they can be activated using the module commands above. However, EasyBuild will always have to be loaded first, before anything installed with it can be loaded; the module spider <search> command will explain this if you forget.

...

languagebash

...