Getting started with HPC

This guide gives you a short overview of the most important aspects of running applications on the HPC systems. For more in-depth information, please refer to the linked documentation.

If you have any questions that you would like to ask in person, you can come to our HPC Café. On every second Tuesday of the month, we offer a short talk about a specific topic, and you can also get a hands-on introduction to our systems. And of course there is coffee and cake!

This guide assumes that you already have an HPC account. If this is not the case, you can get the application form here. Basic usage of the HPC systems is typically free of charge for FAU researchers doing publicly funded research. If you have any questions regarding the application, please contact your local RRZE contact person or the HPC support.

You need to fill out the application form, print it, sign it, and have it stamped with the seal of your chair or institute. Once it is ready, you can drop it off at the RRZE Service Desk or send it via email, fax, or internal mail.

Fill out the following metadata fields if you are applying for the first time:

For the System requirement section, a rough estimate of the resources you expect to need is sufficient. Just tell us how many nodes/CPUs your simulations will need (typical job size), the expected runtime per simulation, and the approximate number of simulation runs you are planning (overall requested computation time). If you are unsure or need help, do not hesitate to contact us. Also note that you have to provide an expiration date, usually the duration of your specific project. The duration can also be coupled to your affiliation (employee, PhD student, etc.).

For the lower part of the form, the following information is required:


Please note that the more detailed project information is only necessary in case a new RRZE customer ID is requested.

HPC Clusters

By default, all clusters run Linux in text mode only. Basic knowledge of file handling, scripting, editing, etc. under Linux is therefore required.

RRZE operates diverse HPC systems that are tailored to different use cases. Choosing the appropriate cluster is therefore essential, even if your account works on most of the systems:

  • single-core or single node (throughput) jobs: Woody and/or TinyEth
  • multi-node MPI-parallel jobs: Emmy (and Meggie)
    access to Meggie is restricted to projects that have already proven efficient resource usage — thus it is not a system for beginners
  • GPU jobs: TinyGPU or Emmy
    most of the nodes in TinyGPU have been financed by individual groups; therefore, access restrictions / throttling policies may apply.
  • large main memory requirement: TinyFat
    the modern Broadwell-based nodes have been financed by an individual group; therefore, access restrictions / throttling policies may apply.

Also see the table on the main HPC page. If you’re unsure about which systems to use, feel free to contact the HPC group.

Connecting to HPC systems

To log into the HPC front ends, you have to connect via an SSH (Secure Shell) client. Windows users can either use the Linux subsystem included in recent Windows 10 versions or a third-party client such as PuTTY or MobaXterm. Under Linux and macOS, native OpenSSH functionality is available. From within the university network, you can connect using the following command:

ssh USERNAME@CLUSTERNAME.rrze.fau.de

In this case, USERNAME is your HPC user name and CLUSTERNAME is the name of the cluster you want to log into, e.g. woody or emmy. If you want to access TinyFat, TinyGPU or TinyEth, you also have to connect to woody.

If you want to access the clusters from outside the university network, you have to connect to the dialog server first:

ssh USERNAME@cshpc.rrze.fau.de

You can then ssh to the cluster front ends from there. As an alternative, you can also use VPN to access the clusters directly.
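If you regularly connect from outside the university network, you can let your SSH client hop over the dialog server automatically. The following sketch of an OpenSSH client configuration (~/.ssh/config) assumes a reasonably recent OpenSSH with ProxyJump support; USERNAME and the chosen front end are placeholders:

# example ~/.ssh/config entries -- adapt USERNAME and the front end name
Host cshpc
    HostName cshpc.rrze.fau.de
    User USERNAME

Host woody
    HostName woody.rrze.fau.de
    User USERNAME
    ProxyJump cshpc

With these entries, ssh woody from your own machine transparently tunnels through the dialog server.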

Working with data

Different file systems are accessible from the clusters. Due to their different properties, some might be more suited for the required task than others. The first three classes of directories are available on all HPC systems:

  • $HOME: standard home directory at login, available under /home/hpc
    • small quota (10 GB) – cannot be increased
    • backup: regular, additional fine-grained snapshots
    • storage of important files only
  • $WORK: general-purpose work directory
    The recommended work directory is $WORK. Its destination may point to different file servers and file systems:

    • $WOODYHOME: available under /home/woody
      • standard quota 200GB
      • no backup
      • can be used for input/output files and for small files
    • $SATURNHOME: available under /home/saturn or /home/titan; both are for shareholders only!
      • group quota according to payment (typically 25+ TB)
      • no backup
      • can be used for input/output files and for small files
  • HSM file system $HPCVAULT: available under /home/vault
    • standard quota of 100 GB for online files and a quota on the number of files/directories
    • backup: regular, additional snapshots
    • mid- to long-term storage of large files; these files may be transparently migrated to offline tape
  • Parallel file systems $FASTTMP:
    • local to emmy/meggie, cannot be accessed from outside these systems
    • no backup, no quota for data volume, but high watermark deletion and limits on the number of files/directories
    • short term storage, only for high performance parallel I/O, no ASCII files

For all file systems, your personal folder is located inside your group directory, for example /home/hpc/GROUPNAME/USERNAME for $HOME. You can also use the environment variables to access the folders directly.
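For example, the following commands use the environment variables instead of hard-coded paths; the project directory name is just an illustration:

$ echo $HOME                  # prints /home/hpc/GROUPNAME/USERNAME
$ cd $WORK                    # change into your work directory
$ mkdir -p $WORK/myproject    # create a directory, e.g. for input/output files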

File system quota

Nearly all file systems impose quotas on the data volume and/or the number of files or directories. These quotas may be set per user or per group. There is a distinction between the hard quota, which is the absolute upper limit that cannot be exceeded, and the soft quota, which can be exceeded temporarily for a certain grace period (7 days). After that time, it turns into a hard quota. You will be notified automatically if you exceed your personal quota on any file system. You can look up your used quota by typing either quota -s or shownicerquota.pl on any cluster front end.

Shareholders can look up information on their group quota on $SATURNHOME in text files available as /home/{saturn,titan}/quota/GROUPNAME.txt.
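For example, on any cluster front end you can check your quota usage like this (GROUPNAME is a placeholder for your group):

$ quota -s                                  # standard quota overview
$ shownicerquota.pl                         # RRZE quota overview script
$ cat /home/saturn/quota/GROUPNAME.txt      # group quota report, shareholders only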

Data transfer

Under Linux and macOS, scp and rsync are the preferred ways to copy data from and to a remote machine. Under Windows, either the Linux subsystem or additional tools like WinSCP can be used.
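For example, the following commands copy data between your local machine and the clusters; the file and directory names are placeholders, and the target path follows the $WOODYHOME scheme described above (from outside the university network, go through the dialog server or use VPN):

$ scp input.dat USERNAME@woody.rrze.fau.de:/home/woody/GROUPNAME/USERNAME/
$ rsync -avz USERNAME@woody.rrze.fau.de:/home/woody/GROUPNAME/USERNAME/results/ ./results/

rsync only transfers files that have changed, which makes it well suited for repeated transfers of large directories.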

Available Software

The standard Linux packages are installed on the cluster front ends. On the compute nodes, usually much less software is available.

The majority of software is provided by RRZE via the modules system. It contains a variety of compilers, libraries, and open-source and commercial software. A module has to be loaded explicitly to become usable. All module commands affect the current shell only. The available modules may differ between the clusters.

The available modules can be listed via module avail. Modules are loaded via module load <modulename> and unloaded via module unload <modulename>. The currently loaded modules are displayed by module list. The module commands can usually be used unmodified in any type of PBS job script.

Some modules cannot be loaded together. In some cases such a conflict is detected automatically during the load command; an error message is then printed and no modifications are made. Modules can also depend on other modules, which are then loaded automatically together with the requested module. As an example, the current Intel compiler module depends on Intel MPI and Intel MKL, which are loaded automatically:

$ module load intel64
$ module list
  Currently Loaded Modulefiles:
  1) intelmpi/2017up04-intel   2) mkl/2017up05   3) intel64/17.0up05


Compiling parallel applications

For compiling your MPI-parallel application, you have to explicitly load the necessary modules. For example, when using the Intel compiler and Intel MPI, just use module load intel64. When gcc should be used, use module load gcc to get the default version of the compiler. In this case, you have to load the desired MPI module manually.

You can then use the wrapper commands mpicc, mpiCC, mpif77, or mpif90 to compile your MPI source code. Prior to running your code, you have to load the same modules as for compiling the program.
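A minimal example with the Intel toolchain might look like this; the source file name is a placeholder:

$ module load intel64
$ mpicc -O2 -o myapp myapp.c     # compile with the MPI compiler wrapper

Before running the program, for example inside a batch job, load the same modules again and start it with the MPI launcher:

$ module load intel64
$ mpirun -n 4 ./myapp            # run with 4 MPI processes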

More details on running parallel applications can be found here.

Running Jobs

The cluster front ends can be used for interactive work like editing input files or compiling your application. The amount of time each of your applications may run is restricted by system limits, e.g., after 1 hour of CPU time your run will be killed. The front ends are shared among all users, so be considerate about which applications you run. Please do not run applications with large computational or memory requirements on the front ends, since this may interfere with the work of other users. MPI-parallel jobs are generally not allowed on the front ends at RRZE.

Batch system

Compute nodes cannot be accessed directly. Compute resources have to be requested through a resource manager, the so-called batch system. All user jobs except short serial test runs must be submitted to the cluster through this batch system. This is done by creating a job script that contains all the commands you want to run as well as the requested resources, such as the number of compute nodes and the runtime. Submitted jobs are routed into a number of queues (depending on the required resources, e.g. runtime) and sorted according to a priority scheme. A job will run as soon as the required resources become available. The output of the job is written into a file in your submit directory.

The older clusters use a software called Torque as the batch system; newer clusters, starting with meggie, use Slurm instead. Unfortunately, there are many differences between those two systems. Please refer to the linked documentation of the two batch systems for details on the required commands and example scripts.
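As a rough illustration, a minimal Slurm job script could look like the sketch below. The resource values and module are placeholders; the exact options, queues, and launch commands for each cluster are described in the linked batch system documentation.

#!/bin/bash -l
#SBATCH --nodes=1                # number of compute nodes (placeholder)
#SBATCH --ntasks-per-node=20     # MPI processes per node (placeholder)
#SBATCH --time=01:00:00          # requested walltime (hh:mm:ss)
#SBATCH --job-name=my_simulation

module load intel64              # load the same modules as used for compiling
srun ./myapp                     # start the MPI application

Such a script would be submitted with sbatch jobscript.sh; on the Torque-based clusters, qsub is used instead.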

It is also possible to submit interactive batch jobs that, when started, open a shell on one of the assigned compute nodes and let you run interactive programs (including X11 applications) there. This is especially useful for testing, or for applications that cannot be run on the front ends due to their higher computational requirements.
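For example, an interactive job can be requested roughly as follows; the options shown here are only sketches, and the exact syntax for each cluster is given in the batch system documentation:

$ salloc --nodes=1 --time=01:00:00                    # Slurm (e.g. meggie): one node for one hour
$ qsub -I -X -l nodes=1:ppn=4,walltime=01:00:00       # Torque: interactive job with X11 forwarding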

Job status

The current status of the clusters can be found here. It also includes a list of running and queued jobs. This information can be useful to assess the current workload of the cluster, which also influences the queuing time of your job. It also shows the Message of the day (MOTD) for each cluster, where changes in the configuration, maintenance times, and other disruptions in service are announced. The MOTD is also visible when you log into a cluster front end.

Good practices

Try to use the appropriate amount of parallelism. Since most workloads are not highly scalable, it is not always better to use more cores for your application. It can be beneficial to run scaling experiments to figure out the “sweet spot” of your application.

Check the results of your job regularly to prevent wasting computational resources. You should also check whether your job actually uses the allocated nodes in the intended way and whether it runs with the expected performance. On meggie and emmy, it is also possible to access performance data of your finished jobs, including e.g. memory usage, floating-point rate, and usage of the (parallel) file system. To review this information here, you need a job-specific AccessKey, which can be found in the output file.

Use the appropriate file system for your calculations. Doing tiny-size, high-frequency I/O on a parallel file system may overload the metadata servers. When data becomes obsolete, delete it, especially on the parallel file systems ($FASTTMP). No quota limitations apply there, but once a certain fill level is reached, a high-watermark deletion will be executed, which affects old files of all users. Data that should be archived should be moved to $HPCVAULT.

If you have a problem with your application that you cannot solve yourself, report it to the HPC support using your FAU mail address. This will immediately open a helpdesk ticket, and someone will get back to you. Please provide as much detail as possible so we know where to look, including user name, cluster name, job ID, file system, time of event, etc.