File systems and I/O

The main filesystem, /fred, provides 19 petabytes of disk space and more than 30 GB/s of aggregate bandwidth. It is a Lustre + ZFS filesystem with high performance, reliability, and redundancy.

There are a few other smaller filesystems in the cluster. Of these, /home is cluster-wide like /fred and $JOBFS is on SSDs in each compute node. /home is a Lustre + ZFS filesystem and $JOBFS is XFS.

Local disks

Each compute node has a local SSD (fast disk) that is accessible from batch jobs. If the I/O patterns in your workflow are inefficient on the usual cluster filesystems, then you should consider using these local disks.

The $JOBFS environment variable in each job points at a per-job directory on the local SSD. Space on the local disk is requested from Slurm with, e.g., #SBATCH --tmp=20GB. The $JOBFS directory is automatically created before each job starts and deleted after the job ends.

A typical workflow that uses local disks is: copy tar files from /fred to $JOBFS, untar them, process the many small files (which requires many IOPS) on the local disk, tar up the output, and copy the results back to /fred.
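A minimal sketch of such a job script, assuming a hypothetical input tarball and processing step (inputs.tar and process_data.sh are placeholders for your own data and code):

#!/bin/bash
#SBATCH --job-name=jobfs-example
#SBATCH --time=02:00:00
#SBATCH --tmp=20GB                       # request 20 GB of local SSD space for $JOBFS

# Copy the input tarball from /fred to the per-job local SSD directory
cp /fred/<project_id>/inputs.tar $JOBFS/
cd $JOBFS

# Untar and process the many small files on the local disk (high IOPS is fine here)
tar -xf inputs.tar
./process_data.sh                        # placeholder for your actual processing

# Bundle the output and copy it back to /fred in one large transfer
tar -cf results.tar output/
cp results.tar /fred/<project_id>/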

Lustre

On OzSTAR, the two main cluster-wide directories are:

/home/<username>

and

/fred/<project_id>

/fred is vastly bigger and faster in (almost) every way than /home, but it is NOT backed up. /fred is meant for large data files and most jobs should be reading and writing to it.

Warning

Since /fred is not backed up, it is unsuitable for long term data storage. Data loss due to hardware failure is unlikely, but possible. Accidentally deleted files cannot be recovered.

The user is responsible for backing up any important data to an external location.

/home is backed up. /home is intended for small source files and important scripts and input.

Typically project leaders will create directories for each of their members inside /fred/<project_id> (e.g. /fred/<project_id>/<username>). If a user is a member of multiple projects then they will have access to multiple areas in /fred.

Quota

Disk quotas are enabled on OzSTAR.

/home has a per-user limit of 10 GB (blocks) and 100,000 files. This cannot be increased. These limits are set so that backups remain feasible.

/fred has a default per-project limit of 10 TB (blocks) and 1 million files. If you require additional storage, please contact hpc-support@swin.edu.au.

Note

Because of filesystem compression (see below), it is common to store more data than this and remain under quota. This is because quotas count only actual blocks used on disk.

To check your quota on any node, type:

quota

Transparent Compression

The /fred and /home filesystems have transparent compression turned on. This means that all files (regardless of type) are internally compressed by the filesystem before being stored on disk, and are automatically uncompressed by the filesystem as they are read.

Note

This means that you will not save disk space if you gzip your files, because they are already compressed.

For example, a highly compressible file like a typical ASCII input file:

% ls -ls input
1 -rw-r--r-- 1 root root 39894 Mar  5 17:36 input

the file is ~40 KB in size, but it uses only about 1 KB on disk (the first column).

For a typical binary output file that is somewhat compressible:

% ls -lsh output
7.8M -rw-rw-r-- 1 root root 33M Feb 20 18:25 output

so 7.8 MB of disk space is used to store this 33 MB file.

This means that although the nominal capacity of /fred is ~5 PiB (output from df is somewhat pessimistic), the amount of data that can be stored is greater. How much greater depends upon the compressibility of typical data.

Note

If you are transferring files over the network to other machines, then it still makes sense to compress them (gzip, bzip2, xz, etc.) in order to minimise both the network bandwidth and the disk space used at the other end. Internally to the supercomputer, however, there is no need to compress data. Doing so is fine, but unnecessary; it will just result in the data being slightly larger because it is compressed twice by slightly different algorithms.
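For example, to bundle and compress a results directory into a single file before sending it to another machine (the directory name and destination are illustrative placeholders):

% tar -czf results.tar.gz /fred/<project_id>/results
% rsync -av results.tar.gz username@remote.example.org:/destination/path/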

Hints for Optimising Lustre I/O

The best way to get good performance from any filesystem (not just Lustre) is to do reads and writes in large chunks and to use large files rather than many small files. Workloads that generate a high rate of IOPS (input/output operations per second) should be avoided.

What is “large”?

For Lustre, reads and writes of >4 MB are good; 10 MB and above is best. Small (e.g. <100 kB) reads and writes, and especially 4 kB random writes, should be avoided; these cause seeking and get low I/O performance from the underlying RAID arrays and disks. Small sequential reads are often optimised by read-ahead in the block devices, Lustre, or ZFS, so they may perform acceptably, but do not rely on this.
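As a simple illustration with dd (the file names are placeholders), the same amount of data moved in 4 MB chunks generates roughly a thousand times fewer I/O operations than in 4 kB chunks:

% dd if=/fred/<project_id>/bigfile of=$JOBFS/bigfile bs=4M    # large 4 MB reads and writes - good
% dd if=/fred/<project_id>/bigfile of=$JOBFS/bigfile bs=4k    # 4 kB operations - avoid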

The best way to get high I/O performance from large parallel codes (e.g. a checkpoint) is generally to read or write one large O(GB) file per process, or, if the number of processes is very large, one large file per node. This sends I/O to all or many of the Object Storage Targets (OSTs) that make up the /fred filesystem and can run at over 30 GB/s (not including speed-ups from transparent filesystem compression).

Each of the individual OSTs in the /fred filesystem is composed of four large raidz3 vdevs in a ZFS pool and is capable of several GB/s. Because each OST is large and fast, there is no advantage in using Lustre striping.

What are IOPS?

IOPS are Input/Output Operations Per Second. I/O operations are things like open, close, seek, read, write, stat, etc. IOPS is the rate at which these occur.

High IOPS and small files tend to go hand in hand. Optimal file sizes on the cluster filesystems are usually between 10 MB and 100 GB. With files smaller than 10 MB, the I/O time risks being dominated by open()/close() operations (IOPS), of which only a limited number are available to the entire filesystem.

A pathologically bad usage pattern is a code that accesses 100,000 files in a row, each less than 8 kB in size. This will perform extremely badly on anything except a local disk and is not an efficient usage model for a large shared supercomputer filesystem (see the Local disks section above). Similarly, a code that calls open()/close() in a tight inner loop will be dominated by the metadata operations sent to the Lustre MDSs (MetaData Servers), will perform badly, and will also impact all other users of the cluster, because the MDS is a shared resource and can only do a finite number of operations per second (approximately 100k).
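A shell-level illustration of the open()/close() problem: appending inside the loop reopens the output file on every iteration, while redirecting the whole loop opens it once (process_one is a hypothetical command):

# Bad: results.txt is opened and closed once per input file
for f in data/*; do process_one "$f" >> results.txt; done

# Better: results.txt is opened once for the entire loop
for f in data/*; do process_one "$f"; done > results.txt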

Other things to avoid

File lock bouncing is another issue that can affect POSIX parallel filesystems. It typically occurs when multiple nodes append to the same shared “log” file. By its very nature, the order of the contents of such a file is undefined - it is really a “junk” file. However, Lustre will valiantly attempt to interleave the I/O from each appending node at the exact moment it writes, leading to a vast amount of “write lock bouncing” between all the appending nodes. This kills the performance of all the appending processes and of other I/O from those nodes, and greatly increases the load on the MDS. Do not append to shared files from multiple nodes.

In general, a good rule of thumb is not to write to the same file from multiple nodes at all, unless it is done via a library such as MPI-IO.
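For plain log output from a multi-node job, one simple alternative (assuming a Slurm batch job; my_program is a placeholder) is to give each task or node its own file instead of appending to a shared one:

# One output file per task, using srun filename patterns (%j = job ID, %t = task rank)
srun --output=log.%j.%t ./my_program

# Or, from a script running on each node, write to a per-node file
./my_program > log.$(hostname) 2>&1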