
This Quick Reference sheet is for people already familiar with the HPCC system at MSU. It focuses on settings specific to the MSU HPCC, to help users who work on many different HPC systems keep track of the differences. Many users also find the following guide useful to print out and keep on their desk:

20131220_userguide.pdf


HPCC Related URLs

Topic                                 | URL
--------------------------------------|-----------------------------------------
Regular SSH login                     | gateway.hpcc.msu.edu
Eval-node regular login               | eval.hpcc.msu.edu
Remote Desktop Gateway (campus-only)  | rdp.hpcc.msu.edu
Wiki                                  | http://wiki.hpcc.msu.edu
Contact/help form                     | http://www.hpcc.msu.edu/contact
System monitoring                     | https://wiki.hpcc.msu.edu/display/stats
Public website                        | http://www.hpcc.msu.edu


Developer Nodes

From the gateway (ssh gateway.hpcc.msu.edu) you can log into a developer node with ssh <node name>:

Node Name        | CPUs | Memory | Notes
-----------------|------|--------|---------------------------------------------
dev-intel14      | 20   | 256GB  | Large-memory intel14 node
dev-intel14-k20  | 20   | 128GB  | Two Nvidia K20 GPUs
dev-intel14-phi  | 20   | 128GB  | Two Xeon Phi accelerators
dev-intel16      | 28   | 128GB  | Two 2.4GHz 14-core Intel Xeon E5-2680v4
dev-intel16-k80  | 28   | 256GB  | Intel16 node with four Nvidia Tesla K80 GPUs
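The two-hop login can be shortened with an SSH client configuration. A minimal ~/.ssh/config sketch, assuming OpenSSH 7.3 or newer for ProxyJump; the alias names are arbitrary, and <netid> stands for your own NetID:

```
# Hop 1: the gateway itself
Host hpcc-gateway
    HostName gateway.hpcc.msu.edu
    User <netid>

# Hop 2: a developer node, reached through the gateway in one command
Host dev-intel16
    ProxyJump hpcc-gateway
    User <netid>
```

With this in place, `ssh dev-intel16` performs both hops automatically.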

Cluster Hardware

The following hardware is available on the HPCC main cluster.

Name        | Total available nodes | Maximum ppn (cores)               | Total available cores | Max memory per node                    | Max local file disk (shared) | Node name prefix
------------|-----------------------|-----------------------------------|-----------------------|----------------------------------------|------------------------------|-----------------------------------
intel11     | 5                     | 32-64                             | 160                   | 0.5-2TB                                | 240GB                        | ifi-###
intel14     | 220                   | 20 (plus accelerators)            | 4400                  | 250GB                                  | 450GB                        | css-###, csm-###, csn-###, csp-###
intel14-xl  | 5                     | 48-96                             | 288                   | 1TB-6TB                                | 800GB                        | qml-###
intel16     | 320                   | 28 (CPU)                          | 8960                  | 128GB (x290), 256GB (x24), 512GB (x6)  | 240GB                        | lac-###
intel16-k80 | 50                    | 28 (CPU), 19968 stream processors | 1400                  | 256GB                                  | 240GB                        | lac-###

File Spaces

Path                                                             | Description
-----------------------------------------------------------------|-----------------------------------------------------
/mnt/home/$USER                                                  | User home directory
/mnt/research/$GROUP                                             | Research group home directory
/mnt/ls15/scratch/users/$USER or /mnt/ls15/scratch/groups/$GROUP | Scratch space for fast shared file access
$TMPDIR                                                          | Temporary local scratch space allocated to each job
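Inside a job script, $TMPDIR is typically used to stage data onto fast local disk and copy results back before the job ends. A minimal sketch; the file names are hypothetical, and the fallback exists only so the logic can be run outside a job:

```shell
#!/bin/bash
# Stage work through the job's local scratch space.
# Outside a PBS job $TMPDIR may be unset, so fall back to a fresh
# temporary directory purely for illustration.
SCRATCH=${TMPDIR:-$(mktemp -d)}

# Place an input file on fast local disk (input.dat is a hypothetical name).
echo "example data" > "$SCRATCH/input.dat"

# ... run the computation against $SCRATCH/input.dat here ...

# Copy results back before the job ends: $TMPDIR is deleted automatically.
cp "$SCRATCH/input.dat" result.dat
echo "staged via $SCRATCH"
```

Staging through local disk avoids hammering the shared scratch filesystem with many small I/O operations.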

General Script Options:

example.qsub:

#!/bin/bash --login
# Time the job will take to execute (HH:MM:SS format)
#PBS -l walltime=00:00:10
# Memory needed by the job
#PBS -l mem=10mb
# Number of nodes required and the number of processors per node
#PBS -l nodes=3:ppn=2
# Merge the output and error streams into a single file
#PBS -j oe
# Send an email when the job is aborted, begins, or ends
#PBS -m abe
# Give the job a name
#PBS -N jobname
# Request temporary local disk space for this job
#PBS -l file=10gb
# Request a job array
#PBS -t 1-2

cd ${PBS_O_WORKDIR}       # Change to the original working directory
cat ${PBS_NODEFILE}       # Print the contents of the PBS node file
env | grep PBS            # Print the current job's PBS environment variables
qstat -f ${PBS_JOBID}     # Print final statistics about resource use before the job exits

Common Scheduler Commands:

Command                      | Description
-----------------------------|----------------------------------------------
qsub <scriptname>            | Submit a submission script to the scheduler.
showq -u $USER               | Show the current user's jobs.
qdel <jobid>                 | Delete a job from the queue.
qdel $(qselect -u username)  | Delete all of a user's jobs from the queue.
showstats                    | Show the current system utilization status.
checkjob -v <jobid>          | Check the details of a particular job.
showstart -e all <jobid>     | Show estimated start times for a job.
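Since `qdel $(qselect -u username)` removes every job at once, a dry-run wrapper can help avoid accidents. A sketch; the fallback job IDs are fake placeholders so the loop can be exercised off-cluster, where qselect does not exist:

```shell
#!/bin/bash
# List the jobs that would be deleted before actually running qdel.
# If qselect is unavailable (e.g. off-cluster), use fake placeholder IDs.
if command -v qselect >/dev/null 2>&1; then
    jobs=$(qselect -u "$USER")
else
    jobs="12345 12346"   # illustrative placeholder job IDs only
fi

for id in $jobs; do
    echo "would delete job $id"   # once verified, swap echo for: qdel "$id"
done
```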

Common Module Commands:

Command                      | Description
-----------------------------|------------------------------------
module avail                 | Show currently available modules.
module list                  | List currently loaded modules.
module show <module name>    | Show what a module changes.
module unload <module name>  | Unload a currently loaded module.
module load <module name>    | Load an available module.

Commonly used PBS environment variables

Variable Name  | Description
---------------|-----------------------------------------------------------------------------
HOST           | Name of the computer currently running the script. This should be one of the nodes listed in PBS_NODEFILE.
USER           | User name (NetID). Useful for dynamically generating a directory on scratch space.
PBS_O_WORKDIR  | Directory where the qsub command was executed. Useful with the cd (change directory) command to change to your working directory.
TMPDIR         | Local temporary disk storage unique to each node and each job. This directory is automatically created at the beginning of the job and deleted at the end of the job.
PBS_JOBID      | Job ID number given to this job. Used by many job monitoring programs such as qstat, showstart, and dque.
PBS_JOBNAME    | Name of the job. Set with the -N option in the PBS script (or on the command line); defaults to the name of the PBS script.
PBS_NODEFILE   | Name of the file containing the list of hosts provided for the job.
PBS_ARRAYID    | Array ID number for jobs submitted with the -t flag. For example, a job submitted with "#PBS -t 1-8" runs eight identical copies of the script, with PBS_ARRAYID set to an integer between 1 and 8.
PBS_VNODENUM   | Used with pbsdsh to determine the task number of each processor. For more information see http://www.ep.ph.bham.ac.uk/general/support/torquepbsdsh.html.
PBS_O_PATH     | Original PBS path. Used with pbsdsh.
PBS_NUM_PPN    | Number of processors per node requested by the current job (useful for hybrid code).
PBS_NP         | Total number of cores requested (nodes * ppn).
PBS_O_HOST     | Host on which the qsub command was executed.
PBS_NUM_NODES  | Number of requested nodes.
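These variables are only set inside a running job, so a script that uses them can supply defaults for local testing. A sketch with illustrative defaults (the ":-" fallbacks are not set by PBS):

```shell
#!/bin/bash
# Print the PBS variables a job script most often needs.
# The ":-" defaults are illustrative only, so this also runs outside a job.
echo "job id:   ${PBS_JOBID:-none}"
echo "job name: ${PBS_JOBNAME:-interactive}"
echo "workdir:  ${PBS_O_WORKDIR:-$PWD}"

# PBS_NP should equal PBS_NUM_NODES * PBS_NUM_PPN
TOTAL=$(( ${PBS_NUM_NODES:-1} * ${PBS_NUM_PPN:-1} ))
echo "total cores: $TOTAL"
```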

Getting Help

First, use the search function of this wiki or explore the table of contents on the left (each section expands when clicked). Second, complete our contact form to ask a question: https://contact.icer.msu.edu. Finally, visit us during our open office hours, Mondays 1-2pm or Thursdays 1-2pm, in the MSU Biomedical & Physical Sciences Building, Room 1455A. For information on the current state of the system see https://icer.msu.edu/service-status