Tutorials
Pete Tutorials
New Users
New users are strongly encouraged to read the Pete User's Manual. It explains how to use Pete and includes a step-by-step
tutorial for editing and transferring files, as well as submitting jobs to the queue.
An online version of the Pete tutorial can be found here.
A PDF of the Pete User's Manual can be obtained here: Pete User's Manual
TIGER Tutorials
All information regarding the use of the TIGER Research Cloud, including interfacing with Windows and Linux, and transferring files using Globus, can be found here.
Frequently Asked Questions
Accounts
- Am I allowed to have an account on OSU HPCC resources?
If you are a student, faculty member, or staff member of Oklahoma State University, you are entitled to an HPCC account; simply fill out an HPCC Account Request form. Other researchers throughout the state of Oklahoma, and those affiliated with research or academic ventures underway at OSU, may also be eligible for an account. If you have any questions about your eligibility, please contact hpcc@okstate.edu.
- I forgot my password. Can you tell me what it is?
We are not able to tell you your current password, but we can reset it and text a new temporary password to the mobile phone number on record. (NOTE: passwords are never emailed. For security purposes, they are only sent separately to a mobile phone number or provided to you in person.)
- Why was my account deactivated or suspended? How can I reactivate it?
Your account may be suspended if you have not accessed HPCC resources for six months, or if suspension becomes necessary to protect the security of the HPC system. To request reactivation, please send an email to hpcc@okstate.edu that includes your account name and a description of the issues you experienced.
- Can I share my account with my colleague(s), student(s), friend(s), neighbor(s), etc?
Sharing of a personal account is not allowed on OSU or OSU HPCC resources.
- I failed several login attempts from an off-campus computer and I can no longer connect. How can I regain access?
Please send an email to hpcc@okstate.edu and request your account be reactivated.
Storage
- How much space do I have in my /home/username directory?
Each user begins with a default 1 GB quota on their /home/username directory unless 10 GB is requested. 10 GB is the maximum space allowed in /home per user. Other storage offerings are available; please contact hpcc@okstate.edu with questions.
- How can I determine how much space I’ve used?
Type `du -sh ~` at the home directory prompt.
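To see where the space is going, the same `du` command can be pointed at each top-level subdirectory. The following is a minimal bash sketch, assuming GNU `du` and `sort` (standard on most Linux systems). The function names and the `TARGET_DIR` variable are hypothetical, and the 1 GB figure is simply the default quota mentioned above.

```shell
#!/bin/bash
# Hypothetical helpers; the names are illustrative, not part of any HPCC tooling.

# Total size of a directory in KB (-s: summary only, -k: 1 KB blocks).
usage_kb() {
    du -sk "$1" 2>/dev/null | cut -f1
}

# The N largest entries directly under a directory, sorted by size ascending.
# The directory's own total is included in the listing.
largest_dirs() {
    du -k --max-depth=1 "$1" 2>/dev/null | sort -n | tail -n "${2:-10}"
}

target="${TARGET_DIR:-$HOME}"
quota_kb=$((1024 * 1024))        # assumed 1 GB default quota
used_kb=$(usage_kb "$target")

echo "Used ${used_kb} KB of ${quota_kb} KB in ${target}"
largest_dirs "$target" 5
```

If you requested the larger quota, change `quota_kb` accordingly.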
- Is my data guaranteed to be backed up?
The file systems are very reliable; however, data can be lost or damaged due to media failures, software bugs, hardware failures, or other problems. IT IS YOUR RESPONSIBILITY to back up critical files. If you need archival storage, PetaStore is available here in Oklahoma.
- How much space can I use in /scratch?
The /scratch space is not limited by quotas, but it is a resource shared among all users. You should only store current work and files in your scratch directory. You may be asked to move or delete data when total scratch usage is high. If you need archival storage, PetaStore is available here in Oklahoma.
- I need to transfer numerous large files to/from another location. What is the best way to do this?
There are several options for transferring large files outlined HERE.
Queuing/Scheduler
- What are the differences in the queues?
  - batch: The 'batch' queue is the default queue. The walltime limit is 120 hours (120:00:00).
  - express: The 'express' queue is for short jobs and for debugging/testing scripts. The walltime limit is one hour (1:00:00).
  - long: The 'long' queue is for long-running jobs that are unable to use a checkpoint/restart feature. The walltime limit is 21 days, or 504 hours (21-00:00:00 or 504:00:00). Jobs in this queue are subject to being killed at the discretion of HPCC administrators for hardware and software issues.
  - bigmem: The 'bigmem' queue directs jobs to one of a dozen large-memory nodes that each have 768 GB of RAM. The walltime limit is 7 days, or 168 hours (7-00:00:00 or 168:00:00).
  - supermem: The 'supermem' queue consists of one node with 1.5 TB of memory. Users must have a demonstrated need to use this queue, and should not attempt to use it unless HPC staff have verified that the 'bigmem' queue is not sufficient.
  - bullet: The 'bullet' queue consists of 10 nodes, each with two NVIDIA Quadro RTX 6000 GPUs, for a total of 20 GPUs. The walltime limit is 5 days, or 120 hours.
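The two notations used for the walltime limits above (D-HH:MM:SS and HH:MM:SS) describe the same durations. As a quick check, here is a small bash sketch that converts either form to seconds so the values can be compared. The `walltime_seconds` function is a hypothetical helper written for this illustration, not a scheduler command.

```shell
#!/bin/bash
# Hypothetical helper: convert D-HH:MM:SS or HH:MM:SS to whole seconds.
walltime_seconds() {
    local spec="$1" days=0 h m s
    # Split off an optional leading "D-" day count.
    if [[ "$spec" == *-* ]]; then
        days="${spec%%-*}"
        spec="${spec#*-}"
    fi
    IFS=: read -r h m s <<< "$spec"
    # 10# forces base 10 so fields such as "08" are not parsed as octal.
    echo $(( (10#$days * 24 + 10#$h) * 3600 + 10#$m * 60 + 10#$s ))
}

# bigmem limit: 7 days expressed in hours
echo "$(( $(walltime_seconds 7-00:00:00) / 3600 )) hours"    # prints "168 hours"
# long limit: 21 days expressed in hours
echo "$(( $(walltime_seconds 21-00:00:00) / 3600 )) hours"   # prints "504 hours"
```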
- Does it matter if I estimate my walltime accurately?
Yes. If you specify an excessively long walltime, your job may wait in the queue longer than necessary. The scheduler may allow your job to start ahead of certain MPI jobs if its walltime is short enough, so please estimate your walltime as accurately as you can.
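One way to calibrate estimates is to compare what past jobs requested with what they actually used. The sketch below assumes the scheduler is Slurm (as the `#SBATCH` directive elsewhere in this FAQ suggests), that its `sacct` accounting command is available to users on the login node, and that GNU `date` is installed. The function name is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: compare requested vs. used walltime for recent jobs.
# Assumes Slurm accounting (sacct) is enabled and GNU date is available.
show_walltime_use() {
    local days="${1:-7}"
    local since
    since=$(date -d "${days} days ago" +%Y-%m-%d)
    # -X: one line per job (no job steps)
    # Elapsed = time actually used, Timelimit = time requested
    sacct -u "$USER" -S "$since" -X \
          --format=JobID,Partition,State,Elapsed,Timelimit
}

if command -v sacct >/dev/null 2>&1; then
    show_walltime_use 7
else
    echo "sacct not found; run this on a cluster login node."
fi
```

If Elapsed is consistently far below Timelimit, a shorter request is likely safe for similar jobs.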
- How can I extend the walltime for a job that’s already running?
Email hpcc@okstate.edu and request a walltime extension.
- Do I get a node to myself when I submit a job?
It depends. When your job runs on the node, no other users are able to submit jobs to the same node if all the cores are reserved by your job(s). However, if other users' jobs can fit on the node with your jobs, then they will share the resources. Note that the nodes have controls in place to keep multiple users' jobs from competing for memory/CPU resources.
- My job only needs to run for a few minutes, but it has been in the queue all day and hasn't started yet! How can I get the job to start?
If your job only needs to run for a few minutes to an hour, specify "#SBATCH -p express" in your submission script. The walltime must be one hour or less. Jobs in the express queue usually start within a few seconds to a few hours. Note that we will not extend the walltime of jobs submitted to this queue.
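A minimal submission script for the express queue might look like the sketch below. Only `-p express` and the one-hour limit come from the answer above; the job name, node and task counts, and the command being run are placeholders to replace with your own.

```shell
#!/bin/bash
# Express queue, as described above.
#SBATCH -p express
# Requested walltime; must be 1:00:00 or less for this queue.
#SBATCH -t 0:30:00
# Placeholder job name.
#SBATCH -J quick_test
# One node and one task (assumptions for a small test job).
#SBATCH -N 1
#SBATCH -n 1

# SLURM_JOB_ID is set by the scheduler; fall back to a label when run by hand.
job_id="${SLURM_JOB_ID:-not-in-slurm}"
echo "Job ${job_id} started on $(hostname) at $(date)"

# Replace this line with the actual short-running command.
echo "Hello from the express queue"
```

Saved under a name of your choosing, for example `express_test.sh` (a hypothetical filename), it would be submitted with `sbatch express_test.sh`.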
- I’ve been submitting numerous jobs to the cluster over the last several weeks and I’ve noticed my jobs are beginning to start behind other users’ jobs. Why is this happening?
The scheduler keeps track of how much walltime each user has consumed over the course of several weeks, and it begins to give lower priority to the 'heaviest' users who have run a lot of jobs. This helps ensure that new and less active users have an opportunity to use our free resources. All jobs will start normally if nodes are available and no other jobs are waiting in the queue.
- I’ve submitted hundreds of jobs to the cluster at once and many of them have not started and the "Reason" is "QOSMaxCpuPerUserLimit". Do I need to kill those jobs and resubmit them?
No. The scheduler limits how many CPU cores one user's running jobs can occupy at any given time, which in turn limits how many of your jobs can run at once. The excess jobs stay pending with the reason 'QOSMaxCpuPerUserLimit' until some of your running jobs finish and free up cores. They will eventually start on their own without any need for intervention.
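To see why your pending jobs are waiting without scanning the full queue listing, the reasons can be tallied. This sketch assumes Slurm's `squeue` command is on your path; `count_by_reason` is a hypothetical helper.

```shell
#!/bin/bash
# Hypothetical helper: tally identical lines on stdin, most frequent first.
count_by_reason() {
    sort | uniq -c | sort -rn
}

if command -v squeue >/dev/null 2>&1; then
    # -h: no header, -t PENDING: pending jobs only, -o "%r": print only the reason
    squeue -h -u "$USER" -t PENDING -o "%r" | count_by_reason
else
    echo "squeue not found; run this on a cluster login node."
fi
```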
- Why do jobs with large amounts of cores seem to ‘cut in line’ to the front of the queue?
Jobs that request many cores are typically MPI jobs, which often require numerous nodes. MPI jobs have the ability to ‘partially reserve’ nodes as they become free. The scheduler will still run, or “backfill,” some jobs onto these nodes while the MPI job is waiting for more resources to become available. A job will typically backfill as long as its specified walltime is short enough that it can start and finish before the MPI job is scheduled to have enough nodes. This functionality becomes more efficient when all users estimate walltimes accurately.
Software
- I need to use Software XYZ, can I install it myself?
We encourage all users to install their own software if possible. We expect users to work with the software developers, online tutorials, or external user forums when performing self-installs. Assisted software installations or any type of code or software troubleshooting is generally outside OSU HPCC's support realm; the software developer is likely a better source of information.
- I need to use Software XYZ, will you install it for me?
We are often backlogged with numerous requests, but in some cases we may install open-source software that multiple users can or will use. Please email hpcc@okstate.edu with questions about software installs.
- Does graphical software run on the supercomputer?
Yes, but it is usually not ideal: the graphics are often choppy or unresponsive. The TIGER Research Cloud is the appropriate system for graphical software, especially Windows-based software.
Please contact hpcc@okstate.edu for any other system or support related questions.
HPC Educational Resources
Supercomputing
- XSEDE Online Training: XSEDE requires creation of a free account.
Linux
Editing Files
- Nano: We recommend Nano for first-time users. We also recommend using the -w option when starting nano: `nano -w`.
- Emacs: M-(char) means first press the Esc key, then press (char); C-(char) means hold down the Ctrl key and press (char).
Scripting
Parallel Programming
Computational Science Education