Am I allowed to have an account on OSU HPCC resources? If you are a student, faculty or staff member of Oklahoma State University, you are entitled to an HPCC account; just fill out an HPCC Account Request form. Other researchers throughout the state of Oklahoma, and those affiliated with research or academic ventures underway at OSU, are also welcome to an account. If you have any questions about your eligibility, please contact email@example.com.
I forgot my password. Can you tell me what it is? We cannot tell you your current password on Cowboy, but we can reset it and text a new temporary password to the mobile phone number on record. (**NOTE** passwords are never emailed. For security purposes, they are only sent separately to a mobile phone number or provided to you in person.)
Why was my account deactivated or suspended? How can I reactivate it? Your account may be suspended if you have not accessed HPCC resources for six months or if it becomes necessary to protect the security of the HPC system. Please send an email to firstname.lastname@example.org or stop by our offices in Math Sciences (offices 105 - 107a) requesting reinstatement of your account.
Can I share my account with my colleague(s), student(s), friend(s), neighbor(s), etc? Sharing of a personal account is not allowed on HPCC resources. We can set up group accounts that allow multiple personal accounts to share documents. Please ensure each member has a personal HPCC account by filling out an HPCC Account Request form, then send an email to email@example.com and request that a group be set up (this request should list the users authorized to be added to the group).
I failed several login attempts from an off-campus computer and I can no longer connect. How can I regain access? Please send an email to firstname.lastname@example.org and request that your account be reactivated.
How much space do I have in my /home/username directory? Each user has a 25 GB quota on their /home/username directory.
How can I determine how much space I’ve used? Type du -sh at the home directory prompt.
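As a minimal sketch of the du -sh check above, the shell function below prints the total size of a directory and then its ten largest top-level items, which can help identify what to move to /scratch. The function name usage_report is hypothetical, and sort -h assumes GNU coreutils, which is standard on most Linux clusters.

```shell
# Print the total size of a directory, then its ten largest top-level items.
# With no argument, the function reports on your home directory.
usage_report() {
    dir="${1:-$HOME}"
    # -s: one summary line per argument, -h: human-readable sizes
    du -sh "$dir"
    # Size of each top-level item, sorted smallest to largest; keep the last ten
    du -sh "$dir"/* 2>/dev/null | sort -h | tail -n 10
}
```

Calling usage_report with no argument reports on your home directory; passing a path such as /scratch/username reports on that directory instead.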
I’ve run out of space in my /home/username directory! Can I have more space? This directory should only hold source code and executables. Large files and collections of files should be moved to your /scratch/<username> directory. If you still need more space after doing so, please contact email@example.com, or stop by the offices in Math Sciences (105 – 107a) to discuss alternatives.
Is my data guaranteed to be backed up? No. The file systems are very reliable; however, data can be lost or damaged due to media failures, software bugs, hardware failures or other problems. IT IS YOUR RESPONSIBILITY to back up critical files. If you need archival storage there is PetaStore available here in Oklahoma.
How much space can I use in /scratch? The /scratch space is not limited by quotas, but it is a resource shared among all users. You should only store current work and files in your scratch directory. You may be asked to move or delete data when total scratch usage is high. If you need archival storage there is PetaStore available here in Oklahoma.
I need to transfer numerous large files to/from another location. What is the best way to do this? There are several options for transferring large files outlined HERE.
What are the differences in the queues?
batch: The 'batch' queue is the default queue. The walltime limit is 120 hours (120:00:00).
express: The 'express' queue is for short jobs and debugging/testing scripts. The express queue contains 2 compute nodes and has a walltime limit of one hour (1:00:00).
bigmem: The 'bigmem' queue directs jobs to one of the two large memory nodes that each have 256 GB RAM and an NVIDIA Tesla C2075 GPU card. The walltime limit is 120 hours (120:00:00).
killable: The 'killable' queue is for long running jobs that are unable to use a checkpoint/restart feature. The walltime limit is 504 hours (504:00:00). Jobs in this queue are subject to being killed at the discretion of HPCC administrators for hardware and software issues.
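As a rough illustration of the walltime limits listed above, the shell sketch below maps a requested walltime in whole hours to the first queue whose limit covers it. The function name pick_queue is hypothetical. The sketch ignores memory needs (bigmem shares the 120-hour limit with batch) and the fact that jobs in killable may be terminated by administrators, so treat it as a reading aid rather than a policy.

```shell
# pick_queue HOURS
# Print the first queue (express, batch, killable) whose walltime limit
# covers the requested number of whole hours.
pick_queue() {
    hours=$1
    if [ "$hours" -le 1 ]; then
        echo express      # limit: 1 hour
    elif [ "$hours" -le 120 ]; then
        echo batch        # limit: 120 hours (bigmem has the same limit)
    elif [ "$hours" -le 504 ]; then
        echo killable     # limit: 504 hours
    else
        echo "no queue allows more than 504 hours" >&2
        return 1
    fi
}

pick_queue 1      # prints "express"
pick_queue 100    # prints "batch"
pick_queue 300    # prints "killable"
```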
Does it matter if I estimate my walltime accurately? Yes. If you specify an excessively long walltime, your job may wait in the queue longer than it needs to. The scheduler may allow your job to start ahead of certain MPI jobs if it has a short enough walltime; therefore, please attempt to estimate your walltime accurately!
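One way to arrive at a walltime request is to time a representative run and add a safety margin. The shell sketch below converts a measured runtime in seconds into the HH:MM:SS form used for walltimes in this FAQ. The function name estimate_walltime and the default 20% margin are assumptions for illustration, not an HPCC recommendation.

```shell
# estimate_walltime MEASURED_SECONDS [MARGIN_PERCENT]
# Pad a measured runtime by a margin (default 20%) and print it as HH:MM:SS.
estimate_walltime() {
    measured=$1
    margin=${2:-20}
    padded=$(( measured * (100 + margin) / 100 ))
    printf '%02d:%02d:%02d\n' \
        $(( padded / 3600 )) $(( padded % 3600 / 60 )) $(( padded % 60 ))
}

estimate_walltime 4500       # 75 minutes + 20% -> prints "01:30:00"
estimate_walltime 3000 50    # 50 minutes + 50% -> prints "01:15:00"
```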
How can I extend the walltime for a job that’s already running? Email firstname.lastname@example.org and request a walltime extension.
Why is my job status ‘deferred’ and/or ‘blocked’? Jobs become 'deferred' or 'blocked' for a variety of reasons. Major errors in submit scripts, such as requesting more cores than any node physically has available, will cause the scheduler to indefinitely defer jobs. Those jobs would need to be killed and resubmitted after corrections are made to the script. Occasionally, jobs become deferred when not enough nodes are available to run them: the scheduler will typically correct this problem over time. Please email email@example.com for any concerns about why jobs are deferred or blocked.
Do I get a node to myself when I submit a job? Yes. While your job runs on a node, no other users' jobs can run on that node. This prevents resource contention between multiple users on the same node. For example, if two users were running jobs on the same node and one user maxed out the memory, the node would freeze, causing the other user to lose all their work and time. Thus, we live by a 'one user per compute node' policy at OSU HPCC.
My job only needs to run for a few minutes, but it has been in the queue all day and hasn’t started yet! How can I get the job to start? You should specify "#PBS -q express" in your submission script if your job only needs to run for a few minutes to an hour. The walltime must be one hour or less. Two nodes are assigned solely to the express queue to help process short jobs. New jobs in the express queue usually start within a few seconds to a few hours. Jobs in the other queues may take longer, depending on the number of other jobs running and waiting.
I’ve been submitting numerous jobs to the cluster over the last several weeks and I’ve noticed my jobs are beginning to start behind other users’ jobs. Why is this happening? The scheduler keeps track of how much walltime is used by each user over the course of several weeks. The scheduler will begin to give lower priority to the 'heaviest' users who have submitted a lot of jobs. This helps ensure new and less active users have an opportunity to utilize our free resources. All jobs will start normally if nodes are available and no other jobs are waiting in the queue.
I’ve submitted hundreds of jobs to the cluster at once and many of them are in the “Blocked Jobs” section! Do I need to kill those jobs and resubmit them? No. The scheduler limits the maximum number of jobs that a user can run at any given time. The excess jobs will be placed in the 'Blocked Jobs' section until nodes become free. The jobs will eventually start on their own without need for intervention.
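A minimal submit script for the express queue might look like the sketch below. Only the "#PBS -q express" line and the one-hour limit come from this FAQ; the job name, the single-core resource request, and the 30-minute walltime are placeholder assumptions to adapt to your own job.

```shell
#!/bin/bash
#PBS -q express
#PBS -l nodes=1:ppn=1
#PBS -l walltime=00:30:00
#PBS -N quick_test
#PBS -j oe

# PBS starts jobs in your home directory; move to the directory the job
# was submitted from (fall back to the current directory outside PBS).
cd "${PBS_O_WORKDIR:-$PWD}" || exit 1

# Replace the line below with the commands your job actually needs to run.
echo "Job started on $(hostname) at $(date)"
```

Assuming the script is saved as express_test.pbs (a hypothetical file name), it would typically be submitted with qsub express_test.pbs.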
Why do jobs with large numbers of cores seem to ‘cut in line’ to the front of the queue? The large core jobs are MPI jobs, which often require numerous nodes. MPI jobs have the ability to ‘partially reserve’ nodes that become free. The scheduler will still run, or “backfill,” some jobs onto these nodes while the MPI job is waiting for more resources to become available. A job will typically backfill as long as its specified walltime is short enough that it can start and finish before the MPI job is scheduled to have enough nodes. This functionality becomes more efficient when all users estimate walltimes accurately.
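The backfill rule described above can be sketched as a simple comparison: a waiting job fits if its requested walltime is no longer than the time left before the reserved nodes are needed by the MPI job. This is a simplified model for illustration, not the scheduler's actual algorithm, and the function name can_backfill is hypothetical.

```shell
# can_backfill WALLTIME_HOURS HOURS_UNTIL_RESERVATION
# Print "yes" if a job with the given walltime would finish before the
# reserved nodes are needed by the waiting MPI job, otherwise "no".
can_backfill() {
    requested=$1
    window=$2
    if [ "$requested" -le "$window" ]; then
        echo yes
    else
        echo no
    fi
}

can_backfill 2 6     # 2-hour job, nodes free for 6 hours  -> prints "yes"
can_backfill 48 6    # 48-hour job, nodes free for 6 hours -> prints "no"
```

The second call shows why an overestimated walltime hurts: a job that would really finish in two hours but requests 48 is never considered for the six-hour gap.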
I need to use Software XYZ, can I install it myself? You are welcome to install software in your home or scratch directory. Configure scripts usually accept an option such as --prefix=/home/username/path that directs the program to install in your preferred location.
I need to use Software XYZ, will you install it for me? Please email firstname.lastname@example.org and let us know the software, version and other specifics of what you would like installed.
Does graphical software run on the supercomputer? Yes, but it is usually not ideal. The graphics are usually choppy or non-responsive. Please email email@example.com for options including our virtual servers within the TIGER research cloud.
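For software that uses the common configure/make build pattern, a user-level install can be sketched as below. The function name install_to_prefix is hypothetical, and the sketch assumes the source has already been unpacked and that the package provides a configure script that honors --prefix; packages with other build systems will need different steps.

```shell
# install_to_prefix SOURCE_DIR PREFIX
# Configure, build, and install an unpacked source tree under PREFIX.
# PREFIX should be an absolute path, e.g. /home/username/software/xyz.
install_to_prefix() {
    src_dir=$1
    prefix=$2
    # Run in a subshell so the caller's working directory is unchanged;
    # stop at the first step that fails.
    ( cd "$src_dir" && ./configure --prefix="$prefix" && make && make install )
}
```

A call such as install_to_prefix xyz-1.0 /home/username/software/xyz (both names are placeholders) would place executables under the prefix, usually in its bin subdirectory, which you can then add to your PATH.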
NEED HELP? firstname.lastname@example.org