Frequently asked questions

My Account

How can I get time on Melbourne Bioinformatics (formerly VLSCI) clusters?

Please use our application form.

How do I see how much disk space I have used, or have left?

Use the command mydisk to see how much is used and left in your home directory. If you are in several projects, all are listed.

I cannot find the application or binary I need, but I believe it is installed.

Melbourne Bioinformatics uses a tool called modules to set up your environment for any particular application. While we arrange for some to be loaded by default, you probably need to choose to load the ones you want to use. You can see a full list of applications supported by modules by typing: module avail

You can then choose to load or unload the default version of one of those modules. For example, to load gcc, use: module load gcc

You can pick a specific version of the software by specifying the full name. For example: module load gcc/4.4.3

The software I need to use has some license restrictions and I find I cannot use it.

Some applications installed on the Melbourne Bioinformatics systems are subject to license restrictions. These restrictions may limit who can use them (eg. academic use only) or require you to cite the application when publishing work that involved use of the application. When you sign your personal account application form, you agree to comply with any such restrictions. Melbourne Bioinformatics keeps a register of specific applications and who has formally agreed to the terms that relate to that application. Please send an email to the help desk to be added to that register. You need to state clearly that you have read the license restrictions and agree to comply with them.

Unfortunately, the restrictions associated with some applications prevent some people from being allowed to use them at all.

I am in two (or more) different projects. How do I handle that?

Melbourne Bioinformatics resources are made available to projects rather than to individuals. Normally, your personal home directory sits within the project disk space. However, if you are associated with more than one project, that home directory can only exist within one of the project spaces. This means that before you log in you need to let the system know which project you want to use. All usage and permissions will be set up based on the project the system thinks you are logged in under.

To change your default project, log in to the Melbourne Bioinformatics project management site, select My Projects and click Make default on the project which you want to log in under. You need to do this the first time you log in to a new project so that a new home directory will be set up for you.

For example, assume that you are in two projects, VR5200 and VR9999, and your user name is jsmith. Your home directory is probably under /vlsci/VR5200 but you are now working on VR9999. When you log on, you will find (using the pwd command) that you are in /vlsci/VR5200/jsmith. If you save work from VR9999 in this location, your disk usage will be taken from VR5200 and any queued jobs will also be charged to your default project.

To avoid allocating disk and CPU usage to the wrong account, you can use the online tool to change your default project before logging in. It is possible to work with multiple projects without changing your default project, but you will need to be careful with group permissions and keep track of which project you are charging jobs to. The safest way to launch jobs is to explicitly specify which project to charge. For instance, when you launch jobs on the cluster, add a line such as #SBATCH -A VR5201 to your job script so the CPU usage is assigned to the correct project.
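As a rough illustration, a job script that charges its usage explicitly might look like the sketch below. It assumes the two-project example above (VR5200 as the default project, VR9999 as the project being worked on); the job name and the echo line are placeholders and not part of any Melbourne Bioinformatics template.

```shell
#!/bin/bash
# Charge this job to VR9999 rather than to the default project.
# VR9999 is the hypothetical project ID from the example above.
#SBATCH -A VR9999
# Placeholder job name, chosen only for this sketch.
#SBATCH --job-name=example

# Slurm sets SLURM_JOB_ACCOUNT inside a running job. The fallback text is
# only shown when the script is run outside the scheduler.
charged_to="${SLURM_JOB_ACCOUNT:-no account (not running under Slurm)}"
echo "This job is charged to: ${charged_to}"
```

The same option can also be given on the command line when submitting, for example sbatch -A VR9999 myjob.sh, where myjob.sh is a hypothetical script name.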

Why do I not have a home directory?!

There are two situations where you might be asking this question.

You're a new user.

You've just created a Melbourne Bioinformatics account. You want to upload some files (eg. using WinSCP or another file transfer application) that you'll subsequently use in your compute jobs. But when you attempt to upload them you can't find a home directory for your account and thus there is nowhere to upload your files.

When Melbourne Bioinformatics accounts are created, a corresponding home directory is not created immediately. The home directory is created upon your first login to a Melbourne Bioinformatics system using ssh. On Mac or Linux you can connect by typing the following into any terminal application:

ssh <yourusername>@barcoo.melbournebioinformatics.org.au (or snowy.melbournebioinformatics.org.au)

On Windows, you can use PuTTY to connect via ssh.

During your first login via ssh, you will see the following message:

Could not chdir to home directory /vlsci/<yourproject>/<yourusername>: No such file or directory
Home directory created

The first line shows that your shell initialisation scripts couldn't find your home directory, and the second shows that it was subsequently created. Once this is done, you may exit your ssh session (type exit) and then resume uploading files.

The second possible scenario:

You've joined an additional project.

You've already been a member of one project (say, PROJ1) and have a home directory at /vlsci/PROJ1/<yourusername>. You've now joined a new project (PROJ2), but you don't find a home directory for yourself at /vlsci/PROJ2/<yourusername>, even after logging in via ssh. Where is your home directory for PROJ2?

On our systems, your home directory is considered to be inside your default project. When you are in only one project, that project must be the default. Once you join a second (or subsequent) project, you must then decide which you wish to be your default.

To check or adjust your default project:

* Go to https://my.vlsci.org.au/karaage/profile/accounts/
* Click your username in the "Account" column.
* At the bottom of the page your projects are listed. Click "Make Default" on the row of the project which is to be your default.

Once you set a project to be your default, at your next login, your home directory will be considered to be inside that project. For example if you set your default project to be PROJ2, and there is no home directory at /vlsci/PROJ2/<yourusername>, then at your next login this directory will be created.

Note that if you change your default project and thus your home directory location, you won't see any settings you have stored in your previous home directory, for example your personal .bashrc, .bash_profile, your .ssh/config or your .vimrc. You may want to copy whichever of these files is important to you from your previous home directory to your new one so you can keep the settings you've configured.
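One way to carry those settings across is sketched below. The function name copy_settings is made up for this example, and the list of files is simply the one mentioned above. It leaves alone any file that already exists in the new home directory, which is a cautious choice rather than a requirement.

```shell
# Copy selected settings files from a previous home directory to a new one.
# Usage: copy_settings OLD_HOME NEW_HOME
copy_settings() {
    old_home="$1"
    new_home="$2"
    for f in .bashrc .bash_profile .ssh/config .vimrc; do
        # Skip files that do not exist in the old home directory.
        [ -f "$old_home/$f" ] || continue
        # Never overwrite a file that already exists in the new home directory.
        if [ -e "$new_home/$f" ]; then
            echo "skipped $f (already present)"
            continue
        fi
        # Create a missing parent directory (such as .ssh) readable only by you.
        mkdir -p -m 700 "$(dirname "$new_home/$f")"
        # cp -p keeps the original permissions and timestamps.
        cp -p "$old_home/$f" "$new_home/$f"
        echo "copied $f"
    done
}

# Example call, following the PROJ1/PROJ2 example above (replace <yourusername>):
# copy_settings /vlsci/PROJ1/<yourusername> "$HOME"
```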

See also changing your default project.

Warning: in some scenarios you may have queued jobs which reference paths inside your home directory. If you change your default project and thus your home directory before these jobs begin to run, you may see unexpected results.

Running jobs under the scheduler

My job is taking longer than I expected. It might run out of walltime.

Jobs that run out of walltime will be automatically killed by the scheduler. As soon as you think this might happen, send a message to help@melbournebioinformatics.org.au telling us the job number and how much extra time it might need. We can sometimes extend the walltime and will do so if we can.

My job won't run, what is the problem?

There are a number of potential problems.

Please don't hesitate to ask the help desk what is going on. It is quite possible that there is a problem that’s easily fixed if you bring it to our attention, or that we can suggest an alternative, and possibly more productive, way to run your jobs.

Data and file space

Is my data backed up?

Data stored on our systems is backed up. However, we strongly recommend that you keep your own backups of important data. The storage system uses RAID and has redundant disks, but that may not be enough if we are unlucky. Melbourne Bioinformatics may also be able to recover files you have accidentally deleted recently. Please contact the help desk as soon as you realise you need to recover a deleted file. Note that this is not the same as a backup protecting against systems failure. If Melbourne Bioinformatics has an unlikely combination of disk failures, your files could be lost if you don't have your own backup system in place.

I need a large data set for my research. How might that work?

Melbourne Bioinformatics may be able to download and maintain the data set for you. In many cases, several users may need the same data and it is clearly a good idea to have just one copy to conserve disk space and network bandwidth. The assumption here is that the data concerned is publicly available and you need only read access. Please contact the help desk.

My project has a data set that all project members need to use. Where should I put it?

Just above your home directory, you will see the home directories of other members of your project. You will also see a directory called shared at that level. All your project members will be able to write into the shared directory and files written there will, by default, be set so that other project members can use them. Space taken up by files in this shared directory contributes to your project's total allowed disk space usage.

Note that files you create under the shared directory will be readable by other members of your group but not, by default, writable. You can change their mode with the chmod command. For example, to ensure other group members can write to a file called myfile, you would use chmod g+w myfile. You can also alter the default mode that newly created files get using umask. Generally, these commands need to be treated somewhat carefully!
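The effect of chmod g+w and umask can be seen in the small sketch below. It works in a throwaway temporary directory rather than a real shared directory, and it assumes a starting umask of 022 purely so the before and after states are predictable.

```shell
# Work in a throwaway directory so nothing real is touched.
demo_dir="$(mktemp -d)"
old_umask="$(umask)"

umask 022                                  # new files are not group-writable
touch "$demo_dir/myfile"
before="$(ls -l "$demo_dir/myfile" | cut -c1-10)"        # -rw-r--r--

chmod g+w "$demo_dir/myfile"               # let group members write to this one file
after="$(ls -l "$demo_dir/myfile" | cut -c1-10)"         # -rw-rw-r--

umask 002                                  # files created from now on are group-writable
touch "$demo_dir/newfile"
new_default="$(ls -l "$demo_dir/newfile" | cut -c1-10)"  # -rw-rw-r--

echo "myfile: $before -> $after; newfile: $new_default"

# Put the original umask back and clean up.
umask "$old_umask"
rm -rf "$demo_dir"
```

A umask set this way lasts only for the current shell session, and chmod g+w changes only the files you name, so neither affects files that already exist elsewhere.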

How do I see how much disk space I have available?

At Melbourne Bioinformatics, disk space is granted to projects so you share the disk space with other members of your project. Each project has a disk volume and you can find the status of that volume with the mydisk command.

What happens if I exceed my disk quota?

If the disk volume for your project is full, then neither you nor applications running under your name will be able to write to the disk. Other members of your project will also not be able to write.
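When a volume is close to full, it helps to know what is taking up the space. The sketch below uses the standard du, sort and head tools rather than mydisk. The function name largest_entries is made up for this example, and it ignores hidden files (names starting with a dot).

```shell
# Print the largest entries directly under a directory, biggest first.
# Usage: largest_entries DIR [COUNT]
largest_entries() {
    dir="${1:-.}"
    count="${2:-10}"
    # du -sk reports each entry's size in kilobytes; sort -rn puts the largest first.
    du -sk "$dir"/* 2>/dev/null | sort -rn | head -n "$count"
}

# Example: the five largest items in your home directory.
# largest_entries "$HOME" 5
```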

Memory

Why do I need to tell the system how much memory I need?

Memory is a limited resource. If we tell the scheduler how much each job needs, it can fit jobs into slots on the system most efficiently. If you don't define your memory needs, the scheduler assumes you need the default of 2GB. That’s not a lot and many applications need more. If you actually try to use more than the scheduler thinks you should, it will kill your job. On the other hand, if you ask for a lot more than you need, you make it harder for the scheduler to squeeze you in and you waste resources. So it's important to get it right.

Remember, if you ask for more memory per core than that core's fair share of what’s available (typically 4GB, but depending on the system, as high as 16GB per core), then you probably make some other core on that node unavailable. Under those conditions, we need to charge your quota for those under-utilized cores as well as the ones you are actually using. Sad but true. Conversely, if you use less than typical memory per core, you often find that the scheduler can squeeze your job into one of those under-utilized cores and it gets to run straight away.
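The arithmetic behind this can be sketched as follows. The formula is only an illustration of the paragraph above, not the scheduler's documented accounting: it assumes a job is charged for whichever is larger, the cores it requested or the number of fair-share memory slices it occupies. The function name charged_cores is made up for this example.

```shell
# Estimate how many cores a job is effectively charged for.
# Usage: charged_cores CORES MEM_PER_CORE_GB FAIR_SHARE_GB
charged_cores() {
    cores="$1"
    mem_per_core="$2"
    fair_share="$3"
    total_mem=$(( cores * mem_per_core ))
    # Number of fair-share memory slices occupied, rounded up.
    by_memory=$(( (total_mem + fair_share - 1) / fair_share ))
    if [ "$by_memory" -gt "$cores" ]; then
        echo "$by_memory"
    else
        echo "$cores"
    fi
}

charged_cores 1 8 4    # 1 core asking for 8GB on a 4GB-per-core node: prints 2
charged_cores 4 2 4    # 4 cores asking for 2GB each: prints 4
```

In a Slurm job script the request itself is usually made with a directive such as #SBATCH --mem-per-cpu=8192 (a value in megabytes), but check the section on managing memory for the form recommended on these systems.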

How do I tell the system how much memory I need?

Please see the section on managing memory.