Discovery Courses FAQ

There are several use cases for teaching and learning with the Discovery cluster in a classroom. The cluster offers both a command-line and a web interface, allowing flexibility in how students access it and in the level of instruction. Using the cluster gives you and your students access to many popular scientific applications and allows you and your students to install other packages as needed. The easy-to-use Open OnDemand web portal also offers a built-in visual file explorer for viewing and transferring files.

The following Frequently Asked Questions should help answer most of your questions about using the cluster for classroom use.

 

How can I use Discovery with my class?

 

There are several ways you can use the cluster in your classroom. Your class can use the cluster to access specific software packages and working environments and learn how to use high-performance computing (HPC) resources for large and complex data processing, such as machine learning, AI, molecular simulations, and more.

 

How do I get my class access to the cluster?

 

Fill out an HPC Classroom Use Request ticket. On the form, select the course number and section from the dropdown menu, and we will pull the students' names directly from Canvas. We will create accounts on the cluster for the students as well as any TAs or co-instructors, and we will make a directory tree under /courses for the class. If there are any changes in enrollment, please let us know so we can update your course. There is no need to send us a list of student names.

 

Is there any training on the cluster for my class?

 

Yes, we currently provide virtual classroom facilitation for your class on using the cluster. You can also direct your students to our training page, where they can easily view some of our most popular training sessions. We can also customize training to focus on the specific resources you will be using for the class. Email us at rchelp@northeastern.edu and provide details about your class and the training you would like us to provide.

 

Do my students have to learn Linux to work with the cluster?

 

Depending on your class assignments, many students can work with the Open OnDemand web portal, which does not require any knowledge of Linux. In cases where you want students to work on the command line, they should have a basic understanding of Linux commands. If students are unfamiliar with Linux, you can direct them to our training page, where they can view a Linux and Shell Scripting training session. 

 

What software is available to use with my class on the cluster?

 

Many software packages are available, including popular applications such as Jupyter Notebook, RStudio, and MATLAB. If you have an account on the cluster, you can see the list of available software by running the module avail command on the command line. See Using Module for more information. Students have access to most of the modules on the cluster. They can also use the interactive apps available on Open OnDemand. Instructors can install many other software packages using Conda.
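
As a rough sketch of a typical session, the commands below list the available modules and then load one. The module name matlab is only a placeholder; check the module avail output for the exact names and versions installed on the cluster. The guard keeps the snippet harmless on a machine without the module system.

```shell
#!/bin/bash
# Sketch only: "matlab" is a placeholder module name, not a confirmed one.
MODULE_NAME="matlab"

if command -v module >/dev/null 2>&1; then
    module avail                   # list every module you can load
    module avail "$MODULE_NAME"    # narrow the list to one package
    module load "$MODULE_NAME"     # add it to your environment
    module list                    # confirm what is currently loaded
else
    echo "The module command is not available in this shell."
fi
```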

 

I just need my class to access Open OnDemand. How do I request that?

 

Open OnDemand is a web portal that lets you access the resources on the cluster through an easy-to-navigate web browser interface. You can request course access using the same form you would use to request access to the cluster.

 

I’d like my class to use specific resources on the cluster. Can you create a reservation on the cluster for my class?

 

We have dedicated partitions for courses, which provide CPU (--partition=courses) and GPU (--partition=courses-gpu) resources. We no longer create reservations for each course. Instead, we allocate the resources for the classroom partitions each term based on need as described by the instructors.

 

How long do my students have access to the cluster?

 

Students will have access to the cluster for the duration of the course. If the students would like to maintain their access, they must submit an access request for an individual account.

 

How do I get an account on the cluster?

 

If you are a professor or instructor at Northeastern, you can complete an access request for an account on the cluster.

 

 

How do my students get help with the cluster?

 

RC Office Hours are a great way for your students to connect with the RC team for short (10-15 min) consultations. Office Hours are held every Wednesday from 3 – 4 p.m. ET and Thursday from 11 a.m. – 12 p.m. ET. All current or prospective Discovery users are welcome to join anytime during these hours.

 

You or your students can also submit a Get Assistance with Research Computing ticket or email rchelp@northeastern.edu.

 

My class needs access to a specific software application that I do not see installed on the cluster or Open OnDemand (OOD). What should I do?

 

If your class requires software not currently installed on the cluster or OOD, and you are unable to install it in a conda environment, follow the procedure below to request that software be installed on the cluster.
 
You must be a professor or instructor to initiate this request; if your students need a specific software application, you must complete the form for them. Students in your class cannot submit this request. This is to ensure that we only get one request for the software; multiple students in one class often make requests for the same software, so having all requests go through the instructor reduces this overlap.

To request additional software (instructors only):

  1. Go to Discovery Cluster Software Request. If prompted, sign in to ServiceNow with your Northeastern username and password to access the form.
  2. In the Sponsor’s Name field, enter your name.
  3. Follow the instructions on the form: either provide the URL of the open-source software library or, if the software requires you to register first, upload the installation package to your home directory.
  4. Select the acknowledgment checkbox, and select Submit.

You and your students can install software locally to your PATH on the cluster, which may be a better option in some cases, such as installing multiple conda environments. Review the Software Overview for more information.

 

Please note that software requests can take 2-3 weeks to complete. We might not be able to install every software application requested. If we cannot, we will notify you and suggest alternative software to meet your needs.

 

 

Course Guide

Research Computing supports classroom education at Northeastern University by providing access to computing resources (CPU and GPU) and storage resources for instructors and their students. RC has supported courses from many disciplines, including biology, chemistry, civil engineering, machine learning, computer science, mathematics, and physics.

To gain access to HPC resources, instructors need to submit a classroom access form. Please submit these requests before the beginning of each semester (preferred), or at least one week before you plan to start using the HPC cluster for your class. If you are requesting a customized application, we require two weeks to one month to complete it before the date you would like to use it.

Classroom Setup

Once access is provided, each course will have a course-specific directory under /courses/ that follows the sample file tree shown below for the course BINF6430.202410:

/courses/
└── BINF6430.202410/
    ├── data/
    ├── shared/
    ├── staff/
    └── students/

The sub-directory staff/ will be populated with a folder for each of the following: instructors, co-instructors, and TAs. The students/ sub-directory contains a folder for each student. The data/ and shared/ sub-directories can be populated by those in staff but are read-only for students. Students only have permission to access their own directories under students/ and cannot view another student's space.
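
One way to confirm these permissions from a student or staff account is sketched below. The check_access helper is hypothetical (it is not a cluster tool), and the course directory is the example from the tree above. Based on the description above, a student would expect data/ and shared/ to be readable but not writable.

```shell
#!/bin/bash
# Report whether the current user can read and write a given path.
# Prints one line of the form "read=yes write=no <path>".
check_access() {
    local path="$1" r="no" w="no"
    [ -r "$path" ] && r="yes"
    [ -w "$path" ] && w="yes"
    echo "read=$r write=$w $path"
}

# Example course directory from the tree above; substitute your own.
COURSE_DIR="/courses/BINF6430.202410"
ME="${USER:-$(id -un)}"

for sub in data shared "students/$ME"; do
    check_access "$COURSE_DIR/$sub"
done
```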

All users in staff have read-write-execute permissions within the entirety of their courses directory, allowing them to store data, homework assignments, build conda environments, create new directories, etc.

Each course directory gets a default 1TB of storage space. An increase can be requested in the initial application form for classroom access, or at any time during an actively running course by contacting rchelp@northeastern.edu.
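
To see roughly how much of that space is in use, staff can total the directory with du, as in the sketch below. The dir_usage helper is hypothetical, the figure is only an estimate of what counts toward the allocation, and du can be slow on a large tree.

```shell
#!/bin/bash
# Print the total size of a directory in human-readable form (e.g. "12G").
dir_usage() {
    du -sh "$1" 2>/dev/null | cut -f1
}

# Example course directory from the tree above; substitute your own.
COURSE_DIR="/courses/BINF6430.202410"

if [ -d "$COURSE_DIR" ]; then
    echo "$COURSE_DIR currently holds $(dir_usage "$COURSE_DIR")"
else
    echo "$COURSE_DIR does not exist on this machine"
fi
```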

Once the course has ended and final grades have been submitted, the course space, including all data and shared class files, will be archived, and all student personal directories will be deleted. Any students given access to the HPC cluster only through the course will no longer have access when the course is completed.

Courses Partitions

RC has dedicated two partitions to the use of students and instructors for the duration of their course.

The resources available in the courses/courses-gpu partitions can be queried with the command sinfo as run in the command line. We manage the resources in courses/courses-gpu each term in response to the number of courses and requested usage per course.

sinfo -p courses-gpu --Format=nodes,cpus,gres,statecompact

These partitions can be used for an sbatch script or an srun interactive session.

Sbatch Script

An sbatch script can be submitted on the command line via the command sbatch scriptname.sh. Below are some examples of sbatch scripts using the courses and courses-gpu partitions. See slurm-running-jobs for more information on running sbatch scripts or run man sbatch for additional sbatch parameters.

Courses Partition: Commands to Execute

#!/bin/bash

#SBATCH --nodes=1
#SBATCH --time=4:00:00
#SBATCH --job-name=MyCPUJob
#SBATCH --partition=courses
#SBATCH --mail-type=ALL
#SBATCH --mail-user=username@northeastern.edu

Courses-gpu Partition: Commands to Execute

#!/bin/bash

#SBATCH --nodes=1
#SBATCH --time=4:00:00
#SBATCH --job-name=MyGPUJob
#SBATCH --partition=courses-gpu
#SBATCH --gres=gpu:1
#SBATCH --mail-type=ALL
#SBATCH --mail-user=username@northeastern.edu
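
The examples above contain only scheduler directives. A minimal sketch of a complete submission follows: it writes a short job script and submits it only where sbatch is installed. The file name my_cpu_job.sh, the 10-minute time limit, and the hostname command are placeholders for your own work.

```shell
#!/bin/bash
# Write a small job script for the courses partition.
# my_cpu_job.sh and the commands inside it are placeholders.
JOB_SCRIPT="my_cpu_job.sh"

cat > "$JOB_SCRIPT" <<'EOF'
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --time=00:10:00
#SBATCH --job-name=MyCPUJob
#SBATCH --partition=courses

# Commands to execute go below the #SBATCH lines.
echo "Running on $(hostname)"
EOF

# Submit only where the Slurm client tools are installed.
if command -v sbatch >/dev/null 2>&1; then
    sbatch "$JOB_SCRIPT"
else
    echo "sbatch not found; wrote $JOB_SCRIPT without submitting."
fi
```

For a GPU job, the same pattern applies with --partition=courses-gpu and --gres=gpu:1 added to the directives.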

Srun Interactive Session

An interactive session can be run on the command line via the srun command as shown in the examples below. See slurm-running-jobs for more information on using srun, or run man srun to see additional parameters that can be set with srun.

Courses Partition

srun --time=4:00:00 --job-name=MyJob --partition=courses --pty /bin/bash

Courses-gpu Partition

srun --time=4:00:00 --job-name=MyJob --partition=courses-gpu --gres=gpu:1 --pty /bin/bash
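
Once the interactive shell starts, a quick check like the sketch below can confirm that you are inside an allocation. It relies on the SLURM_JOB_ID and SLURM_JOB_PARTITION environment variables that Slurm sets for a job; the describe_session helper is hypothetical, and nvidia-smi is assumed to be present only on GPU nodes.

```shell
#!/bin/bash
# Print a short summary of the current Slurm allocation, if any.
describe_session() {
    if [ -n "${SLURM_JOB_ID:-}" ]; then
        echo "job ${SLURM_JOB_ID} on partition ${SLURM_JOB_PARTITION:-unknown}"
    else
        echo "not inside a Slurm job"
    fi
}

describe_session

# On a GPU node, nvidia-smi (if installed) lists the GPUs you can see.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi -L
fi
```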

Open OnDemand

Several widely used applications are available on the Open OnDemand (OOD) web portal, including JupyterLab Notebook, RStudio, MATLAB, and GaussView.

All of the applications under the “Courses” tab on the dashboard can be set to either the courses or courses-gpu partition via the application-specific pull-down menus.

Monitoring Jobs

Whichever way you choose to run your jobs, you can monitor their progress with the command squeue.

squeue -u username

You can also monitor jobs being run on either of the courses partitions.

squeue -p courses
squeue -p courses-gpu

Jobs can be canceled with the command scancel and the Slurm job ID that is assigned when your job is submitted to the scheduler.

scancel jobid

Please note, the cluster is a collection of shared resources. Please cancel any jobs that are still running in an interactive session (on the OOD or via srun) when you have completed your work. This frees up the resources for other classmates and instructors.
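
A short sketch that combines the commands above is shown below: it lists your own jobs in both course partitions and provides a helper for canceling them. The function names are hypothetical, and the cancel line is left commented out so nothing is canceled by accident.

```shell
#!/bin/bash
# List the current user's jobs in both course partitions.
# Output columns: job ID, partition, state, elapsed time (no header).
list_course_jobs() {
    squeue -u "${USER:-$(id -un)}" -p courses,courses-gpu -h -o "%i %P %T %M"
}

# Cancel every job ID passed as an argument.
cancel_jobs() {
    local jobid
    for jobid in "$@"; do
        scancel "$jobid"
    done
}

if command -v squeue >/dev/null 2>&1; then
    list_course_jobs
    # To cancel everything listed, uncomment the next line:
    # cancel_jobs $(list_course_jobs | awk '{print $1}')
else
    echo "Slurm client tools not found in this shell."
fi
```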

Software Applications

RC has installed many system-wide software applications as modules that are available through the command line via the module command. RC also supports many software applications specifically for courses and has added interactive versions to Open OnDemand, including JupyterLab Notebook, RStudio, MATLAB, VSCode, Maestro (Schrodinger), and a Unix desktop.

Professors should create custom conda environments for their course, which can be used in JupyterLab Notebook, in interactive mode (srun), or in sbatch scripts on the command line.

Custom Course Applications

At Northeastern University, instructors have a great deal of flexibility in how they use the HPC for their classroom, and this is most apparent in the use of software applications.

RC encourages professors to perform local software installations via conda environments within the /courses directory for their class. These can be used by the students to complete tutorials and homework assignments. Students can also create their own conda environments in their /courses/course.code/students/username directory to complete their own projects. Conda environments can be used to install a variety of research software and are not only useful for coding in Python.

For most courses, the instructor can create a shared conda environment in their /courses directory that provides all the necessary packages for the class.
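
A minimal sketch of that workflow is shown below, assuming conda is already available in your shell. The course code, the shared/envs/class-env location, and the package list are placeholders chosen for illustration, not a prescribed layout.

```shell
#!/bin/bash
# Sketch only: the course code, environment name, and packages are
# placeholders, and conda must already be available in your shell.
COURSE_DIR="/courses/BINF6430.202410"
ENV_PREFIX="$COURSE_DIR/shared/envs/class-env"

# Instructor: create the environment at a fixed path inside shared/.
create_shared_env() {
    conda create --yes --prefix "$ENV_PREFIX" python numpy pandas
}

# Student: activate the shared environment by its full path.
use_shared_env() {
    # Load conda's shell functions so "conda activate" works in scripts.
    eval "$(conda shell.bash hook)"
    conda activate "$ENV_PREFIX"
}

if command -v conda >/dev/null 2>&1; then
    echo "conda found; call create_shared_env or use_shared_env"
else
    echo "conda not found; make it available before using $ENV_PREFIX"
fi
```

Creating the environment with --prefix places it at a known path inside the course directory rather than in the instructor's home directory, so students can activate it by that path.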

In other cases, where specialized software is needed, please drop into RC Office Hours or book a Classroom Consultation with one of the RC team members to discuss what you need. Please allow at least one month for specialized app development and testing. Please note that RC may be unable to provide the exact specifications requested; however, the RC team will work with the instructor to find a suitable solution.

Courses Cheatsheet
