# Using Compute Canada / ACENET HPC

This guide explains how to connect to the Digital Research Alliance of Canada (formerly Compute Canada) and ACENET clusters, create a working directory in scratch, transfer files with Globus, and submit jobs with SLURM.

## 1. Connect to the HPC via SSH

1. Determine which cluster to use (examples; hostnames change over time, so confirm the current login address in the Alliance documentation):
   - Graham: `graham.computecanada.ca`
   - Cedar: `cedar.computecanada.ca`
   - Beluga: `beluga.computecanada.ca`
   - Niagara: `niagara.scinet.utoronto.ca`
   - ACENET: `login1.acenet.ca`

2. Open a terminal and connect via SSH, replacing `username` with your account name:

   ```bash
   ssh username@graham.computecanada.ca
   ```

3. When prompted, confirm the host fingerprint and enter your password (and complete multifactor authentication if it is enabled on your account).
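
If you connect to the same cluster often, a host alias in `~/.ssh/config` saves retyping the full hostname. The following is a minimal sketch rather than part of the steps above: the function name `add_ssh_alias` and the alias `graham` are hypothetical, and `username` is a placeholder for your own account.

```bash
# Append a host alias to an SSH config file (hypothetical helper).
# Arguments: config file, alias, hostname, username.
add_ssh_alias() {
    config_file=$1
    alias_name=$2
    host_name=$3
    user_name=$4

    mkdir -p "$(dirname "$config_file")"
    # Do nothing if the alias is already defined.
    if [ -f "$config_file" ] && grep -q "^Host $alias_name\$" "$config_file"; then
        return 0
    fi
    printf 'Host %s\n    HostName %s\n    User %s\n' \
        "$alias_name" "$host_name" "$user_name" >> "$config_file"
    chmod 600 "$config_file"
}

# Example (uncomment to run against your real config):
# add_ssh_alias "$HOME/.ssh/config" graham graham.computecanada.ca username
# ssh graham
```

After the alias is in place, `ssh graham` should behave like the full `ssh username@graham.computecanada.ca` command.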

---

## 2. Create a Folder in Scratch

Your `$SCRATCH` directory is a temporary workspace for large data and computations. Files that have been inactive for 60 days are purged, so do not use scratch for long-term storage.

After logging in:

```bash
cd "$SCRATCH"
mkdir -p my_project
cd my_project
```

Confirm your path:

```bash
pwd
# Example output: /scratch/username/my_project
```
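
Because inactive files in scratch are purged, it can help to list files that are approaching the limit. This is a rough sketch that assumes the purge decision is based on last-access time (check the Alliance documentation for the exact criterion); the function name `list_stale_files` and the 50-day default are illustrative choices.

```bash
# List regular files under a directory that have not been accessed
# for more than a given number of days (hypothetical helper).
list_stale_files() {
    dir=$1
    days=${2:-50}
    find "$dir" -type f -atime +"$days" -print
}

# Example: list_stale_files "$SCRATCH/my_project" 50
```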

---

## 3. Install and Use Globus for File Transfers

Globus is a fast, reliable tool for large file transfers. To transfer files to or from your own computer, install a small local agent called **Globus Connect Personal**.

### Install Globus Connect Personal

- **Linux:**

  ```bash
  wget https://downloads.globus.org/globus-connect-personal/linux/stable/globusconnectpersonal-latest.tgz
  tar xzf globusconnectpersonal-latest.tgz
  cd globusconnectpersonal-*/         # Trailing slash matches the extracted directory, not the .tgz
  ./globusconnectpersonal -setup      # One-time setup; follow the prompts
  ./globusconnectpersonal -start &    # Run the endpoint in the background
  ```

- **macOS:** Download and install from [https://www.globus.org/globus-connect-personal](https://www.globus.org/globus-connect-personal).

- **Windows:** Download the installer from the same link and follow the setup wizard.

After installation, your local computer will appear as a **Globus endpoint** whenever Globus Connect Personal is running.

### Transfer Files

1. Visit [https://app.globus.org](https://app.globus.org) and log in using your **Alliance (formerly Compute Canada) credentials**.
2. In the web app, choose two endpoints:
   - **Source:** Your local computer or institutional storage.
   - **Destination:** Your HPC endpoint (for example, *Compute Canada Graham Scratch*).
3. On the destination side, navigate to your target scratch folder (`/scratch/username/my_project`).
4. Select the files on the source side and click **Start Transfer**.

Globus runs transfers asynchronously and resumes interrupted transfers automatically.
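
For an optional, independent check that files arrived intact, you can record checksums before the transfer and verify them on the cluster afterwards. This is a sketch built on the GNU `sha256sum`, `sort`, and `xargs` tools, not a Globus feature; the function names and the manifest name `MANIFEST.sha256` are hypothetical.

```bash
# Write SHA-256 checksums for every file in a directory to a manifest
# stored inside that directory (hypothetical helper).
make_manifest() {
    (
        cd "$1" || exit 1
        find . -type f ! -name MANIFEST.sha256 -print0 \
            | sort -z \
            | xargs -0 -r sha256sum > MANIFEST.sha256
    )
}

# Verify a directory against its manifest; returns non-zero on mismatch.
verify_manifest() {
    ( cd "$1" && sha256sum --quiet -c MANIFEST.sha256 )
}
```

One way to use this is to run `make_manifest` on the source folder, transfer the folder including the manifest, and then run `verify_manifest` on the copy in scratch.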

---

## 4. Submit Jobs to ACENET with SLURM

Jobs are submitted through the SLURM scheduler. Create a batch script that describes the resources your job needs and the commands it runs.

### Example job script (`job.slurm`)

```bash
#!/bin/bash
#SBATCH --job-name=my_analysis
#SBATCH --account=def-yourprof      # Your allocation account
#SBATCH --time=2:00:00              # Wall-time limit (H:MM:SS)
#SBATCH --nodes=1
#SBATCH --ntasks=4                  # Tasks (processes); use 1 for a serial script
#SBATCH --mem=8G                    # Memory per node
#SBATCH --output=output_%j.log      # %j expands to the job ID

module load python/3.11
source ~/myenv/bin/activate

python my_script.py
```
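
When the same analysis is submitted with different resource requests, generating the script from a template can reduce hand-editing mistakes. The sketch below reuses the placeholder account (`def-yourprof`), module, and virtual environment from the example above; the function name `write_job_script` is hypothetical.

```bash
# Write a SLURM batch script (hypothetical helper).
# Arguments: output file, job name, wall time, memory.
write_job_script() {
    cat > "$1" <<EOF
#!/bin/bash
#SBATCH --job-name=$2
#SBATCH --account=def-yourprof
#SBATCH --time=$3
#SBATCH --nodes=1
#SBATCH --ntasks=4
#SBATCH --mem=$4
#SBATCH --output=output_%j.log

module load python/3.11
source ~/myenv/bin/activate

python my_script.py
EOF
}

# Example: write_job_script job.slurm my_analysis 2:00:00 8G
```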

### Submit and Monitor Jobs

```bash
sbatch job.slurm      # Submit the job; prints the job ID
squeue -u $USER       # Check the status of your jobs
scancel <jobid>       # Cancel a job by its ID
```
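
`sbatch` normally prints a line of the form `Submitted batch job <jobid>`. The following is a small sketch for capturing that ID so later commands can reuse it; the function name `extract_job_id` is hypothetical.

```bash
# Read sbatch output on stdin and print the numeric job ID.
extract_job_id() {
    awk '/^Submitted batch job/ { print $4 }'
}

# Example on a cluster:
# jobid=$(sbatch job.slurm | extract_job_id)
# squeue -j "$jobid"
# less "output_${jobid}.log"
```

Alternatively, `sbatch --parsable` prints the job ID in a machine-readable form without the surrounding text.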

### View Results

After the job completes, check the output log:

```bash
less output_<jobid>.log
```

---

## 5. Useful Commands

```bash
module avail              # List available software modules
module load python/3.11   # Load a module
df -h $SCRATCH            # Show free space on the scratch filesystem
quota -s                  # Check your disk quota
```
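
Note that `df` reports usage for the whole filesystem rather than for your own files. To see how much space one project folder occupies, `du` can be used instead; this is a sketch with the hypothetical helper name `dir_usage`.

```bash
# Print the total size of a directory in human-readable form.
dir_usage() {
    du -sh "$1" | cut -f1
}

# Example: dir_usage "$SCRATCH/my_project"
```

On directories with very many files this can take a while, since `du` has to visit each file.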

---

## 6. References

- Alliance Docs: [https://docs.alliancecan.ca/wiki/Technical_documentation](https://docs.alliancecan.ca/wiki/Technical_documentation)
- ACENET Training: [https://www.ace-net.ca/training/](https://www.ace-net.ca/training/)
- Globus Setup: [https://www.globus.org/globus-connect-personal](https://www.globus.org/globus-connect-personal)