Using Allas in batch jobs
The Allas initiation command allas-conf opens an Allas connection that is valid for eight hours. In interactive use this eight-hour limit is not a problem, because allas-conf can be executed again to extend the validity of the connection.
In batch jobs the situation is different: a batch job can run for several days, and in some cases it may take more than eight hours before the job even starts. In these cases, open the Allas connection with the command:
allas-conf -k
The above command should be executed in the shell session that you intend to use to launch your batch job. In the command, the option -k indicates that the password entered for allas-conf will be stored in the environment variable $OS_PASSWORD. With this variable defined, you no longer need to enter the password when you re-execute allas-conf with the -k option and the Allas project name.
You can define the project name either explicitly, for example:
allas-conf -k project_2012345
Or use the $OS_PROJECT_NAME variable that was assigned when the connection was first opened:
allas-conf -k $OS_PROJECT_NAME
The two commands above now set up the Allas connection for eight hours without prompting the user.
Note that if you mistype your password when using the -k option, you must use the unset command to reset the OS_PASSWORD variable before you can try again:
unset OS_PASSWORD
In batch jobs, you must also add the option -f to the allas-conf command, to skip certain internal checks that are not compatible with batch jobs.
Further, allas-conf is just an alias for a source command that reads the Allas configuration script allas_conf. This alias is not available in batch jobs, so instead of allas-conf, you must use the command:
Puhti:
source /appl/opt/csc-cli-utils/allas-cli-utils/allas_conf -f -k $OS_PROJECT_NAME
Mahti:
source /appl/opt/csc-tools/allas-cli-utils/allas_conf -f -k $OS_PROJECT_NAME
Thus, after opening an Allas connection with allas-conf -k, you can add the above-mentioned source commands to your batch job script to make sure that the Allas connection is valid when needed.
The a-commands (a-put, a-get, a-list, a-delete) include this feature, so you do not need to add the configuration commands to the batch job script, but you must still remember to run allas-conf -k before submitting the job:
#!/bin/bash
#SBATCH --job-name=my_allas_job
#SBATCH --account=project_2012345
#SBATCH --time=48:00:00
#SBATCH --mem-per-cpu=1G
#SBATCH --partition=small
#SBATCH --output=allas_output_%j.txt
#SBATCH --error=allas_errors_%j.txt
#download data
a-get 178-data-bucket/dataset34/data2.txt.zst
#do the analysis
my_analysis_command -in dataset34/data2.txt -outdir results34
#upload results
a-put -b 178-data-bucket results34
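The reminder to run allas-conf -k before submitting can be backed by a small check in the submitting shell. The following is a minimal sketch, not part of allas-cli-utils: the function name allas_env_ready and the script name my_job.sh are hypothetical. It only tests that the two variables described above are non-empty in the current session.

```shell
#!/bin/bash
# Hypothetical helper (not part of allas-cli-utils): verify that the
# variables stored by "allas-conf -k" exist in this shell session.
allas_env_ready() {
    status=0
    if [ -z "${OS_PASSWORD:-}" ]; then
        echo "OS_PASSWORD is not set; run allas-conf -k first" >&2
        status=1
    fi
    if [ -z "${OS_PROJECT_NAME:-}" ]; then
        echo "OS_PROJECT_NAME is not set; run allas-conf -k first" >&2
        status=1
    fi
    return "$status"
}

# Usage (my_job.sh is a placeholder for your batch script):
#   allas_env_ready && sbatch my_job.sh
```

The check cannot tell whether the stored password is correct. It only catches the case where allas-conf -k was never run in the shell used for submission.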
If you use rclone or swift instead of the a-commands, you need to add the source commands to your script. In this case, the batch job script for Puhti could look like:
#!/bin/bash
#SBATCH --job-name=my_allas_job
#SBATCH --account=project_2012345
#SBATCH --time=48:00:00
#SBATCH --mem-per-cpu=1G
#SBATCH --partition=small
#SBATCH --output=allas_output_%j.txt
#SBATCH --error=allas_errors_%j.txt
#make sure connection to Allas is open
source /appl/opt/csc-cli-utils/allas-cli-utils/allas_conf -f -k $OS_PROJECT_NAME
#download input data
rclone copy allas:178-data-bucket/dataset34/data2.txt ./dataset34/
#do the actual analysis
my_analysis_command -in dataset34/data2.txt -outdir results34
#make sure connection to Allas is open
source /appl/opt/csc-cli-utils/allas-cli-utils/allas_conf -f -k $OS_PROJECT_NAME
#upload results to allas
rclone copyto results34 allas:178-data-bucket/
On Mahti, use /appl/opt/csc-tools/allas-cli-utils/allas_conf instead of /appl/opt/csc-cli-utils/allas-cli-utils/allas_conf in all places where you need to make sure the connection is open.
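If the same batch script should run on both systems, one option is to let the script pick whichever of the two allas_conf paths exists. The following is a minimal sketch under that assumption. The function name find_allas_conf is hypothetical, and only the two paths mentioned in this section are tried by default.

```shell
#!/bin/bash
# Hypothetical helper: print the first readable allas_conf path.
# With no arguments it tries the two locations named in this section;
# alternative candidate paths can be passed as arguments.
find_allas_conf() {
    if [ "$#" -eq 0 ]; then
        set -- /appl/opt/csc-cli-utils/allas-cli-utils/allas_conf \
               /appl/opt/csc-tools/allas-cli-utils/allas_conf
    fi
    for candidate in "$@"; do
        if [ -r "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    echo "allas_conf not found in any of the given locations" >&2
    return 1
}

# Usage inside a batch job script:
#   conf=$(find_allas_conf) && source "$conf" -f -k "$OS_PROJECT_NAME"
```

The lookup is separate from the source call so that a missing configuration script stops the step with a clear message instead of a failed source command.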