Prolog/Epilog Configuration Guide¶
To make job scheduling more flexible and controllable, Crane can automatically invoke user-defined Prolog (pre-execution) and Epilog (post-execution) scripts at different stages of job allocation and step execution. These scripts enable functionality such as job environment initialization, resource verification, and cleanup.
Crane supports multiple prolog and epilog programs. Note that for security reasons, these programs do not have a search path set. You must either specify fully qualified paths in the programs or set the PATH environment variable. The table below explains the prolog and epilog programs available during job allocation, including when and where they run.
| Parameter | Location | Invoked by | User | Execution Timing |
|---|---|---|---|---|
| Prolog (config.yaml) | Compute node | craned | CranedUser (usually root) | When a job step first starts on the node (default) |
| CranectldProlog (config.yaml) | Controller node | cranectld | CranectldUser | At job allocation |
| Epilog (config.yaml) | Compute node | craned | CranedUser (usually root) | At job completion |
| CranectldEpilog (config.yaml) | Controller node | cranectld | CranectldUser | At job completion |
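Because these programs run without a search path, a hook that calls `date` or `uname` by bare name may fail. The sketch below shows the two options mentioned above: setting `PATH` explicitly, or calling a program by its fully qualified path. The directory list and the `/bin/date` location are assumptions, not Crane defaults; check them against your nodes.

```shell
#!/bin/bash
# Option 1: set an explicit, minimal PATH before calling anything by bare name.
# The directory list is an assumption; adjust it to the node's layout.
export PATH=/usr/sbin:/usr/bin:/sbin:/bin
HOST=$(uname -n)

# Option 2: call the program by its fully qualified path instead.
# /bin/date is an assumed location; verify it on your nodes.
NOW=$(/bin/date "+%Y-%m-%d %H:%M:%S")

echo "[$NOW] hook running on $HOST"
```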
The table below describes the prolog and epilog programs available during job step execution, including when and where they run.
| Parameter | Location | Invoked by | User | Execution Timing |
|---|---|---|---|---|
| CrunProlog (config.yaml or crun --prolog) | crun launch node | crun | User running crun | Before job step launch |
| TaskProlog (config.yaml) | Compute node | csupervisor | User running crun | Before job step launch |
| crun --task-prolog | Compute node | csupervisor | User running crun | Before job step launch |
| TaskEpilog (config.yaml) | Compute node | csupervisor | User running crun | When job step completes |
| crun --task-epilog | Compute node | csupervisor | User running crun | When job step completes |
| CrunEpilog (config.yaml or crun --epilog) | crun launch node | crun | User running crun | When job step completes |
By default, the Prolog script only runs on a node when it receives its first job step from a new allocation. It does not run at the moment the allocation is granted. If no job step from an allocation ever runs on a node, that node will not run the Prolog for that allocation. This behavior can be changed with the PrologFlags parameter.
Epilog always runs on each node when the allocation is released.
If multiple prolog or epilog scripts are specified (e.g., /etc/crane/prolog.d/*), they will run in reverse alphabetical order (z→a → Z→A → 9→0).
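The stated order (z→a, then Z→A, then 9→0) is what reverse byte-wise comparison produces, since in ASCII digits sort before uppercase letters and uppercase before lowercase. The sketch below reproduces that order with `sort`; whether Crane uses exactly this comparison internally is an assumption, and the file names are made up for illustration.

```shell
#!/bin/bash
# Print names in reverse byte order (z→a, then Z→A, then 9→0).
# LC_ALL=C forces plain ASCII comparison: digits < uppercase < lowercase.
reverse_hook_order() {
    LC_ALL=C sort -r
}

# Hypothetical hook names, one per line.
printf '%s\n' 10-check.sh Audit.sh zz-last.sh mount.sh | reverse_hook_order
```

With these four names, `zz-last.sh` is listed first and `10-check.sh` last, so a numeric prefix places a script at the end of the run rather than the beginning.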
Prolog and Epilog scripts should be short and must not call Crane commands such as cqueue, ccontrol, or cacctmgr. Long-running scripts slow down scheduling, and calling Crane commands may cause performance issues.
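One way to catch accidental Crane command calls before deploying a hook is a simple text check. The function below is a hypothetical helper, not part of Crane; it matches whole words only and also flags occurrences inside comments, so treat a hit as a prompt for review rather than proof of a problem.

```shell
#!/bin/bash
# Return 0 if the script mentions none of the listed Crane commands,
# 1 if it mentions at least one, and 2 if the file cannot be read.
# The command list comes from the guideline above; extend it for your site.
hook_avoids_crane_commands() {
    local script=$1
    [ -r "$script" ] || return 2
    if grep -Eqw 'cqueue|ccontrol|cacctmgr' "$script"; then
        return 1
    fi
    return 0
}
```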
TaskProlog runs with the same environment as the user’s task. Its standard output is interpreted as:
- export name=value: set an environment variable
- unset name: unset an environment variable
- print ...: write to the task's stdout
Example TaskProlog:
#!/bin/bash
echo "export VARIABLE_1=HelloWorld"
echo "unset MANPATH"
echo "print This message has been printed with TaskProlog"
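To make the three directives concrete, the sketch below applies TaskProlog-style output to the current shell. It is illustrative only: in Crane the interpretation happens inside csupervisor, and the function name and the exact parsing rules (whitespace, quoting) used here are assumptions.

```shell
#!/bin/bash
# Apply TaskProlog-style directives read from stdin to the current shell.
# Lines that match none of the three forms are ignored.
apply_task_prolog_output() {
    local line
    while IFS= read -r line; do
        case $line in
            "export "*=*) export "${line#export }" ;;          # set a variable
            "unset "*)    unset "${line#unset }" ;;            # remove a variable
            "print "*)    printf '%s\n' "${line#print }" ;;    # echo to stdout
        esac
    done
}

# Feed the function the same lines the example TaskProlog would emit.
apply_task_prolog_output <<'EOF'
export VARIABLE_1=HelloWorld
unset MANPATH
print This message has been printed with TaskProlog
EOF
echo "VARIABLE_1=$VARIABLE_1"
```

The input is supplied by redirection rather than a pipe so that the loop runs in the current shell and the `export` and `unset` effects persist.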
Failure Handling¶
- If Prolog fails (non-zero exit) → the node is set to DRAIN and the job fails.
- If Epilog fails → the node is set to DRAIN.
- If CranectldProlog fails → the job fails.
- If CranectldEpilog fails → the failure is logged.
- If TaskProlog fails → the task fails.
- If CrunProlog fails → the step is not executed and the frontend returns a failure immediately.
- If TaskEpilog or CrunEpilog fails → the failure is logged.
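Since a failing Epilog sets the node to DRAIN, cleanup code is usually written so that an already-clean state does not count as an error. The sketch below removes a per-job scratch directory and succeeds even if the directory is gone. The function name and the `job-<id>` directory layout are hypothetical; only `CRANE_JOB_ID` comes from this guide.

```shell
#!/bin/bash
# Remove a per-job scratch directory without failing when it is already gone.
# Returns 1 only when an argument is missing, so the root is never deleted.
cleanup_job_scratch() {
    local root=$1 job_id=$2
    if [ -z "$root" ] || [ -z "$job_id" ]; then
        echo "cleanup_job_scratch: root and job ID are required" >&2
        return 1
    fi
    # -f makes a missing directory a no-op instead of an error.
    rm -rf -- "$root/job-$job_id"
}
```

Inside an Epilog this might be called as `cleanup_job_scratch /path/to/scratch "$CRANE_JOB_ID" || exit 1`, where the scratch root is site-specific.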
Prolog/Epilog Configuration¶
- PrologTimeout: Timeout (in seconds) for the execution of the Prolog script.
- EpilogTimeout: Timeout (in seconds) for the execution of the Epilog script.
- PrologEpilogTimeout: Timeout (in seconds) for the execution of both Prolog and Epilog scripts. When this parameter is set together with PrologTimeout and EpilogTimeout, it will override both of them.
- PrologFlags: Controls how the Prolog script is executed. Multiple flags can be specified, separated by commas, to provide more flexible job lifecycle management.
Configuration Example:
JobLifecycleHook:
  Prolog: /path/to/prolog.sh
  PrologTimeout: 60
  # PrologFlags: Contain # Contain, RunInJob, Serial
  Epilog: /path/to/epilog.sh
  EpilogTimeout: 60
  PrologEpilogTimeout: 120
  CranectldProlog: /path/to/cranectld_prolog.sh
  CranectldEpilog: /path/to/cranectld_epilog.sh
  CrunProlog: /path/to/crun_prolog.sh
  CrunEpilog: /path/to/crun_epilog.sh
  TaskProlog: /path/to/task_prolog.sh
  TaskEpilog: /path/to/task_epilog.sh
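The timeouts above apply to a hook as a whole. Inside a hook, it can help to give each potentially slow step its own shorter limit so that the script can log the problem and exit on its own terms. The sketch below wraps the GNU coreutils `timeout` command; the helper name is hypothetical.

```shell
#!/bin/bash
# Run a command with its own time limit (in seconds), chosen to be
# shorter than the hook-level timeout in the configuration.
# GNU `timeout` exits with status 124 when the limit is reached.
run_bounded() {
    local limit=$1
    shift
    timeout "$limit" "$@"
}
```

For example, with a 60-second PrologTimeout, a step could be run as `run_bounded 20 /path/to/health_check.sh` (a placeholder path), and an exit status of 124 would indicate that the step was cut off.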
Prolog Flags¶
Contain¶
Runs Prolog inside the job cgroup at allocation time.
RunInJob¶
Runs Prolog and Epilog inside the extern csupervisor, so that they are included in the job cgroup.
Implies Contain.
Serial¶
Runs Prolog/Epilog serially per node.
Reduces throughput.
Incompatible with RunInJob.
Example¶
The following script is saved as /etc/crane/prolog.sh. Make sure the script has executable permission and that the script itself is correct.
#!/bin/bash
# Hooks run without a search path; this PATH value is an assumption, adjust as needed.
export PATH=/usr/sbin:/usr/bin:/sbin:/bin
LOG_FILE="/var/crane/prolog.log"
JOB_ID=$CRANE_JOB_ID
ACCOUNT=$CRANE_JOB_ACCOUNT
NODE_NAME=$CRANE_JOB_NODELIST
DATE=$(date "+%Y-%m-%d %H:%M:%S")
echo "[$DATE] === Prolog Start ===" >> "$LOG_FILE"
echo "JOB_ID: $JOB_ID" >> "$LOG_FILE"
echo "ACCOUNT: $ACCOUNT" >> "$LOG_FILE"
echo "NODE: $NODE_NAME" >> "$LOG_FILE"
# Check node health (example): column 4 of the "Mem:" line is free memory in MB
FREE_MEM_MB=$(free -m | awk 'NR==2 {print $4}')
if (( FREE_MEM_MB < 200 )); then
    echo "Node memory low: ${FREE_MEM_MB}MB → reject job" >> "$LOG_FILE"
    exit 1  # Non-zero → block job execution
fi
# Output ending flag
echo "=== Prolog End ===" >> "$LOG_FILE"
echo "" >> "$LOG_FILE"
exit 0
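The two requirements for the script (executable permission and correctness) can be partly checked before deployment. The sketch below is a hypothetical helper, not a Crane tool: it tests the executable bit and runs `bash -n`, which parses a script without executing it. A clean syntax check does not prove the logic is right, only that bash can parse the file.

```shell
#!/bin/bash
# Return 0 if the hook is executable and passes bash's syntax check,
# 1 otherwise, with a short reason on stderr.
check_hook() {
    local script=$1
    if [ ! -x "$script" ]; then
        echo "not executable: $script" >&2
        return 1
    fi
    if ! bash -n "$script"; then
        echo "syntax error in: $script" >&2
        return 1
    fi
    return 0
}
```

A typical sequence would be `chmod +x /etc/crane/prolog.sh` followed by `check_hook /etc/crane/prolog.sh`.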
/etc/crane/config.yaml Configuration