Job Dependency
Overview
The dependency feature in CraneSched-FrontEnd lets a job's start time depend on the status of other jobs. By chaining dependency relationships, you can build complex workflows in which jobs execute in the correct order.
Supported Commands
The dependency feature is available in the following commands:
- `cbatch` - Batch job submission
- `calloc` - Interactive resource allocation
- `crun` - Interactive job execution
Command Line Parameter
Use the --dependency or -d parameter when submitting a job to specify dependency relationships.
Dependency String Format
Basic Syntax
Dependency Types
| Type | Description | Trigger Condition |
|---|---|---|
| `after` | Start after the specified job begins or is cancelled | The specified job leaves the Pending state |
| `afterok` | Start after the specified job succeeds | The specified job completes with exit code 0 |
| `afternotok` | Start after the specified job fails | The specified job completes with a non-zero exit code (including timeout, node errors, etc.) |
| `afterany` | Start after the specified job completes | The specified job ends (regardless of success or failure) |
Delay Time
The delay is optional and supports the following formats:
- Plain numbers (default unit is minutes). Example: `10` = 10 minutes
- Time with units:
    - `s`, `sec`, `second`, `seconds` - seconds
    - `m`, `min`, `minute`, `minutes` - minutes
    - `h`, `hour`, `hours` - hours
    - `d`, `day`, `days` - days
    - `w`, `week`, `weeks` - weeks
Unsupported Format
Do NOT use HH:MM:SS or D-HH:MM:SS format (e.g., 01:30:00 or 1-01:30:00). The colon : character is reserved as the job ID separator, so such formats will be misinterpreted as multiple job IDs instead of delay time. This may either cause "duplicate task" errors or silently succeed with completely wrong dependency behavior. Always use time units instead (e.g., 90m or 1h30m).
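The delay rules above can be sketched as a small Bash function that converts a delay string into seconds. This is only an illustration: `delay_to_seconds` is a hypothetical helper, not part of CraneSched, and it assumes lower-case units and that compound values such as `1h30m` are summed piece by piece.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not a CraneSched command): convert a delay string to seconds.
delay_to_seconds() {
  local input=$1 rest total=0 num unit mult
  # Colons belong to the job ID list, so HH:MM:SS-style delays are rejected.
  if [[ -z $input || $input == *:* ]]; then
    echo "invalid delay: '$input'" >&2
    return 1
  fi
  # A plain number is interpreted as minutes.
  if [[ $input =~ ^[0-9]+$ ]]; then
    echo $(( 10#$input * 60 ))
    return 0
  fi
  rest=$input
  # Consume <number><unit> pairs from the left, e.g. 1h30m -> 1h, then 30m.
  while [[ -n $rest ]]; do
    if [[ $rest =~ ^([0-9]+)([a-z]+)(.*)$ ]]; then
      num=${BASH_REMATCH[1]}
      unit=${BASH_REMATCH[2]}
      rest=${BASH_REMATCH[3]}
    else
      echo "invalid delay: '$input'" >&2
      return 1
    fi
    case $unit in
      s|sec|second|seconds) mult=1 ;;
      m|min|minute|minutes) mult=60 ;;
      h|hour|hours)         mult=3600 ;;
      d|day|days)           mult=86400 ;;
      w|week|weeks)         mult=604800 ;;
      *) echo "unknown unit '$unit' in '$input'" >&2; return 1 ;;
    esac
    total=$(( total + 10#$num * mult ))
  done
  echo "$total"
}

delay_to_seconds 10     # prints 600 (plain number = minutes)
delay_to_seconds 90m    # prints 5400
```

A check like this can catch a colon-style delay before submission, where it would otherwise be read as extra job IDs.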
Multiple Dependency Combinations
AND Logic (all conditions must be satisfied)
Use `,` to separate dependency conditions. For example, with `after:100,afterok:101` the job waits for job 100 to start and for job 101 to complete successfully.
OR Logic (any condition satisfied)
Use `?` to separate dependency conditions. For example, with `afterok:100?afterok:101` the job starts after either job 100 or job 101 completes successfully.
Note
You cannot mix `,` and `?` in the same dependency string; the system returns an error if you do.
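As a rough illustration of the separator rule, the following Bash sketch reports which logic a dependency string uses and rejects mixed separators. `dependency_logic` is a hypothetical name, and the labels it prints (`AND`, `OR`, `SINGLE`) are chosen for this example only.

```bash
#!/usr/bin/env bash
# Hypothetical helper: classify a dependency string by its separator.
dependency_logic() {
  local dep=$1
  # ',' and '?' must not appear together in one dependency string.
  if [[ $dep == *,* && $dep == *\?* ]]; then
    echo "error: cannot mix ',' and '?' in one dependency string" >&2
    return 1
  fi
  if [[ $dep == *,* ]]; then
    echo "AND"      # every condition must be satisfied
  elif [[ $dep == *\?* ]]; then
    echo "OR"       # any one condition is enough
  else
    echo "SINGLE"   # only one condition, no separator present
  fi
}

dependency_logic 'after:100,afterok:101'     # prints AND
dependency_logic 'afterok:100?afterok:101'   # prints OR
```

Because `?` is also a shell glob character, quoting the dependency string on the command line is a safe habit.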
Usage Examples
1. Basic Dependencies
# Wait for job 100 to start before running
cbatch --dependency after:100 my_script.sh
# Wait for job 100 to complete successfully before running
cbatch --dependency afterok:100 my_script.sh
# Wait for job 100 to fail before running
cbatch --dependency afternotok:100 my_script.sh
# Wait for job 100 to complete before running (regardless of success or failure)
cbatch --dependency afterany:100 my_script.sh
2. Dependencies with Delays
# Wait for job 100 to complete successfully, then delay 30 minutes before running
cbatch --dependency afterok:100+30 my_script.sh
# Wait for job 100 to complete successfully, then delay 10 seconds before running
cbatch --dependency afterok:100+10s my_script.sh
# Wait for job 100 to complete successfully, then delay 1 hour 30 minutes (use the unit-based format)
cbatch --dependency afterok:100+90m my_script.sh
3. Multiple Dependencies
# Wait for job 100 to start AND jobs 101, 102 to both complete successfully
cbatch --dependency after:100,afterok:101:102 my_script.sh
# Wait until 10 minutes after job 100 starts AND 30 minutes after job 101 completes successfully
cbatch --dependency after:100+10m,afterok:101+30m my_script.sh
# Wait for job 100 to succeed OR job 101 to fail
cbatch --dependency afterok:100?afternotok:101 my_script.sh
# Wait for jobs 100 and 101 to both succeed, with a 2-hour delay, OR for job 102 to start (no delay)
cbatch --dependency afterok:100:101+2h?after:102 my_script.sh
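When job IDs are held in shell variables, the dependency string can be assembled rather than typed by hand. The sketch below shows one way to do that. `build_condition` is a hypothetical helper, not a CraneSched command, and it only joins a dependency type with one or more job IDs.

```bash
#!/usr/bin/env bash
# Hypothetical helper: join a dependency type and job IDs, e.g. afterok:101:102.
build_condition() {
  local type=$1
  shift
  case $type in
    after|afterok|afternotok|afterany) ;;
    *) echo "unknown dependency type: $type" >&2; return 1 ;;
  esac
  if (( $# == 0 )); then
    echo "at least one job ID is required" >&2
    return 1
  fi
  # With IFS set to ':', "$*" joins all remaining arguments with colons.
  local IFS=:
  echo "$type:$*"
}

# AND-combine two conditions with ','.
dep="$(build_condition after 100),$(build_condition afterok 101 102)"
echo "$dep"   # prints after:100,afterok:101:102
# The result could then be passed on, e.g.: cbatch --dependency "$dep" my_script.sh
```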
4. Using in Batch Scripts
You can also use the #CBATCH directive in batch scripts:
#!/bin/bash
#CBATCH --dependency afterok:100
#CBATCH --nodes 2
#CBATCH --time 1:00:00
#CBATCH --output job-%j.out
echo "This job starts after job 100 completes successfully"
# Your job code
5. Using in Interactive Commands
# Using dependency with calloc
calloc --dependency afterok:100 -n 4 -N 2
# Using dependency with crun
crun --dependency after:100 -n 1 hostname
Viewing Dependency Status
Use the ccontrol show job <job_id> command to view job dependency status:
Output Example
Dependency Status Field Descriptions
| Field | Description |
|---|---|
| `PendingDependencies` | Dependencies not yet triggered |
| `DependencyStatus` | Dependency satisfaction status (see table below) |
Dependency Status Values
| Status | Description |
|---|---|
| `WaitForAll` | Waiting for all dependencies to be satisfied (AND logic) |
| `WaitForAny` | Waiting for any dependency to be satisfied (OR logic) |
| `ReadyAfter <time>` | Will be ready after the specified time |
| `SomeFailed` | Some dependencies failed (AND logic, cannot be satisfied) |
| `AllFailed` | All dependencies failed (OR logic, cannot be satisfied) |
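A script that monitors pending jobs might group these status values into coarser outcomes. The following sketch takes the status string as an argument, because the exact layout of the `ccontrol show job` output is not reproduced here. The function name and the three outcome labels are assumptions made for this example.

```bash
#!/usr/bin/env bash
# Hypothetical helper: map a DependencyStatus value to a coarse outcome.
dependency_outcome() {
  case $1 in
    WaitForAll|WaitForAny) echo "waiting" ;;        # dependencies still pending
    ReadyAfter*)           echo "delayed" ;;        # satisfied, delay not yet elapsed
    SomeFailed|AllFailed)  echo "unsatisfiable" ;;  # the job can never become ready
    *)                     echo "unknown" ;;
  esac
}

dependency_outcome WaitForAny   # prints waiting
dependency_outcome SomeFailed   # prints unsatisfiable
```

Jobs reported as unsatisfiable are the ones most likely to need manual attention, since their dependencies cannot be met any more.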
Error Handling
The system will return errors in the following situations:
| Error Condition | Description | Example |
|---|---|---|
| Mixed separators | Cannot use `,` and `?` together | `afterok:100,afterok:101?afterok:102` |
| Format error | Dependency string doesn't conform to the syntax | `afterok:` or `after100` |
| Invalid delay format | Delay time format is incorrect | `afterok:100+invalid` |
| Duplicate dependency | Same job ID appears multiple times | `afterok:100:100` |
| Job ID doesn't exist or has ended | The referenced job doesn't exist (runtime check) | `afterok:99999` |
| Unsupported time format | Using `:` in the delay (misinterpreted as job IDs) | `after:1+00:00:01` or `after:1+00:00:02` (parsed as multiple job IDs, may or may not error) |
Error Examples
# Error: Mixed AND and OR separators
$ cbatch --dependency afterok:100,afterok:101?afterok:102 job.sh
# Error: Invalid delay format
$ cbatch --dependency afterok:100+invalid job.sh
# Error: Duplicate job ID
$ cbatch --dependency afterok:100,afternotok:100 job.sh
# Error: Using colon in delay time (misinterpreted as job ID separator)
$ cbatch --dependency after:1+00:00:01 job.sh
# This will be parsed as: after jobs 1, 00, 00, 01
# Error message: "duplicate task 1 in dependencies" (job 1 appears twice)
$ cbatch --dependency after:1+00:00:02 job.sh
# This will be parsed as: after jobs 1, 00, 00, 02
# May succeed but with wrong behavior (waits for jobs 00 and 02, ignores the intended 2-second delay)
# Correct: Always use time units
$ cbatch --dependency after:1+1s job.sh
$ cbatch --dependency after:1+2s job.sh
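The duplicate-ID cases above can be caught before submission with a small check. The sketch below is a hypothetical pre-submit helper, not CraneSched's own parser. It assumes that job IDs are compared numerically (so `01` and `1` count as the same job, consistent with the "duplicate task 1" example) and that anything after `+` is a delay.

```bash
#!/usr/bin/env bash
# Hypothetical pre-submit check: print job IDs that appear more than once.
duplicate_job_ids() {
  local dep=$1 cond id
  local -a conds ids
  local seen=" " dups=""
  # Split the string into conditions on either separator.
  IFS=',?' read -r -a conds <<< "$dep"
  for cond in "${conds[@]}"; do
    # Drop the type prefix (text up to the first colon), then split on ':'.
    IFS=':' read -r -a ids <<< "${cond#*:}"
    for id in "${ids[@]}"; do
      id=${id%%+*}                       # strip an optional +delay suffix
      [[ $id =~ ^[0-9]+$ ]] || continue  # skip anything that is not a job ID
      id=$(( 10#$id ))                   # assumption: 01 and 1 are the same job
      if [[ $seen == *" $id "* ]]; then
        dups+="$id "
      else
        seen+="$id "
      fi
    done
  done
  if [[ -n $dups ]]; then
    echo "${dups% }"
    return 1
  fi
}

duplicate_job_ids 'afterok:100,afternotok:100'   # prints 100
duplicate_job_ids 'after:1+00:00:01'             # prints 1
```

Note that `after:1+00:00:02` passes this check, which mirrors the silent-success case described above, so a separate check for colons in the delay is still needed.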