Job Status and Tracking
Once your job is validated and all prerequisites are met, it will launch and be listed under the "Previous Jobs" section.
A job can be in one of four states:
- Launching
- In Progress
- Completed
- Failed
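If you track jobs in your own scripts, the four states above can be modeled as a small enumeration. This is a minimal sketch, not part of the product: `JobState`, `is_terminal`, and `parse_state` are hypothetical names, and treating Completed and Failed as final states is an assumption.

```python
from enum import Enum


class JobState(Enum):
    """The four job states listed above (class and member names are illustrative)."""
    LAUNCHING = "Launching"
    IN_PROGRESS = "In Progress"
    COMPLETED = "Completed"
    FAILED = "Failed"


# Assumption: Completed and Failed are final states; the page does not state this explicitly.
TERMINAL_STATES = {JobState.COMPLETED, JobState.FAILED}


def is_terminal(state: JobState) -> bool:
    """Return True if the job is assumed to be finished (successfully or not)."""
    return state in TERMINAL_STATES


def parse_state(label: str) -> JobState:
    """Map a label as displayed on the job card to a JobState member."""
    return JobState(label.strip())
```

For example, `parse_state("In Progress")` returns `JobState.IN_PROGRESS`, and an unrecognized label raises `ValueError`.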
When the Job is In Progress
A job typically takes 4-5 minutes after launch to enter the In Progress state.
During this phase, you can view job logs by selecting View Logs from the Select an Action menu on your Job card.
If you’ve provided Weights & Biases (WandB) credentials, you can also track the job run using the "View Metrics" option.
The most recent adapter checkpoints can be downloaded while the job is running. Checkpoints are saved four times per epoch. If the entire job is shorter than one epoch, checkpoints are instead saved four times over the course of the run.
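The checkpoint rule can be illustrated with a short calculation. This is a sketch under stated assumptions, not the platform's implementation: the function name `checkpoint_steps` and its parameters are hypothetical, and it assumes the four saves are evenly spaced, which the documentation does not specify.

```python
def checkpoint_steps(steps_per_epoch: int, total_steps: int, saves: int = 4) -> list[int]:
    """Return the training steps at which a checkpoint would be saved.

    Assumption: saves are evenly spaced. Actual spacing on the platform may differ.
    """
    if steps_per_epoch <= 0 or total_steps <= 0 or saves <= 0:
        raise ValueError("steps_per_epoch, total_steps, and saves must be positive")

    # A run shorter than one epoch spreads its saves over the whole run;
    # otherwise the saves are spread over each epoch.
    span = min(steps_per_epoch, total_steps)

    steps: list[int] = []
    k = 1
    while True:
        step = (k * span) // saves  # k-th save point, rounded down to a whole step
        if step > total_steps:
            break
        # Skip step 0 and duplicates that arise when span is smaller than saves.
        if step >= 1 and (not steps or step != steps[-1]):
            steps.append(step)
        k += 1
    return steps
```

With 100 steps per epoch, a two-epoch job (200 steps) yields eight save points, while a 40-step job that ends before the first epoch completes yields four: steps 10, 20, 30, and 40.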
When the Job is Completed
Upon completion, the Download option becomes available under Select an Action.
Selecting it downloads a zip file containing all of the fine-tuned model files.
If you provided HuggingFace credentials, the model files will also be automatically uploaded to your specified HuggingFace repository.
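Once the zip file is downloaded, it can be unpacked with Python's standard `zipfile` module. The sketch below is illustrative only: `extract_model_archive` is a hypothetical helper, and the archive and directory names in the usage note are placeholders, since the actual file name and contents depend on your job.

```python
import zipfile
from pathlib import Path


def extract_model_archive(archive_path: str, target_dir: str) -> list[str]:
    """Extract a downloaded zip file and return the names of the files it contained."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target)
        # Directory entries end with "/"; keep file entries only.
        return sorted(name for name in archive.namelist() if not name.endswith("/"))
```

A call such as `extract_model_archive("my_job.zip", "my_model")` (both names are placeholders) unpacks the archive into `my_model` and lists the extracted files.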