Getting Started
Set up the block and run your first training job
This page walks through preparing the sft block and running a training job end to end. sft wraps a pinned LLaMA-Factory checkout and the swe_data_process converter package, so most setup is about getting those into a single uv environment.
Prerequisites
- 8× A100/H100 GPUs — full fine-tuning with DeepSpeed ZeRO-3 targets a single 8-GPU node.
- uv — used to build and run the training environment.
- CUDA 12.8 toolchain available on the node (the installer pulls CUDA 12.8 PyTorch wheels).
- Raw trajectories from trajgen: a Harbor job_dir of agent rollouts (see Core Concepts).
- A base model: a local path to the model to fine-tune (e.g. Qwen3-8B).
Where to run
sft trains on the node declared in config.yaml (meta_info.resources.ip). Run its scripts inside a tmux session on that host so a long training job survives shell disconnects.
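One way to make the tmux habit hard to forget is to guard launches on the TMUX environment variable, which tmux sets inside its sessions. The following is a minimal sketch; the require_tmux helper and the session name sft are illustrative and not part of the block's scripts.

```shell
# Hypothetical helper: refuse to continue unless the shell is inside tmux.
# tmux exports TMUX inside its sessions, so an empty value means "not in tmux".
require_tmux() {
  if [ -z "${TMUX:-}" ]; then
    echo "not inside tmux; start a session first, e.g. tmux new -s sft" >&2
    return 1
  fi
  echo "inside tmux"
}
```

A typical use would be `require_tmux && bash scripts/start.sh`. After a disconnect, `tmux attach -t sft` reattaches to the session.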
1. Build the environment
Create the uv training environment at meta_info.environment.sft_uv (default artifacts/env/lf):
bash scripts/install_env.sh
This installs repos/swe_data_process[llm], the CUDA 12.8 PyTorch 2.8.0 wheels, repos/LLaMA-Factory[torch,metrics,deepspeed,liger-kernel] (with --no-build-isolation), the pinned flash-attn 2.8.3 wheel, and wandb. See Inputs & Outputs for the env layout.
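After the installer finishes, a quick sanity check on the environment can catch a failed build early. The sketch below assumes the environment follows the usual virtual-environment layout with an interpreter at bin/python. The SFT_ENV_DIR variable and the env_python helper are hypothetical names introduced only for this example.

```shell
# Assumed default from meta_info.environment.sft_uv; adjust if config.yaml differs.
# SFT_ENV_DIR is a hypothetical variable, not one the block's scripts read.
SFT_ENV_DIR="${SFT_ENV_DIR:-artifacts/env/lf}"

# Hypothetical check: print the interpreter path if the environment has one.
env_python() {
  env_dir="$1"
  if [ -x "$env_dir/bin/python" ]; then
    echo "$env_dir/bin/python"
  else
    echo "no python interpreter under $env_dir" >&2
    return 1
  fi
}
```

With the interpreter located, `"$(env_python "$SFT_ENV_DIR")" -c 'import torch; print(torch.__version__, torch.version.cuda)'` should report a 2.8.0 build against CUDA 12.8 if the install went as described.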
2. Point the config at your inputs
Edit config.yaml → runtime_info.input:
- source.type: the dataset source, one of harbor_job (convert Harbor trajectories, the default), hf_lf (a ready-made LF dataset on the HuggingFace Hub), or local_lf (a local LF json). See Input sources.
- For harbor_job: source.scaffold + source.job_dir, the trajectory scaffold and the Harbor job directory to convert. For hf_lf: source.hf_hub_url. For local_lf: source.lf_path.
- conversion.data_name: the dataset name (the LF file and registered dataset key derive from it).
- model.model_name_or_path: the local base-model path.
- training.*: hyperparameters (template, cutoff length, batch size, learning rate, epochs, …).
- infrastructure.n_gpus_per_node: the GPU count on this node.
See Configuration for what each field controls.
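The relationship between source.type and the fields it requires can be summarized as a small lookup. This is only an illustrative sketch of the mapping listed above; required_source_fields is a hypothetical helper, not something the block ships.

```shell
# Hypothetical helper: print the source.* fields each source.type needs,
# following the mapping described in the list above.
required_source_fields() {
  case "$1" in
    harbor_job) echo "source.scaffold source.job_dir" ;;
    hf_lf)      echo "source.hf_hub_url" ;;
    local_lf)   echo "source.lf_path" ;;
    *)
      echo "unknown source.type: $1" >&2
      return 1
      ;;
  esac
}
```

For example, `required_source_fields hf_lf` prints `source.hf_hub_url`, and an unrecognized type returns a non-zero status.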
3. Validate the config
Run the dry-run preflight. It checks the pinned repos, the uv environment, the converter modules, the source job dir, the model path, the output directory, and the GPU count — without side effects:
bash scripts/dryrun.sh
Fix anything it reports before launching a run.
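When chaining the preflight with another command by hand, it helps to gate on its exit status. The sketch below assumes that the preflight script exits non-zero when any check fails, which this page does not state explicitly. run_if_preflight_ok is a hypothetical wrapper.

```shell
# Hypothetical wrapper: run a preflight script, then a command only if it passed.
# Assumes the preflight exits non-zero when any check fails.
run_if_preflight_ok() {
  preflight="$1"
  shift
  if bash "$preflight"; then
    "$@"
  else
    echo "preflight failed; not launching" >&2
    return 1
  fi
}
```

Under that assumption, `run_if_preflight_ok scripts/dryrun.sh bash scripts/dataprep.sh` would run the data-only pipeline only after a clean dry run.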
4. (Optional) Prepare data only
To convert and inspect the dataset before committing to a training run, run the data-only pipeline:
bash scripts/dataprep.sh
This runs STEP 0 (conversion) and writes the LF dataset under artifacts/data/lf_data/ without registering it or training. See Data Pipeline.
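To confirm that the conversion produced something, a quick listing of the output directory is usually enough. The sketch below assumes the converted dataset is written as one or more .json files. LF_DATA_DIR and list_lf_files are hypothetical names used only here.

```shell
# Directory named in the text above; the .json extension is an assumption.
# LF_DATA_DIR is a hypothetical variable, not one the block's scripts read.
LF_DATA_DIR="${LF_DATA_DIR:-artifacts/data/lf_data}"

# Hypothetical helper: print "<size in bytes><TAB><path>" for each JSON file.
list_lf_files() {
  dir="$1"
  for f in "$dir"/*.json; do
    [ -e "$f" ] || continue   # skip the unexpanded glob when nothing matches
    printf '%s\t%s\n' "$(wc -c < "$f" | tr -d ' ')" "$f"
  done
}
```

Running `list_lf_files "$LF_DATA_DIR"` after the data-only pipeline should show a non-empty file. An empty listing suggests the conversion wrote nothing.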
5. Launch a run
scripts/start.sh runs the dry run, then the full pipeline (conversion → dataset registration → training), and archives the run on exit:
bash scripts/start.sh
To run the pipeline directly without the archive wrapper, use bash scripts/train.sh. See Training for what each step does.
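The "archive on exit" behaviour is a common shell pattern: register an EXIT trap before running the job so that cleanup fires whether the job succeeds or fails. The contents of start.sh are not shown on this page, so the following is only a generic sketch of that pattern, with a hypothetical run_with_archive function and an echo standing in for the real archiving step.

```shell
# Generic sketch of run-then-archive-on-exit; NOT the actual start.sh.
run_with_archive() {
  # A subshell keeps the EXIT trap scoped to this one run.
  (
    # Placeholder for a real archive step; fires on success and on failure.
    trap 'echo "archiving run, exit status $?"' EXIT
    "$@"
    exit $?
  )
}
```

With this shape, `run_with_archive bash scripts/train.sh` would still reach the archive step if training fails, and the function returns the job's own exit status.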
6. Inspect results
Training writes to:
artifacts/model/<run>/ # checkpoints, trainer_log.jsonl, trainer_state.json, *_results.json
artifacts/logs/<run>_<ts>.log # console log
After a successful run, config.yaml → runtime_info.output is updated with the checkpoint path, metrics, and artifact paths. For a live view, open the dashboard.
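For a quick look at progress without the dashboard, the last line of trainer_log.jsonl is the most recent record the trainer wrote. The helper below is a hypothetical sketch that relies only on the file location listed above and makes no assumption about the fields inside each record.

```shell
# Hypothetical helper: print the most recent record from a run's trainer_log.jsonl.
# The argument is the run directory, i.e. artifacts/model/<run>.
last_trainer_record() {
  log="$1/trainer_log.jsonl"
  if [ ! -s "$log" ]; then
    echo "no trainer log yet at $log" >&2
    return 1
  fi
  tail -n 1 "$log"
}
```

Call it as `last_trainer_record artifacts/model/<run>` with the actual run directory substituted. To follow the console output instead, `tail -f` on the matching file under artifacts/logs/ works the same way.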
Operating with the agent plugin
If you operate the block through its Claude plugin, the same lifecycle maps to slash commands:
/root:check sft # preflight: config, repos, env, source data, GPUs
/sft:setup # build the uv environment and validate the config
/root:run sft # execute scripts/start.sh and archive
/sft:run # sft-specific run procedure + post-run bookkeeping
/sft:dashboard # launch / summarize the training dashboard