Training models for InstructLab
Train your model on generated data, then test the model to verify the results. Learn more about what training is.
You cannot pass configuration information or files to the model for fine-tuning.
Prerequisites
- Prepare your taxonomy.
- Add the taxonomy tar.gz to your Object Storage bucket. You can use the CLI or the UI.
- Generate data from your taxonomy.
Aligning models by using the console
-
From the InstructLab Projects page, click your project > Aligned models > Align.
-
Enter an alphanumeric name for the model and select the training data to use.
-
Optional: Review the estimated cost that is provided before you start the alignment process.
-
Click Align. The state is queued, then running. Wait for the state to be completed. This process could take minutes or hours. When the alignment is complete, a trained_models directory is created in the Object Storage bucket with logs for troubleshooting.
Training models by using the CLI
-
Get the ID for the data to use.
ibmcloud ilab data list
-
Run the command to start training the model with the generated data. Note the model ID in the output.
ibmcloud ilab model train --name testmodel --data-id <data_id>
-
Check the details of your model training. Include the ID for the model. The state is queued, then running. Wait for the state to be completed. This process could take minutes or hours. When the state is completed, a trained_models directory is created in the Object Storage bucket with logs for troubleshooting.
ibmcloud ilab model get --id <model_id>
-
Optional: When the state is completed, you can review metrics, such as token estimates that you can use to calculate the estimated cost.
Example model get command with the --output json option.
ibmcloud ilab model get --id daef9836-631f-4686-ad18-e0e6a0910f5d --output json
Example JSON output
{
  "base_model": "granite-3.1-8b-starter-v2.1",
  "created_at": "2025-12-08T16:06:05.000Z",
  "data_id": "8b1433c0-e375-4b00-b36d-2ad00697014e",
  "id": "daef9836-631f-4686-ad18-e0e6a0910f5d",
  "last_signal_at": "2025-12-08T17:20:32.000Z",
  "model_metrics": {
    "mmlu": {
      "overall_average": 0.51,
      "scores": {
        "mmlu_abstract_algebra": 0.3,
        "mmlu_anatomy": 0.43,
        "mmlu_astronomy": 0.49
      }
    },
    "mt_bench": {
      "error_rate": 0.01,
      "overall_average": 6.86,
      "scores": {
        "turn_one": 7.25,
        "turn_two": 6.47
      }
    },
    "mt_bench_branch": {
      "error_rate": 0.01,
      "improvements": {
        "compositional_skills/STEM/math/time_series/qna.yaml": 8.67,
        "compositional_skills/extraction/invoice/csv/qna.yaml": 8.4
      },
      "no_change": {
        "compositional_skills/roleplay/explain_like_i_am/primary_schooler/qna.yaml": 0,
        "compositional_skills/writing/freeform/technical/proposal/qna.yaml": 0
      },
      "regressions": {
        "compositional_skills/extraction/inference/qualitative/sentiment/qna.yaml": -9,
        "compositional_skills/extraction/information/named_entities/places/qna.yaml": -9
      }
    },
    "tokens": {
      "training_phases": {}
    }
  },
  "name": "test",
  "state": "completed",
  "status": "completed",
  "taxonomy_id": "e62ccea5-97e6-4568-86bf-2f359987b115"
}
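If you review these metrics often, a small script can pull out the headline numbers. The following is a minimal Python sketch, not part of the InstructLab tooling: the function name `summarize_model` and the shape of its return value are illustrative assumptions, and the sample is a trimmed copy of the example JSON output above.

```python
import json


def summarize_model(model: dict) -> dict:
    """Extract headline numbers from a `model get --output json` payload."""
    metrics = model.get("model_metrics") or {}
    branch = metrics.get("mt_bench_branch") or {}
    return {
        "state": model.get("state"),
        "mmlu_average": (metrics.get("mmlu") or {}).get("overall_average"),
        "mt_bench_average": (metrics.get("mt_bench") or {}).get("overall_average"),
        # Taxonomy paths whose MT-Bench branch score went up or down.
        "improved": sorted(branch.get("improvements") or {}),
        "regressed": sorted(branch.get("regressions") or {}),
    }


# Trimmed copy of the example output shown above.
sample = json.loads("""
{
  "state": "completed",
  "model_metrics": {
    "mmlu": {"overall_average": 0.51},
    "mt_bench": {"overall_average": 6.86},
    "mt_bench_branch": {
      "improvements": {
        "compositional_skills/STEM/math/time_series/qna.yaml": 8.67,
        "compositional_skills/extraction/invoice/csv/qna.yaml": 8.4
      },
      "regressions": {
        "compositional_skills/extraction/inference/qualitative/sentiment/qna.yaml": -9,
        "compositional_skills/extraction/information/named_entities/places/qna.yaml": -9
      }
    }
  }
}
""")

summary = summarize_model(sample)
print(summary["state"], summary["mmlu_average"], summary["mt_bench_average"])
for path in summary["regressed"]:
    print("regressed:", path)
```

To use it on real output, you could save the result of the `model get` command with the `--output json` option to a file and load that file with `json.load` instead of the embedded sample.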
Training models by using the API
-
Get the ID for the data to use.
Example command.
curl -X 'GET' \
  'https://us-east.instructlab.ibm.com/v1/data' \
  -H 'accept: application/json'
Example output.
{
  "data": [
    {
      "id": "add785e6-a8c3-4f5f-ab89-c506a3f115da",
      "name": "example-data-1",
      "state": "",
      "status": "queued",
      "created_at": "2024-10-23T02:58:50.000Z",
      "taxonomy_id": "202a03c4-dcf1-432a-82b7-abecb2e019f7",
      "last_signal_at": "2025-12-08T17:20:32.000Z"
    }
  ]
}
-
Run the command to start training the model with the generated data. Note the model ID in the output.
Example command.
curl -X 'POST' \
  'https://us-east.instructlab.ibm.com/v1/models' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{ "name": "example-model-1", "data_id": "add785e6-a8c3-4f5f-ab89-c506a3f115da" }'
Example output.
{
  "id": "baa8cfb5-e306-4e15-869d-735b74b1919d",
  "name": "example-model-1",
  "state": "",
  "status": "queued",
  "created_at": "2024-10-23T02:58:50.000Z",
  "last_signal_at": "2025-12-08T17:20:32.000Z",
  "data_id": "add785e6-a8c3-4f5f-ab89-c506a3f115da",
  "base_model": "granite-7b",
  "taxonomy_id": "202a03c4-dcf1-432a-82b7-abecb2e019f7",
  "model_metrics": {
    "mmlu": {
      "overall_average": 0.3,
      "scores": { "additionalProp1": 1, "additionalProp2": 2, "additionalProp3": 3 }
    },
    "mmlu_branch": {
      "error_rate": 0.4,
      "improvements": { "additionalProp1": 1, "additionalProp2": 2, "additionalProp3": 3 },
      "regressions": { "additionalProp1": 1, "additionalProp2": 2, "additionalProp3": 3 },
      "no_change": { "additionalProp1": 1, "additionalProp2": 2, "additionalProp3": 3 }
    },
    "mt_bench": {
      "overall_average": 0.8,
      "error_rate": 0.6,
      "scores": { "additionalProp1": 1, "additionalProp2": 2, "additionalProp3": 3 }
    },
    "mt_bench_branch": {
      "error_rate": 0.4,
      "improvements": { "additionalProp1": 1, "additionalProp2": 2, "additionalProp3": 3 },
      "regressions": { "additionalProp1": 1, "additionalProp2": 2, "additionalProp3": 3 },
      "no_change": { "additionalProp1": 1, "additionalProp2": 2, "additionalProp3": 3 }
    }
  }
}
-
Check the details of your model training. Include the ID for the model. The state is queued, then running. Wait for the state to be completed. This process could take minutes or hours.
Example command.
curl -X 'GET' \
  'https://us-east.instructlab.ibm.com/v1/models/baa8cfb5-e306-4e15-869d-735b74b1919d' \
  -H 'accept: application/json'
Example output.
{
  "id": "baa8cfb5-e306-4e15-869d-735b74b1919d",
  "name": "example-model-1",
  "state": "",
  "status": "queued",
  "created_at": "2024-10-23T02:58:50.000Z",
  "last_signal_at": "2025-12-08T17:20:32.000Z",
  "data_id": "add785e6-a8c3-4f5f-ab89-c506a3f115da",
  "base_model": "granite-7b",
  "taxonomy_id": "202a03c4-dcf1-432a-82b7-abecb2e019f7",
  "model_metrics": {
    "mmlu": {
      "overall_average": 0.3,
      "scores": { "additionalProp1": 1, "additionalProp2": 2, "additionalProp3": 3 }
    },
    "mmlu_branch": {
      "error_rate": 0.4,
      "improvements": { "additionalProp1": 1, "additionalProp2": 2, "additionalProp3": 3 },
      "regressions": { "additionalProp1": 1, "additionalProp2": 2, "additionalProp3": 3 },
      "no_change": { "additionalProp1": 1, "additionalProp3": 3, "additionalProp2": 2 }
    },
    "mt_bench": {
      "overall_average": 0.8,
      "error_rate": 0.6,
      "scores": { "additionalProp1": 1, "additionalProp2": 2, "additionalProp3": 3 }
    },
    "mt_bench_branch": {
      "error_rate": 0.4,
      "improvements": { "additionalProp1": 1, "additionalProp2": 2, "additionalProp3": 3 },
      "regressions": { "additionalProp1": 1, "additionalProp2": 2, "additionalProp3": 3 },
      "no_change": { "additionalProp1": 1, "additionalProp2": 2, "additionalProp3": 3 }
    }
  }
}
When the state is completed, in the Object Storage bucket, a trained_models directory is created with logs for troubleshooting.
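Because training can take hours, you might prefer to script the wait rather than rerun the GET request by hand. The following is a minimal Python sketch, not an official client. It polls the models endpoint from the examples above until `state` or `status` reads `completed`. The function names, the polling interval, and the timeout are illustrative choices. The optional `headers` argument is an assumption for passing any authentication headers that your account requires, which the example commands do not show.

```python
import json
import time
import urllib.request
from typing import Callable, Optional

# Endpoint taken from the example commands above.
BASE_URL = "https://us-east.instructlab.ibm.com/v1"


def fetch_model(model_id: str, headers: Optional[dict] = None) -> dict:
    """GET /v1/models/<model_id> and decode the JSON response body."""
    request = urllib.request.Request(
        f"{BASE_URL}/models/{model_id}",
        headers={"accept": "application/json", **(headers or {})},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_until_completed(
    fetch: Callable[[], dict],
    interval_s: float = 60.0,
    timeout_s: float = 6 * 3600,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch() repeatedly until the model reports completed."""
    deadline = time.monotonic() + timeout_s
    while True:
        model = fetch()
        # The example outputs report progress in both "state" and "status".
        if "completed" in (model.get("state"), model.get("status")):
            return model
        if time.monotonic() >= deadline:
            raise TimeoutError("model did not reach completed before the timeout")
        sleep(interval_s)


# Usage (makes real network calls, so it is left commented out):
# model = wait_until_completed(lambda: fetch_model("<model_id>"))
# print(model["model_metrics"])
```

The sketch stops only on `completed` or on the timeout because this page does not list the other terminal states. Adjust the check if your training runs report a failure state.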
What's in my Object Storage bucket after training?
After training the model, your Object Storage bucket contains a trained_models directory with the following files.
- Artifacts: These files contain the Phase 1 and Phase 2 checkpoint data and the model for each epoch.
- Eval: These files contain the evaluation metrics for the mmlu, mmlu_branch, mt_bench, and mt_bench_branch benchmarks.
- Logs: These files contain the Red Hat AI InstructLab execution logs and system details.
- Model: These files contain the final output model in safetensors format. The contents of this directory are used by the model.
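If you already have a list of the object keys in your bucket, you can check which of these groups are present. The following is a rough Python sketch under stated assumptions: the function name is illustrative, the example keys are hypothetical, and the exact key layout beneath trained_models in your bucket might differ. It groups keys by the first path component under the trained_models directory.

```python
from collections import defaultdict


def group_trained_model_keys(keys, root="trained_models"):
    """Group object keys by the first path component beneath `root`."""
    marker = root + "/"
    groups = defaultdict(list)
    for key in keys:
        _, found, rest = key.partition(marker)
        if not found or "/" not in rest:
            continue  # Key is outside the directory or sits directly in it.
        subdirectory, _, relative_path = rest.partition("/")
        groups[subdirectory].append(relative_path)
    return dict(groups)


# Hypothetical keys for illustration only; real names depend on your bucket.
example_keys = [
    "trained_models/Eval/mmlu.json",
    "trained_models/Logs/run.log",
    "trained_models/Model/model.safetensors",
    "taxonomy.tar.gz",
]

for name, files in sorted(group_trained_model_keys(example_keys).items()):
    print(name, files)
```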
What's next?
Optional: You can deploy the model.