Configure Parameters and Fine-Tune the Model

  • (Optional) Modify model_name_or_path and template in settings.jsonc to use another locally downloaded model.

  • Adjust per_device_train_batch_size and gradient_accumulation_steps to control VRAM usage.

  • Depending on the quantity and quality of your dataset, you can adjust the following values in train_sft_args to improve results:

    • num_train_epochs

    • lora_rank

    • lora_dropout
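Taken together, a minimal sketch of how these knobs might sit in settings.jsonc (the values are illustrative starting points, not recommendations, and the exact nesting should be checked against your own file):

```jsonc
{
  "train_sft_args": {
    "per_device_train_batch_size": 2,  // lower to reduce VRAM usage
    "gradient_accumulation_steps": 8,  // raise to keep the effective batch size when the above is lowered
    "num_train_epochs": 3,             // more epochs can help small, clean datasets; watch for overfitting
    "lora_rank": 8,                    // higher rank adds capacity at the cost of VRAM
    "lora_dropout": 0.1                // regularization against memorizing the dataset
  }
}
```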

Single-GPU Training

Run the following command to start fine-tuning with a single GPU:

weclone-cli train-sft

If you're in a multi-GPU environment but want to use only one GPU, run this command first:

export CUDA_VISIBLE_DEVICES=0

Multi-GPU Training

  1. Uncomment the deepspeed line in settings.jsonc.

  2. Install DeepSpeed:

uv pip install deepspeed

  3. Start multi-GPU training (replace number_of_gpus with the number of GPUs you want to use):
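The launch command for this step isn't shown above. A plausible invocation, assuming the SFT entry point lives at weclone/train/train_sft.py (a guess — check your checkout), uses the standard DeepSpeed launcher:

```shell
# Hypothetical entry-point path; verify it against your WeClone checkout.
deepspeed --num_gpus=number_of_gpus weclone/train/train_sft.py
```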

Run Web Demo for Inference
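The exact command is not shown in this export. Assuming the demo follows the same CLI pattern as the training step (the subcommand name is an assumption), it would look like:

```shell
# Subcommand name assumed -- confirm with `weclone-cli --help`.
weclone-cli webchat-demo
```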

You can use this step to test appropriate temperature and top_p values, and then update the infer_args in settings.jsonc for future inference.
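Once you've settled on values you like in the demo, they can be written back into settings.jsonc. A sketch, assuming infer_args takes standard sampling fields (verify the field names against your own file):

```jsonc
{
  "infer_args": {
    "temperature": 0.7,  // higher values give more varied replies
    "top_p": 0.9         // nucleus-sampling cutoff
  }
}
```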

Run API Server for Inference
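Servers in this kind of project typically expose an OpenAI-compatible chat endpoint. The sketch below only builds and prints such a request; the host, port, and model name are placeholders, not values from this guide — adjust them to match your settings.jsonc:

```python
import json
from urllib import request

# Placeholder endpoint: adjust host/port to match your settings.jsonc.
API_URL = "http://127.0.0.1:8005/v1/chat/completions"

# OpenAI-style chat payload; the model name is a placeholder the server
# may ignore or map to the fine-tuned weights.
payload = {
    "model": "weclone",
    "messages": [{"role": "user", "content": "Hey, are you free this weekend?"}],
    "temperature": 0.7,  # reuse the values you settled on in the web demo
    "top_p": 0.9,
}

req = request.Request(
    API_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# Uncomment once the API server is actually running:
# with request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
print(json.dumps(payload, indent=2))
```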

Test with Common Chat Scenarios

These test cases avoid personal-information questions and focus on everyday conversation. Test results are saved to test_result-my.txt.
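The command that runs these test cases isn't shown here; if it follows the CLI pattern used elsewhere in this guide, it likely resembles (subcommand name unverified):

```shell
# Subcommand name assumed -- confirm with `weclone-cli --help`.
weclone-cli test-model
```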
