# Configure Parameters and Fine-Tune the Model

* *(Optional)* Modify `model_name_or_path` and `template` in `settings.jsonc` to use another locally downloaded model.
* Adjust `per_device_train_batch_size` and `gradient_accumulation_steps` to control **VRAM usage**.
* Depending on the **quantity and quality** of your dataset, adjust the following parameters in `train_sft_args` to improve results:
  * `num_train_epochs`
  * `lora_rank`
  * `lora_dropout`
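For reference, the relevant section of `settings.jsonc` might look like the sketch below. The values are illustrative placeholders, not recommendations, and the exact nesting of keys may differ between versions:

```jsonc
{
  "train_sft_args": {
    // smaller batch size + more accumulation steps = lower VRAM usage,
    // same effective batch size
    "per_device_train_batch_size": 2,
    "gradient_accumulation_steps": 8,
    // raise for larger, cleaner datasets; lower to curb overfitting
    "num_train_epochs": 2,
    "lora_rank": 8,
    "lora_dropout": 0.1
  }
}
```

The effective batch size is `per_device_train_batch_size × gradient_accumulation_steps × number_of_gpus`, so you can trade the first two factors against each other to fit available VRAM.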

**Single-GPU Training**

Run the following command to start fine-tuning with a single GPU:

```
weclone-cli train-sft
```

If you're in a **multi-GPU environment** but want to use only one GPU, run this command first:

```
export CUDA_VISIBLE_DEVICES=0
```

**Multi-GPU Training**

1. Uncomment the `deepspeed` line in `settings.jsonc`.
2. Install DeepSpeed:

```
uv pip install deepspeed
```

3. Start multi-GPU training (replace `number_of_gpus` with the number of GPUs you want to use):

```
deepspeed --num_gpus=number_of_gpus weclone/train/train_sft.py
```

**Run Web Demo for Inference**

You can use this step to **test appropriate `temperature` and `top_p` values**, and then update the `infer_args` in `settings.jsonc` for future inference.

```
weclone-cli webchat-demo
```
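Once you find values you like in the demo, you can persist them. A sketch of the corresponding `infer_args` entry (field names taken from the paragraph above; the values are placeholders):

```jsonc
{
  "infer_args": {
    // lower temperature = more deterministic replies
    "temperature": 0.7,
    // nucleus-sampling cutoff
    "top_p": 0.9
  }
}
```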

**Run API Server for Inference**

```
weclone-cli server
```
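As a quick smoke test of the server, the sketch below assumes it exposes an OpenAI-compatible `/v1/chat/completions` endpoint on `127.0.0.1:8005`; substitute the host and port the server actually prints on startup:

```python
import json
from urllib.request import Request, urlopen

# Assumed default address; use whatever `weclone-cli server` reports.
API_BASE = "http://127.0.0.1:8005/v1"

def build_chat_request(content: str, base_url: str = API_BASE) -> Request:
    """Build an OpenAI-style chat-completion request for the local server."""
    payload = {
        "model": "weclone",  # placeholder; a local server often ignores this field (assumption)
        "messages": [{"role": "user", "content": content}],
    }
    return Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

if __name__ == "__main__":
    # Requires `weclone-cli server` to be running in another terminal.
    with urlopen(build_chat_request("Are you free this weekend?")) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
```

The request builder is separated from the network call so you can inspect the payload before sending it.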

**Test with Common Chat Scenarios**

These test cases **exclude personal-information questions** and focus on everyday conversation.\
Test results are saved to `test_result-my.txt`.

```
weclone-cli server      # keep the API server running in one terminal
weclone-cli test-model  # run the tests from a second terminal
```
