Dataset Preparation
Once you have chosen an LLM and are on the Dataset Preparation page:
Choose a Task and Dataset:
- Click the Select a Task dropdown and choose the type of task you are training for, such as:
![](https://files.readme.io/dde5071-spaces_uWXaKg3nypyiLCPH1Pv1_uploads_DXtMlkRYJ4dY3dRJVDUO_Screenshot_from_2023-08-01_09-28-33.webp)
- Then click the Choose Dataset dropdown. You have three options here:
  - Choose a pre-curated HuggingFace dataset,
  - Specify any other HuggingFace dataset by selecting the "other" option, or
  - Use your own dataset, uploaded via the 'Dataset management' portal.
Using HuggingFace Datasets:
- To use a pre-curated HuggingFace dataset, simply select it from the dropdown list.
- To use an unlisted HuggingFace dataset, choose the "other" option and enter the name of the HuggingFace dataset to be fetched, as shown below:
![](https://files.readme.io/d4dfb84-finetune2.png)
Using your own Datasets:
If you have already uploaded a dataset using the 'Dataset management' portal, it appears as an option under the "My Datasets" section of the dropdown, as shown below:
![](https://files.readme.io/98e21f9-finetune3.png)
Prompt Configuration:
After you select a dataset, a section called Prompt configuration appears.
How much of this section you need to modify depends on the dataset you selected.
For pre-curated HuggingFace Datasets:
If you select a pre-curated HuggingFace dataset, no prompt configuration is required because the section is pre-filled. Simply click "Next" to proceed.
For other HuggingFace Datasets or your own Dataset:
Replace the placeholders inside the curly braces with the actual names of the columns in your dataset that you want to use for fine-tuning:
![](https://files.readme.io/2795b88-finetune4.png)
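The placeholder names need to match the dataset's column names exactly. As a rough illustration of that requirement, the sketch below pulls the placeholder names out of a template and reports any that have no matching column. The template text, the helper names `placeholder_names` and `missing_columns`, and the sample column list are all hypothetical; this is not a description of how the FineTuner validates its input.

```python
from string import Formatter


def placeholder_names(template: str) -> list[str]:
    """Return the field names that appear inside {...} in the template."""
    return [field for _, field, _, _ in Formatter().parse(template) if field]


def missing_columns(template: str, columns: list[str]) -> list[str]:
    """Return the placeholders that do not match any dataset column."""
    available = set(columns)
    return [name for name in placeholder_names(template) if name not in available]


# Hypothetical template and column names, for illustration only.
template = "### Instruction:\n{prompt}\n\n### Response:\n{response}"
columns = ["prompt", "response", "source"]

# An empty list means every placeholder matches a column.
print(missing_columns(template, columns))
```

Running a check like this on your own column names before filling in the form can catch a misspelled or differently cased column name early.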
For example, if the instruction and response columns in our dataset are named prompt and response, we will replace
- {replace with instruction column name} and
- {replace with response column name}
with
- {prompt} and
- {response}
respectively.
After these changes, the updated data preparation window looks like this:
![](https://files.readme.io/0a3cc52-finetune5.png)
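Conceptually, a prompt template of this kind is filled in once per dataset row, with each {column} placeholder replaced by that row's value. The sketch below shows the idea with plain Python string formatting. The template wording, the `render_example` helper, and the sample row are assumptions made for illustration, and the FineTuner's actual internals may differ.

```python
def render_example(template: str, row: dict[str, str]) -> str:
    """Fill each {column} placeholder with the value from one dataset row."""
    # format_map raises KeyError if a placeholder has no matching column.
    return template.format_map(row)


# Hypothetical template and row, for illustration only.
template = "### Instruction:\n{prompt}\n\n### Response:\n{response}"
row = {
    "prompt": "Translate 'hello' to French.",
    "response": "Bonjour.",
}

print(render_example(template, row))
```

Columns in the row that the template does not reference are simply ignored, which is why only the target columns need to be named in the configuration.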
And we are done! Our FineTuner will take care of the rest.
Simply click "Next" and finalize your fine-tuning job request.