Dataset Preparation Examples

This guide gives two use-case-driven examples of preparing a dataset with MonsterAPI's no-code LLM Finetuner.

Text Classification Use-case

Let us consider an example of finetuning a Large Language Model (LLM) for a text classification use case.

After we have chosen an LLM, we select a task called "Text Classification":

For the dataset option, we can choose either:

  • A pre-curated dataset, or
  • Our own dataset (check out the Managed Datasets section), or
  • The "Other" option, which lets us specify a HuggingFace dataset of our choice.

Let us select the "Other" option and specify a dataset called "sms_spam" for text classification:

This is what the "sms_spam" dataset looks like:
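The same columns can also be previewed outside the UI. The following is a minimal sketch and not part of MonsterAPI: `column_types` is a hypothetical helper, the two sample rows are made up for illustration, and the column names `sms` and `label` are assumed from the dataset's HuggingFace card at the time of writing.

```python
def column_types(rows):
    """Map each column name to the Python type name of its first non-None value."""
    types = {}
    for row in rows:
        for name, value in row.items():
            if name not in types and value is not None:
                types[name] = type(value).__name__
    return types


# Made-up rows in the shape of "sms_spam" (assumed labels: 0 = ham, 1 = spam).
sample_rows = [
    {"sms": "Are we still meeting for lunch today?", "label": 0},
    {"sms": "WINNER! Claim your free prize now.", "label": 1},
]

print(column_types(sample_rows))  # prints {'sms': 'str', 'label': 'int'}
```

If the HuggingFace `datasets` library is installed, rows obtained from its `load_dataset` function could be passed to the same helper instead of the sample rows.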

This is how we specify the target columns in the dataset preparation window. We replaced the placeholders in square brackets with the corresponding column names from our dataset:
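No code is required for this step. Purely as an illustration of what bracketed column names mean, the sketch below fills a template from one dataset row. The template text, the `render_prompt` helper, and the sample row are all hypothetical; MonsterAPI's actual template and substitution logic may differ.

```python
import re


def render_prompt(template, row):
    """Replace every [column] placeholder with that column's value from the row."""
    def substitute(match):
        column = match.group(1)
        if column not in row:
            raise KeyError(f"column {column!r} not found in dataset row")
        return str(row[column])

    # Match text inside a single pair of square brackets, e.g. [sms].
    return re.sub(r"\[([^\[\]]+)\]", substitute, template)


# Hypothetical template after the placeholders were set to "sms_spam" column names.
template = "Classify the message as spam or ham.\nMessage: [sms]\nLabel: [label]"
row = {"sms": "WINNER! Claim your free prize now.", "label": 1}

print(render_prompt(template, row))
```

A placeholder that does not match any column raises a `KeyError`, which is the kind of mistake worth catching before moving on to the next step.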

That's all you need to do for preparing your dataset for finetuning on MonsterAPI.

Once you have confirmed that all the column names match those of your chosen HuggingFace dataset, click "Next" to proceed to the next step, where you'll define the hyperparameters.



Summary Generation Use-case

Let us consider an example of finetuning a Large Language Model (LLM) for a summary generation use case.

After we have chosen an LLM, we select a task called "Summary-Generator":

For the dataset option, we can choose either:

  • A pre-curated dataset, or
  • Our own dataset (check out the Managed Datasets section), or
  • The "Other" option, which lets us specify a HuggingFace dataset of our choice.

Let us select the "Other" option and specify a dataset called "xsum" for summary generation:

This is what the "xsum" dataset looks like:

This is how we specify the target columns in the dataset preparation window. We replaced the placeholders in square brackets with the corresponding column names from our dataset:

That's all you need to do for preparing your dataset for finetuning on MonsterAPI.

Once you have confirmed that all the column names match those of your chosen HuggingFace dataset, click "Next" to proceed to the next step, where you'll define the hyperparameters.
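The column-name check can also be done programmatically before clicking "Next". The sketch below is one possible approach, not a MonsterAPI feature: `missing_columns` is a hypothetical helper, the template text is invented, and the "xsum" column names (`document`, `summary`, `id`) are assumed from the dataset's HuggingFace card at the time of writing.

```python
import re


def missing_columns(template, available_columns):
    """Return bracketed names in the template that are not dataset columns."""
    requested = re.findall(r"\[([^\[\]]+)\]", template)
    available = set(available_columns)
    return [name for name in requested if name not in available]


# Assumed column names of the "xsum" dataset.
xsum_columns = ["document", "summary", "id"]

# Hypothetical template with placeholders set to "xsum" column names.
template = "Summarize the article.\nArticle: [document]\nSummary: [summary]"

print(missing_columns(template, xsum_columns))           # prints []
print(missing_columns("Article: [text]", xsum_columns))  # prints ['text']
```

An empty list means every bracketed name corresponds to a real column. Any name that is returned should be corrected in the dataset preparation window.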

If you have any questions, don't hesitate to reach out to us.