Fine-tuning GPT-2

แžแžพแžขแŸ’แžœแžธแž‘แŸ…แž‡แžถ Fine-tuning?

Fine-tuning is a machine learning technique used to improve a model that has already been trained, by training it further on data specific to a particular task or domain. It is the process of adapting an existing pre-trained model so that it performs better on that specific task.

Key points about fine-tuning:

  • Starts from a pre-trained model: fine-tuning begins with a model that has already been trained on a large dataset.
  • Adds further training: that model is then trained on a smaller dataset specific to the target task or domain.
  • Retains prior knowledge: fine-tuning preserves the general knowledge the model learned before, while adapting it to the new task.
  • Saves resources: it needs less time and data than training a model from scratch.

แž แŸแžแžปแžขแŸ’แžœแžธแž”แžถแž“แž‡แžถ Fine-tuning แž˜แžถแž“แžŸแžถแžšแŸˆแžŸแŸ†แžแžถแž“แŸ‹?

  • Better performance: it can raise the model's performance on specific tasks.
  • Domain adaptation: it lets the model adapt to the language and vocabulary of a particular industry or domain.
  • Lower resource usage: it is more efficient than training a model from scratch.
  • Fast delivery: it lets developers build high-performing models in a short amount of time.
Fine-tuning แžแŸ’แžšแžผแžœแž”แžถแž“แž”แŸ’แžšแžพแž”แŸ’แžšแžถแžŸแŸ‹แž™แŸ‰แžถแž„แž‘แžผแž›แŸ†แž‘แžผแž›แžถแž™แž€แŸ’แž“แžปแž„แž€แžถแžšแžšแŸ€แž“แž˜แŸ‰แžถแžŸแŸŠแžธแž“ แž‡แžถแž–แžทแžŸแŸแžŸแž€แŸ’แž“แžปแž„ Natural Language Processing (NLP) แžŸแž˜แŸ’แžšแžถแž”แŸ‹แž—แžถแžšแž€แžทแž…แŸ’แž…แžŠแžผแž…แž‡แžถแž€แžถแžšแžœแžทแž—แžถแž‚แžขแžถแžšแž˜แŸ’แž˜แžŽแŸ แž€แžถแžšแž†แŸ’แž›แžพแž™แžŸแŸ†แžŽแžฝแžš แž“แžทแž„แž€แžถแžšแž”แž€แž”แŸ’แžšแŸ‚แž—แžถแžŸแžถแŸ”

Code Example

  • In this guide, we walk through the process of fine-tuning a GPT-2 model on a custom dataset using the Hugging Face Transformers library.
  • For this example, we use the train split of the “emotion” dataset from the Hugging Face Datasets library. It contains short texts, each labeled with an emotion; see the quick look at the data after the install commands below.
  • First, install the required libraries:
!pip install transformers datasets torch
!pip install transformers[torch] -U
!pip install accelerate -U
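
Before fine-tuning, it is worth taking a quick look at what the data contains. A minimal sketch (the record shown in the comments is illustrative):

from datasets import load_dataset

# Peek at the "emotion" training split: each record has a "text" string and an
# integer "label" that maps to an emotion name
dataset = load_dataset("emotion", split="train")
print(dataset)                          # row count and column names
print(dataset[0])                       # e.g. {'text': '...', 'label': 0}
print(dataset.features["label"].names)  # label id -> emotion name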

แž“แŸแŸ‡แž‚แžบแž‡แžถแžŸแŸ’แž‚แŸ’แžšแžธแž” Python แžŠแŸ‚แž›แž”แž„แŸ’แž แžถแž‰แž–แžธแžŠแŸ†แžŽแžพแžšแž€แžถแžš Fine-tuningแŸ–

import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer, DataCollatorForLanguageModeling
from transformers import Trainer, TrainingArguments
from datasets import load_dataset

# Load pre-trained model and tokenizer
model_name = "gpt2"
model = GPT2LMHeadModel.from_pretrained(model_name)
tokenizer = GPT2Tokenizer.from_pretrained(model_name)

# GPT-2 defines no padding token, so reuse the end-of-sequence token for padding
tokenizer.pad_token = tokenizer.eos_token

# Load the training split of the emotion dataset
# (published on the Hugging Face Hub as "dair-ai/emotion"; the short id still resolves)
dataset = load_dataset("emotion", split="train")

def preprocess_function(examples):
    # Prepend an "Emotion: " prefix to each text and tokenize to a fixed length
    return tokenizer(
        [f"Emotion: {text}" for text in examples["text"]],
        truncation=True,
        padding="max_length",
        max_length=64,
    )

tokenized_dataset = dataset.map(preprocess_function, batched=True, remove_columns=dataset.column_names)

# Convert to PyTorch tensors
tokenized_dataset.set_format("torch")

# Create a data collator for causal language modeling: with mlm=False the
# collator copies input_ids into labels, so the model trains on next-token prediction
data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer,
    mlm=False
)

# Define training arguments
training_args = TrainingArguments(
    output_dir="./gpt2-emotion-finetuned",
    overwrite_output_dir=True,
    num_train_epochs=3,
    per_device_train_batch_size=4,
    save_steps=10_000,
    save_total_limit=2,
)

# Create Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    data_collator=data_collator,
    train_dataset=tokenized_dataset,
)

# Fine-tune the model
trainer.train()

# Save the fine-tuned model
model.save_pretrained("./gpt2-emotion-finetuned")
tokenizer.save_pretrained("./gpt2-emotion-finetuned")
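
If you also want to track performance on held-out data during training, a minimal variation is to tokenize the dataset's validation split and pass it to the Trainer. This sketch reuses preprocess_function, data_collator, and tokenized_dataset from the script above; note that newer transformers releases name the argument eval_strategy instead of evaluation_strategy:

# Optional: evaluate on the validation split at the end of each epoch
eval_split = load_dataset("emotion", split="validation")
tokenized_eval = eval_split.map(
    preprocess_function, batched=True, remove_columns=eval_split.column_names
)
tokenized_eval.set_format("torch")

training_args = TrainingArguments(
    output_dir="./gpt2-emotion-finetuned",
    num_train_epochs=3,
    per_device_train_batch_size=4,
    evaluation_strategy="epoch",  # eval_strategy on newer transformers versions
)

trainer = Trainer(
    model=model,
    args=training_args,
    data_collator=data_collator,
    train_dataset=tokenized_dataset,
    eval_dataset=tokenized_eval,
)
trainer.train()  # evaluation loss is reported after each epoch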

After fine-tuning, you can use the model to generate text conditioned on an emotion:

# Load the fine-tuned model and tokenizer
fine_tuned_model = GPT2LMHeadModel.from_pretrained("./gpt2-emotion-finetuned")
fine_tuned_tokenizer = GPT2Tokenizer.from_pretrained("./gpt2-emotion-finetuned")

# Generate text
prompt = "Emotion: i didnt feel well"
input_ids = fine_tuned_tokenizer.encode(prompt, return_tensors="pt")
output = fine_tuned_model.generate(
    input_ids,
    max_length=100,
    num_return_sequences=1,
    no_repeat_ngram_size=2,  # block repeated 2-grams
    pad_token_id=fine_tuned_tokenizer.eos_token_id,  # avoid the missing-pad-token warning
)

generated_text = fine_tuned_tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_text)
Output:
Emotion: i didnt feel well enough to go to the doctor but i was feeling good enough that i could go for a walk and not feel so bad about it all the time and that was good for me too because i had been feeling pretty good about my health for the past week and a half and i just needed to get through it without feeling like i got a bad grade or something and then i would be fine again and again until i went to my doctor and got my results and it was
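
The same generation can also be done through the text-generation pipeline, which wraps the tokenizer, the model, and the decoding loop in one object (a minimal sketch; decoding parameters are illustrative):

from transformers import pipeline

# Load the fine-tuned checkpoint (model and tokenizer) from the output directory
generator = pipeline("text-generation", model="./gpt2-emotion-finetuned")
result = generator("Emotion: i didnt feel well", max_length=100, no_repeat_ngram_size=2)
print(result[0]["generated_text"])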

แžงแž‘แžถแž แžšแžŽแŸแž“แŸแŸ‡แž”แž„แŸ’แž แžถแž‰แž–แžธแžšแž”แŸ€แž”แž’แŸ’แžœแžพแžฑแŸ’แž™แž”แŸ’แžšแžŸแžพแžšแžกแžพแž„แž“แžผแžœ GPT-2 แž›แžพแžŸแŸ†แžŽแžปแŸ†แž‘แžทแž“แŸ’แž“แž“แŸแž™ “emotion” แž“แžทแž„แž”แŸ’แžšแžพแžœแžถแžŠแžพแž˜แŸ’แž”แžธแž”แž„แŸ’แž€แžพแžแžขแžแŸ’แžแž”แž‘แžŠแŸ‚แž›แž•แŸ’แžขแŸ‚แž€แž›แžพแžขแžถแžšแž˜แŸ’แž˜แžŽแŸแŸ” แžŸแžผแž˜แž…แž„แž…แžถแŸ†แžแžถแžแŸ’แžšแžผแžœแž€แŸ‚แžŸแž˜แŸ’แžšแžฝแž›แž”แŸ‰แžถแžšแŸ‰แžถแž˜แŸ‰แŸ‚แžแŸ’แžš แž“แžทแž„แž‘แŸ†แž แŸ†แžŸแŸ†แžŽแžปแŸ†แž‘แžทแž“แŸ’แž“แž“แŸแž™แž‘แŸ…แžแžถแž˜แžแž˜แŸ’แžšแžผแžœแž€แžถแžšแž‡แžถแž€แŸ‹แž›แžถแž€แŸ‹แžšแž”แžŸแŸ‹แžขแŸ’แž“แž€ แž“แžทแž„แž’แž“แž’แžถแž“แž‚แžŽแž“แžถแžŠแŸ‚แž›แž˜แžถแž“แŸ”

References
1. Hugging Face Transformers Library Documentation:
   - Main documentation: https://huggingface.co/transformers/
   - Fine-tuning tutorial: https://huggingface.co/transformers/training.html

2. GPT-2 Model:
   - Model card: https://huggingface.co/gpt2

3. Datasets Library:
   - Main documentation: https://huggingface.co/docs/datasets/
   - Emotion dataset: https://huggingface.co/datasets/emotion

4. Hugging Face's language modeling example:
   - https://github.com/huggingface/transformers/tree/main/examples/pytorch/language-modeling

5. Fine-tuning GPT-2 for text generation tutorial:
   - https://huggingface.co/blog/how-to-generate