Quicktour
🤗 PEFT contains parameter-efficient finetuning methods for training large pretrained models. The traditional paradigm is to finetune all of a model’s parameters for each downstream task, but this is becoming exceedingly costly and impractical because of the enormous number of parameters in models today. Instead, it is more efficient to train a smaller number of prompt parameters or use a reparametrization method like low-rank adaptation (LoRA) to reduce the number of trainable parameters.
This quicktour will show you 🤗 PEFT’s main features and help you train large pretrained models that would typically be inaccessible on consumer devices. You’ll see how to train the 1.2B parameter bigscience/mt0-large
model with LoRA to generate a classification label and use it for inference.
PeftConfig
Each 🤗 PEFT method is defined by a PeftConfig class that stores all the important parameters for building a PeftModel.
Because you’re going to use LoRA, you’ll need to create a LoraConfig class. Within LoraConfig
, specify the following parameters:
- task_type, the type of task to train for (sequence-to-sequence language modeling in this case)
- inference_mode, whether you’re using the model for inference or not
- r, the dimension of the low-rank matrices
- lora_alpha, the scaling factor for the low-rank matrices
- lora_dropout, the dropout probability of the LoRA layers
from peft import LoraConfig, TaskType
peft_config = LoraConfig(task_type=TaskType.SEQ_2_SEQ_LM, inference_mode=False, r=8, lora_alpha=32, lora_dropout=0.1)
💡 See the LoraConfig reference for more details about other parameters you can adjust.
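For instance, beyond the required settings above, LoraConfig also lets you choose which modules LoRA is applied to and whether bias parameters are trained. The sketch below is only illustrative; the target_modules names depend on the base model’s architecture (for mt0/T5-style models the query and value projections are named "q" and "v").
from peft import LoraConfig, TaskType
# Illustrative only: target_modules names depend on the base model architecture.
peft_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    inference_mode=False,
    r=8,
    lora_alpha=32,
    lora_dropout=0.1,
    target_modules=["q", "v"],  # apply LoRA to the query and value projections
    bias="none",                # don't train bias parameters
)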
PeftModel
A PeftModel is created by the get_peft_model()
function. It takes a base model - which you can load from the 🤗 Transformers library - and the PeftConfig containing the instructions for how to configure a model for a specific 🤗 PEFT method.
Start by loading the base model you want to finetune.
from transformers import AutoModelForSeq2SeqLM
model_name_or_path = "bigscience/mt0-large"
tokenizer_name_or_path = "bigscience/mt0-large"
model = AutoModelForSeq2SeqLM.from_pretrained(model_name_or_path)
Wrap your base model and peft_config
with the get_peft_model
function to create a PeftModel. To get a sense of the number of trainable parameters in your model, use the print_trainable_parameters
method. In this case, you’re only training 0.19% of the model’s parameters! 🤏
from peft import get_peft_model
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()
"output: trainable params: 2359296 || all params: 1231940608 || trainable%: 0.19151053100118282"
That is it 🎉! Now you can train the model using the 🤗 Transformers Trainer, 🤗 Accelerate, or any custom PyTorch training loop.
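As a rough sketch of what that looks like with the 🤗 Transformers Trainer (the dataset, collator, and hyperparameters below are placeholders you’d substitute with your own; they are not part of this guide):
from transformers import TrainingArguments, Trainer
training_args = TrainingArguments(
    output_dir="output_dir",
    learning_rate=1e-3,           # placeholder hyperparameters
    per_device_train_batch_size=8,
    num_train_epochs=3,
)
trainer = Trainer(
    model=model,                  # the PeftModel returned by get_peft_model()
    args=training_args,
    train_dataset=train_dataset,  # assumed: a tokenized dataset you prepared
    data_collator=data_collator,  # assumed: e.g. DataCollatorForSeq2Seq
)
trainer.train()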
Save and load a model
After your model is finished training, you can save it to a directory using the save_pretrained function. You can also push it to the Hub (make sure you log in to your Hugging Face account first) with the push_to_hub function.
model.save_pretrained("output_dir")
# if pushing to Hub
from huggingface_hub import notebook_login
notebook_login()
model.push_to_hub("my_awesome_peft_model")
This only saves the incremental 🤗 PEFT weights that were trained, meaning it is super efficient to store, transfer, and load. For example, this bigscience/T0_3B
model trained with LoRA on the twitter_complaints
subset of the RAFT dataset only contains two files: adapter_config.json
and adapter_model.bin
. The latter file is just 19MB!
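If you saved locally, you can verify that only the small adapter files were written (a quick check, not part of the original guide; newer PEFT versions save adapter_model.safetensors instead of adapter_model.bin):
import os
# List what save_pretrained("output_dir") actually wrote -- just the adapter
# config and weights, not a full copy of the base model.
for f in os.listdir("output_dir"):
    size_mb = os.path.getsize(os.path.join("output_dir", f)) / 1e6
    print(f"{f}: {size_mb:.1f} MB")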
Easily load your model for inference using the from_pretrained function:
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
+ from peft import PeftModel, PeftConfig
+ peft_model_id = "smangrul/twitter_complaints_bigscience_T0_3B_LORA_SEQ_2_SEQ_LM"
+ config = PeftConfig.from_pretrained(peft_model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(config.base_model_name_or_path)
+ model = PeftModel.from_pretrained(model, peft_model_id)
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
device = "cuda"  # this example assumes a GPU is available
model = model.to(device)
model.eval()
inputs = tokenizer("Tweet text : @HondaCustSvc Your customer service has been horrible during the recall process. I will never purchase a Honda again. Label :", return_tensors="pt")
with torch.no_grad():
outputs = model.generate(input_ids=inputs["input_ids"].to("cuda"), max_new_tokens=10)
print(tokenizer.batch_decode(outputs.detach().cpu().numpy(), skip_special_tokens=True)[0])
'complaint'
Easy loading with Auto classes
If you have saved your adapter locally or on the Hub, you can leverage the AutoPeftModelForxxx
classes and load any PEFT model with a single line of code:
- from peft import PeftConfig, PeftModel
- from transformers import AutoModelForCausalLM
+ from peft import AutoPeftModelForCausalLM
- peft_config = PeftConfig.from_pretrained("ybelkada/opt-350m-lora")
- base_model_path = peft_config.base_model_name_or_path
- transformers_model = AutoModelForCausalLM.from_pretrained(base_model_path)
- peft_model = PeftModel.from_pretrained(transformers_model, peft_config)
+ peft_model = AutoPeftModelForCausalLM.from_pretrained("ybelkada/opt-350m-lora")
Currently, supported auto classes are: AutoPeftModelForCausalLM
, AutoPeftModelForSequenceClassification
, AutoPeftModelForSeq2SeqLM
, AutoPeftModelForTokenClassification
, AutoPeftModelForQuestionAnswering
and AutoPeftModelForFeatureExtraction
. For other tasks (e.g. Whisper, StableDiffusion), you can load the model with:
- from peft import PeftModel, PeftConfig, AutoPeftModel
+ from peft import AutoPeftModel
- from transformers import WhisperForConditionalGeneration
- model_id = "smangrul/openai-whisper-large-v2-LORA-colab"
peft_model_id = "smangrul/openai-whisper-large-v2-LORA-colab"
- peft_config = PeftConfig.from_pretrained(peft_model_id)
- model = WhisperForConditionalGeneration.from_pretrained(
- peft_config.base_model_name_or_path, load_in_8bit=True, device_map="auto"
- )
- model = PeftModel.from_pretrained(model, peft_model_id)
+ model = AutoPeftModel.from_pretrained(peft_model_id)
Next steps
Now that you’ve seen how to train a model with one of the 🤗 PEFT methods, we encourage you to try out some of the other methods like prompt tuning. The steps are very similar to the ones shown in this quicktour: prepare a PeftConfig for a 🤗 PEFT method, and use get_peft_model() to create a PeftModel from the configuration and base model. Then you can train it however you like!
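For example, a minimal sketch of swapping LoRA out for prompt tuning on the same base model (the num_virtual_tokens value here is just an illustrative choice):
from transformers import AutoModelForSeq2SeqLM
from peft import PromptTuningConfig, TaskType, get_peft_model
model = AutoModelForSeq2SeqLM.from_pretrained("bigscience/mt0-large")
# Prompt tuning only trains a small set of virtual token embeddings.
peft_config = PromptTuningConfig(task_type=TaskType.SEQ_2_SEQ_LM, num_virtual_tokens=20)
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()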
Feel free to also take a look at the task guides if you’re interested in training a model with a 🤗 PEFT method for a specific task such as semantic segmentation, multilingual automatic speech recognition, DreamBooth, and token classification.