Fine‑tune a lightweight transformer (DistilBERT) on the Twitter‑Airline Sentiment dataset and benchmark it against a classical TF‑IDF + Logistic Regression baseline.
| Item | Value |
|---|---|
| Model | `distilbert-base-uncased` |
| Training split | 90 % of cleaned data (stratified) |
| Validation split | 10 % (held out during fine-tuning) |
| Test set | Untouched split created in `04_baseline_model.ipynb` |
| Artifacts saved to | `models/distilbert_twitter/` |
## 2. Load Pre-Split Feather Data
| Split | Rows | Columns |
|---|---|---|
| train | 11 712 | text (tweet) |
| val | 1 464 | text (tweet) |
Assertion checks ensure that every tweet is paired with exactly one sentiment label.
Take-away: the data has already been cleaned and stratified earlier in the pipeline — nothing to redo here, only load and verify.
```python
# Load pre-made Feather splits
def _load_xy_split(split: str):
    """
    Return (X, y) for the given split.
    X : DataFrame with 'text'
    y : Series with 'label'
    """
    X = pd.read_feather(PROC_DIR / f"X_{split}.ftr")           # ['text']
    y = pd.read_feather(PROC_DIR / f"y_{split}.ftr")["label"]
    return X, y

X_train, y_train = _load_xy_split("train")
X_val, y_val = _load_xy_split("val")

# Sanity checks: one text column, one label series, equal lengths
for name, X, y in [("train", X_train, y_train), ("val", X_val, y_val)]:
    assert list(X.columns) == ["text"]
    assert y.name == "label"
    assert len(X) == len(y)
    print(f"{name:5} | rows: {len(X):,}")

display(X_train.head())
display(y_train.head())
```
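The model-loading cell below references `LABELS`, `id2label`, and `label2id`, which are defined elsewhere in the notebook. A minimal sketch of what those mappings look like, assuming the usual three Twitter-Airline sentiment classes in this order (the notebook's actual ordering may differ):

```python
# Hypothetical reconstruction — the notebook defines these elsewhere.
# Assumes the three airline-sentiment classes, lowest id first.
LABELS = ["negative", "neutral", "positive"]
id2label = {i: lab for i, lab in enumerate(LABELS)}
label2id = {lab: i for i, lab in enumerate(LABELS)}

print(id2label)  # {0: 'negative', 1: 'neutral', 2: 'positive'}
```

Passing both mappings to `from_pretrained` bakes human-readable class names into the model config, so downstream `pipeline` calls report `"negative"` instead of `"LABEL_0"`.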
```python
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased",
    num_labels=len(LABELS),
    id2label=id2label,
    label2id=label2id,
)
```
Some weights of DistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight', 'pre_classifier.bias', 'pre_classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
## 5. Training Configuration
| Hyper-param | Value |
|---|---|
| Epochs | 2 |
| Batch size | 16 |
| Learning rate | 2 × 10⁻⁵ |
| Weight decay | 0.01 |
| Eval / Save cadence | once per epoch |
| Best-model criterion | val_f1 (macro) |
TrainingArguments keeps only the last 2 checkpoints to save disk space.
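The table above maps onto a `TrainingArguments` configuration along these lines (a sketch, not the notebook's exact cell; argument names vary slightly across `transformers` versions — `eval_strategy` was previously spelled `evaluation_strategy`):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="models/distilbert_twitter",
    num_train_epochs=2,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    learning_rate=2e-5,
    weight_decay=0.01,
    eval_strategy="epoch",        # evaluate once per epoch
    save_strategy="epoch",        # checkpoint once per epoch
    load_best_model_at_end=True,  # restore the best checkpoint when done
    metric_for_best_model="f1",   # the macro F1 reported by compute_metrics
    save_total_limit=2,           # keep only the last 2 checkpoints on disk
)
```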
c:\Users\justi\Anaconda3\envs\twitter-sentiment-env\Lib\site-packages\torch\utils\data\dataloader.py:665: UserWarning: 'pin_memory' argument is set as true but no accelerator is found, then device pinned memory won't be used.
warnings.warn(warn_msg)
[1464/1464 1:15:00, Epoch 2/2]

| Epoch | Training Loss | Validation Loss | Accuracy | F1 |
|---|---|---|---|---|
| 1 | 0.485000 | 0.410365 | 0.837432 | 0.787987 |
| 2 | 0.319500 | 0.419298 | 0.840164 | 0.798038 |
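The F1 column is a macro average over the three sentiment classes. The notebook's metric helper isn't shown, but a `compute_metrics` function compatible with `Trainer` typically looks like this (names here are illustrative):

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

def compute_metrics(eval_pred):
    """Illustrative Trainer metric fn: accuracy + macro-averaged F1."""
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)  # highest-scoring class per example
    return {
        "accuracy": accuracy_score(labels, preds),
        "f1": f1_score(labels, preds, average="macro"),
    }

# Tiny smoke test with fake 3-class logits
logits = np.array([[2.0, 0.1, 0.0],
                   [0.0, 1.5, 0.2],
                   [0.1, 0.2, 3.0],
                   [1.0, 0.9, 0.0]])
labels = np.array([0, 1, 2, 1])
print(compute_metrics((logits, labels)))
```

Macro averaging weights each class equally, which matters here because the airline dataset is heavily skewed toward negative tweets; plain accuracy would overstate performance on the minority classes.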