Sentiment Analysis
Inferring tone from text
Automatically determining the emotional tone (positive, negative, neutral) of text — the foundation of customer review, social media, and feedback analysis.
Sentiment analysis quantifies the writer's attitude. The simplest form is three-way classification (positive / negative / neutral); richer setups score intensity (-1 to +1), multiple dimensions (joy, anger, surprise), or aspect-based sentiment (positive about delivery, negative about price within the same review).
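For example, a single review can carry opposite signals per aspect; a hypothetical aspect-level result might look like this (the structure is illustrative, not any particular library's output):

```python
# Hypothetical aspect-based output for one review (structure is illustrative).
review = "Delivery was fast, but the price is outrageous."
aspect_sentiment = {
    "delivery": {"label": "positive", "intensity": 0.8},
    "price":    {"label": "negative", "intensity": -0.7},
}
```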
Approaches evolved:
- Lexicon-based (VADER, AFINN): each word has a preset sentiment score; the sentence score is a sum or average. Fast, explainable; weak on negation, irony, context.
- Classical ML (Naive Bayes, logistic regression + TF-IDF): a classifier trained on labeled reviews. Solid baseline.
- Pre-transformer deep learning (LSTM, CNN): captures context; needs data.
- Transformer-based (BERT, RoBERTa, multilingual): today's default. Take a pre-trained model and fine-tune on 1K–10K labels for production-grade accuracy.
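As a concrete illustration of the lexicon approach above, a minimal sketch with the `vaderSentiment` package; the ±0.05 thresholds on the compound score are the convention from VADER's documentation:

```python
# Lexicon-based scoring with VADER (pip install vaderSentiment).
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()
scores = analyzer.polarity_scores("The food was great, but service was painfully slow.")
print(scores)  # {'neg': ..., 'neu': ..., 'pos': ..., 'compound': ...}

# 'compound' is normalized to [-1, +1]; the documented convention:
label = ("positive" if scores["compound"] >= 0.05
         else "negative" if scores["compound"] <= -0.05
         else "neutral")
print(label)
```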
Open models exist for many languages. Label quality is what drives real performance. Morphology, irony, slang, and regional usage are perennial challenges in sentiment.
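The fine-tuning step itself, sketched with the Hugging Face Trainer API; the model choice, hyperparameters, and the `train_ds`/`eval_ds` datasets are placeholder assumptions, not a recipe:

```python
# Sketch: fine-tune a pre-trained encoder on your own labels (3-way here).
# Assumes train_ds / eval_ds are `datasets` Datasets with "text" and "label" columns.
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "distilbert-base-uncased"
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3)

def tokenize(batch):
    return tok(batch["text"], truncation=True, padding="max_length", max_length=128)

# train_ds = train_ds.map(tokenize, batched=True)
# eval_ds = eval_ds.map(tokenize, batched=True)

args = TrainingArguments(output_dir="sentiment-ft",
                         num_train_epochs=3,
                         per_device_train_batch_size=16)
# trainer = Trainer(model=model, args=args, train_dataset=train_ds, eval_dataset=eval_ds)
# trainer.train()
```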
Like a restaurant owner walking the floor after dinner service. From customers' faces, leftovers, snippets servers caught, and tip size, they form an overall sense of the night. Nobody explicitly says "it was good/bad" — they read cues. Sentiment analysis quantifies those cues in text.
A shop processes 50K reviews a day; manual reading is impossible. Three models:
| Model | Accuracy | 1K reviews | Notes |
|-------|----------|------------|-------|
| VADER (lexicon) | 72% | <1 s | Misses irony |
| Logistic + TF-IDF | 81% | 2 s | Decent baseline |
| Fine-tuned BERT | 91% | 30 s | Catches context, negation |
Decision: BERT for the daily dashboard, VADER for the live tooltip in customer service — speed traded for quality. Reviews where the star count contradicts the text (3 stars but mostly positive language) go to a manual queue; data quality matters more than tool choice.
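That contradiction check can be a few lines. The thresholds and label names below are illustrative, assuming a classifier that emits POSITIVE/NEGATIVE with a confidence score, like the pipeline that follows:

```python
# Route star/text contradictions to the manual queue.
# Thresholds are assumptions; tune them on your own data.
def needs_manual_review(stars: int, label: str, score: float) -> bool:
    if score < 0.8:                 # low-confidence text signal: don't flag
        return False
    if stars >= 4:
        return label == "NEGATIVE"  # glowing stars, negative text
    if stars <= 2:
        return label == "POSITIVE"  # low stars, positive text
    return True                     # 3 stars with a confident text signal: worth a look

print(needs_manual_review(stars=3, label="POSITIVE", score=0.95))  # True
```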
Quick start with a pre-trained checkpoint:

```python
from transformers import pipeline

# Pre-trained binary sentiment classifier (POSITIVE / NEGATIVE), fine-tuned on SST-2
clf = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

reviews = [
    "Loved the product, fast shipping too.",
    "Awful, would never recommend.",
    "It's okay, not as good as I hoped.",
]

for r in reviews:
    out = clf(r)[0]  # top prediction: {'label': ..., 'score': ...}
    print(f"{out['label']:8s} ({out['score']:.2f}) → {r}")
```

For comparison, the classical TF-IDF + logistic regression baseline; the top coefficients double as an explainability check:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.metrics import classification_report

pipe = Pipeline([
    ("tfidf", TfidfVectorizer(
        ngram_range=(1, 2),   # bigrams catch simple negation ("not good")
        min_df=2,
        max_features=50000,
    )),
    ("clf", LogisticRegression(max_iter=1000, C=1.0)),
])

# texts_train / labels_train: your labeled data, prepared elsewhere
pipe.fit(texts_train, labels_train)
preds = pipe.predict(texts_test)
print(classification_report(labels_test, preds))

# Which n-grams push predictions positive? A useful sanity check.
feats = pipe.named_steps["tfidf"].get_feature_names_out()
coefs = pipe.named_steps["clf"].coef_[0]
top_pos = coefs.argsort()[-15:][::-1]
print("Most positive words:", [feats[i] for i in top_pos])
```

When to use
- Scaling reviews / social / feedback analysis
- Brand monitoring, early crisis detection
- Content moderation (toxic vs neutral)
- Measuring emotional response in A/B tests
When not to use
- Few one-off texts — read them yourself
- Sarcasm / irony heavy — even modern models struggle
- Highly domain-specific (legal, medical) — generic models miss nuance
Pitfalls

Losing negation
'not bad' is positive; 'not good' is negative. Lexicon methods often miss this; transformers handle it better but verify on your data.
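One cheap verification: a hand-written probe set run through the classifier (here `clf`, the pipeline from the example above; the expected labels are intuitions, not ground truth):

```python
# Negation probes: eyeball whether the model flips polarity correctly.
probes = [
    ("not bad at all", "positive-ish"),
    ("not good", "negative"),
    ("I can't say I loved it", "negative"),
    ("no complaints whatsoever", "positive"),
]
for text, expected in probes:
    out = clf(text)[0]
    print(f"{out['label']:8s} ({out['score']:.2f}) expected ~{expected}: {text!r}")
```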
Poor label quality
If sarcasm was labeled 'positive', the model learns that mistake. Track inter-annotator agreement; refine guidelines if it's low.
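Cohen's kappa is the standard two-annotator agreement measure, and scikit-learn provides it directly (toy labels below):

```python
from sklearn.metrics import cohen_kappa_score

# Labels from two annotators on the same sample of reviews (toy data).
annotator_a = ["pos", "neg", "neu", "pos", "neg", "pos", "neu", "pos"]
annotator_b = ["pos", "neg", "pos", "pos", "neg", "neu", "neu", "pos"]

kappa = cohen_kappa_score(annotator_a, annotator_b)
print(f"Cohen's kappa: {kappa:.2f}")  # common rule of thumb: below ~0.6, tighten guidelines
```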
Single-number summaries
An overall 78% positive can hide a feature with 30% positive / 50% negative. Aspect-based sentiment surfaces which parts of the product need attention.
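A minimal per-aspect rollup, assuming (aspect, label) pairs were already extracted upstream; the extraction itself is the hard part and out of scope for this sketch:

```python
from collections import Counter, defaultdict

# (aspect, sentiment) pairs per review, assumed extracted upstream (toy data).
pairs = [
    ("delivery", "positive"), ("delivery", "positive"), ("delivery", "neutral"),
    ("price", "negative"), ("price", "negative"), ("price", "positive"),
]

by_aspect = defaultdict(Counter)
for aspect, label in pairs:
    by_aspect[aspect][label] += 1

for aspect, counts in by_aspect.items():
    total = sum(counts.values())
    shares = {lbl: f"{n / total:.0%}" for lbl, n in counts.items()}
    print(aspect, shares)  # price skews negative even if the overall number looks fine
```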