What Is Sentiment Analysis?
Sentiment analysis determines whether a piece of text expresses a positive, negative, or neutral opinion. It is one of the most common NLP tasks, used in product review analysis, social media monitoring, customer feedback processing, and brand tracking.
```python
from textblob import TextBlob

text = "This product is absolutely fantastic, I love it!"
blob = TextBlob(text)
print(f"Sentiment: {blob.sentiment}")
# Sentiment(polarity=0.56, subjectivity=0.73)
```
There are multiple ways to approach sentiment analysis, from simple rule-based methods to fine-tuned deep learning models. Each approach has different trade-offs in terms of accuracy, speed, and setup effort.
Approach 1: TextBlob for Quick Analysis
TextBlob is the simplest way to get sentiment scores. It uses a pre-built lexicon where words are mapped to polarity scores.
Installation
```bash
pip install textblob
python -m textblob.download_corpora
```
Basic Usage
```python
from textblob import TextBlob

texts = [
    "The food was delicious and the service was excellent.",
    "Terrible experience. The room was dirty and staff was rude.",
    "The hotel was okay. Nothing special but not bad either.",
    "I absolutely love this place! Best vacation ever!",
    "Worst purchase I've ever made. Complete waste of money.",
]

for text in texts:
    blob = TextBlob(text)
    polarity = blob.sentiment.polarity          # -1 (negative) to 1 (positive)
    subjectivity = blob.sentiment.subjectivity  # 0 (objective) to 1 (subjective)

    if polarity > 0.1:
        label = "Positive"
    elif polarity < -0.1:
        label = "Negative"
    else:
        label = "Neutral"

    print(f"[{label:>8}] (polarity={polarity:+.2f}) {text[:60]}")
```
Output:
```
[Positive] (polarity=+0.75) The food was delicious and the service was excellent.
[Negative] (polarity=-0.65) Terrible experience. The room was dirty and staff was rude.
[ Neutral] (polarity=+0.05) The hotel was okay. Nothing special but not bad either.
[Positive] (polarity=+0.63) I absolutely love this place! Best vacation ever!
[Negative] (polarity=-0.52) Worst purchase I've ever made. Complete waste of money.
```
Analyzing a CSV of Reviews
```python
import pandas as pd
from textblob import TextBlob

df = pd.read_csv("reviews.csv")  # Assumes a "text" column

df["polarity"] = df["text"].apply(lambda x: TextBlob(str(x)).sentiment.polarity)
df["subjectivity"] = df["text"].apply(lambda x: TextBlob(str(x)).sentiment.subjectivity)
df["label"] = df["polarity"].apply(
    lambda p: "Positive" if p > 0.1 else ("Negative" if p < -0.1 else "Neutral")
)

print(df["label"].value_counts())
print(f"\nAverage polarity: {df['polarity'].mean():.3f}")
```
Limitations of TextBlob: It relies on word-level polarity scores, so it misses context, sarcasm, and more complex negation patterns. A sarcastic complaint that happens to contain positive words will still score positive, because the lexicon only sees the individual words.
Approach 2: VADER for Social Media Text
VADER (Valence Aware Dictionary and sEntiment Reasoner) is designed specifically for social media text. It handles emojis, slang, capitalization, and punctuation.
Installation
```bash
pip install vaderSentiment
```
Basic Usage
```python
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()

texts = [
    "This movie is AMAZING!!!",
    "the food was meh... not great tbh",
    "lol this is SO bad it's funny 😂",
    "Great product, fast shipping, would buy again 👍",
    "DO NOT buy this. Absolute garbage.",
    "It's not bad, actually kind of good.",
]

for text in texts:
    scores = analyzer.polarity_scores(text)
    compound = scores["compound"]  # Normalized score from -1 to 1

    if compound >= 0.05:
        label = "Positive"
    elif compound <= -0.05:
        label = "Negative"
    else:
        label = "Neutral"

    print(f"[{label:>8}] (compound={compound:+.3f}) {text}")
```
VADER returns four scores:

- `neg`: Proportion of the text that is negative
- `neu`: Proportion that is neutral
- `pos`: Proportion that is positive
- `compound`: Aggregated score from -1 to +1
```python
text = "The food was great but the service was terrible"
scores = analyzer.polarity_scores(text)
print(scores)
# {'neg': 0.268, 'neu': 0.444, 'pos': 0.288, 'compound': 0.0516}
```
VADER correctly identifies mixed sentiment here: both positive and negative elements are present, and the compound score reflects the slight overall positivity.
Batch Processing with VADER
```python
import pandas as pd
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()

def analyze_sentiment_vader(text):
    scores = analyzer.polarity_scores(str(text))
    compound = scores["compound"]
    if compound >= 0.05:
        return "Positive", compound
    elif compound <= -0.05:
        return "Negative", compound
    return "Neutral", compound

df = pd.read_csv("tweets.csv")
results = df["text"].apply(analyze_sentiment_vader)
df["label"] = results.apply(lambda x: x[0])
df["compound"] = results.apply(lambda x: x[1])

print(df["label"].value_counts())
print(f"\nMost positive: {df.loc[df['compound'].idxmax(), 'text'][:80]}")
print(f"Most negative: {df.loc[df['compound'].idxmin(), 'text'][:80]}")
```
Approach 3: Custom Model with Scikit-Learn
When you have labeled data, you can train a custom classifier that learns patterns specific to your domain.
```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.metrics import classification_report
from sklearn.pipeline import Pipeline

# Load labeled data (assumes columns: "text", "label")
# Labels: "positive", "negative", "neutral"
df = pd.read_csv("labeled_reviews.csv")
print(f"Dataset size: {len(df)}")
print(df["label"].value_counts())

X_train, X_test, y_train, y_test = train_test_split(
    df["text"], df["label"], test_size=0.2, random_state=42, stratify=df["label"]
)

# Build a pipeline
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(
        max_features=10000,
        ngram_range=(1, 2),  # Unigrams and bigrams
        min_df=2,
        max_df=0.95,
        sublinear_tf=True,
    )),
    ("clf", LogisticRegression(
        max_iter=1000,
        C=1.0,
        class_weight="balanced",
    )),
])

# Cross-validation
cv_scores = cross_val_score(pipeline, X_train, y_train, cv=5, scoring="f1_macro")
print(f"Cross-validation F1: {cv_scores.mean():.3f} (+/- {cv_scores.std():.3f})")

# Train and evaluate
pipeline.fit(X_train, y_train)
y_pred = pipeline.predict(X_test)
print(classification_report(y_test, y_pred))
```
Trying Multiple Classifiers
```python
from sklearn.metrics import f1_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC

# Continues from the previous script (X_train, X_test, y_train, y_test)
classifiers = {
    "Logistic Regression": LogisticRegression(max_iter=1000, C=1.0),
    "Linear SVM": LinearSVC(max_iter=2000, C=1.0),
    "Naive Bayes": MultinomialNB(alpha=0.1),
}

tfidf = TfidfVectorizer(max_features=10000, ngram_range=(1, 2), sublinear_tf=True)
X_train_tfidf = tfidf.fit_transform(X_train)
X_test_tfidf = tfidf.transform(X_test)

for name, clf in classifiers.items():
    clf.fit(X_train_tfidf, y_train)
    y_pred = clf.predict(X_test_tfidf)
    f1 = f1_score(y_test, y_pred, average="macro")
    print(f"{name:25s} F1-macro: {f1:.3f}")
```
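Hyperparameters for the vectorizer and the classifier can also be tuned jointly. A minimal `GridSearchCV` sketch on toy data (the parameter grid and example texts are illustrative):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Toy data; substitute your own labeled texts
texts = ["great product", "love it", "awesome quality", "works perfectly",
         "terrible product", "hate it", "awful quality", "broke instantly"] * 3
labels = (["positive"] * 4 + ["negative"] * 4) * 3

pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Prefix each parameter with its pipeline step name
param_grid = {
    "tfidf__ngram_range": [(1, 1), (1, 2)],
    "clf__C": [0.1, 1.0, 10.0],
}

search = GridSearchCV(pipeline, param_grid, cv=3, scoring="f1_macro")
search.fit(texts, labels)
print(f"Best params: {search.best_params_}")
print(f"Best CV F1:  {search.best_score_:.3f}")
```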
Saving and Loading the Model
```python
import joblib

# Save
joblib.dump(pipeline, "sentiment_model.joblib")

# Load
loaded_model = joblib.load("sentiment_model.joblib")
prediction = loaded_model.predict(["This product exceeded my expectations!"])
print(f"Prediction: {prediction[0]}")
```
Approach 4: Transformers with Hugging Face
Transformer models provide the highest accuracy for sentiment analysis. You can use pretrained models directly or fine-tune them.
Installation
```bash
pip install transformers torch
```
Using a Pretrained Sentiment Model
```python
from transformers import pipeline

# Load a pretrained sentiment analysis model
sentiment_pipeline = pipeline(
    "sentiment-analysis",
    model="cardiffnlp/twitter-roberta-base-sentiment-latest",
    device=0,  # Use GPU; set to -1 for CPU
)

texts = [
    "I absolutely love this new feature!",
    "The update completely broke everything.",
    "It works fine, nothing special.",
    "Best customer support I've ever experienced!",
    "This is the worst app on the store.",
]

results = sentiment_pipeline(texts)
for text, result in zip(texts, results):
    print(f"[{result['label']:>8}] ({result['score']:.3f}) {text}")
```
Processing Large Datasets Efficiently
```python
import pandas as pd
from transformers import pipeline

sentiment_pipeline = pipeline(
    "sentiment-analysis",
    model="cardiffnlp/twitter-roberta-base-sentiment-latest",
    device=0,
    batch_size=32,
)

df = pd.read_csv("reviews.csv")
texts = df["text"].tolist()

# Process in batches
results = sentiment_pipeline(texts, truncation=True, max_length=512)

df["transformer_label"] = [r["label"] for r in results]
df["transformer_score"] = [r["score"] for r in results]
print(df["transformer_label"].value_counts())
```
Fine-Tuning on Your Own Data
When the pretrained model does not match your domain, fine-tune it:
```python
import numpy as np
from datasets import Dataset
from sklearn.metrics import accuracy_score, f1_score
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

model_name = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Prepare data
label_map = {"negative": 0, "neutral": 1, "positive": 2}

def tokenize(examples):
    return tokenizer(
        examples["text"],
        truncation=True,
        padding="max_length",
        max_length=256,
    )

# Assume df has "text" and "label" columns
df["label_id"] = df["label"].map(label_map)
dataset = Dataset.from_pandas(df[["text", "label_id"]].rename(columns={"label_id": "label"}))
dataset = dataset.train_test_split(test_size=0.2, seed=42)
dataset = dataset.map(tokenize, batched=True)

# Load model
model = AutoModelForSequenceClassification.from_pretrained(
    model_name,
    num_labels=3,
    id2label={0: "negative", 1: "neutral", 2: "positive"},
    label2id=label_map,
)

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {
        "accuracy": accuracy_score(labels, preds),
        "f1_macro": f1_score(labels, preds, average="macro"),
    }

training_args = TrainingArguments(
    output_dir="./sentiment_model",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=32,
    learning_rate=2e-5,
    weight_decay=0.01,
    evaluation_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="f1_macro",
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    compute_metrics=compute_metrics,
)

trainer.train()

# Save the fine-tuned model
trainer.save_model("./sentiment_model_final")
tokenizer.save_pretrained("./sentiment_model_final")
```
Comparing All Approaches
```python
import time

from textblob import TextBlob
from transformers import pipeline
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

test_texts = [
    "This is the best product I've ever bought!",
    "Absolutely terrible, waste of money.",
    "It's okay, does what it says.",
    "Not bad actually, pleasantly surprised.",
    "DO NOT BUY. Broke after one day!!!",
    "Love love love this!! 😍",
    "meh, could be better tbh",
    "The quality is outstanding, highly recommend.",
]

# TextBlob
print("=== TextBlob ===")
start = time.time()
for text in test_texts:
    polarity = TextBlob(text).sentiment.polarity
    label = "Pos" if polarity > 0.1 else ("Neg" if polarity < -0.1 else "Neu")
    print(f"  [{label}] {text[:50]}")
print(f"  Time: {time.time() - start:.3f}s\n")

# VADER
print("=== VADER ===")
analyzer = SentimentIntensityAnalyzer()
start = time.time()
for text in test_texts:
    compound = analyzer.polarity_scores(text)["compound"]
    label = "Pos" if compound >= 0.05 else ("Neg" if compound <= -0.05 else "Neu")
    print(f"  [{label}] {text[:50]}")
print(f"  Time: {time.time() - start:.3f}s\n")

# Transformer
print("=== Transformer ===")
sent_pipeline = pipeline(
    "sentiment-analysis",
    model="cardiffnlp/twitter-roberta-base-sentiment-latest",
)
start = time.time()
results = sent_pipeline(test_texts, truncation=True)
for text, result in zip(test_texts, results):
    print(f"  [{result['label'][:3]}] {text[:50]}")
print(f"  Time: {time.time() - start:.3f}s")
```
Comparison Summary
| Approach | Accuracy | Speed | Setup Effort | Best For |
|---|---|---|---|---|
| TextBlob | Low-Medium | Very fast | Minimal | Quick prototypes |
| VADER | Medium | Very fast | Minimal | Social media, informal text |
| Scikit-learn | Medium-High | Fast | Moderate (needs labeled data) | Domain-specific classification |
| Transformers | High | Slower | Moderate-High | Production systems needing accuracy |
Building a Sentiment Analysis API
Wrap your model in a REST API using FastAPI:
```bash
pip install fastapi uvicorn
```
```python
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import pipeline

app = FastAPI(title="Sentiment Analysis API")

# Load model at startup
sentiment_model = pipeline(
    "sentiment-analysis",
    model="cardiffnlp/twitter-roberta-base-sentiment-latest",
)

class TextInput(BaseModel):
    text: str

class BatchInput(BaseModel):
    texts: list[str]

class SentimentResult(BaseModel):
    text: str
    label: str
    score: float

@app.post("/analyze", response_model=SentimentResult)
def analyze_single(input: TextInput):
    result = sentiment_model(input.text, truncation=True, max_length=512)[0]
    return SentimentResult(
        text=input.text,
        label=result["label"],
        score=round(result["score"], 4),
    )

@app.post("/analyze/batch", response_model=list[SentimentResult])
def analyze_batch(input: BatchInput):
    results = sentiment_model(input.texts, truncation=True, max_length=512)
    return [
        SentimentResult(
            text=text,
            label=result["label"],
            score=round(result["score"], 4),
        )
        for text, result in zip(input.texts, results)
    ]

@app.get("/health")
def health():
    return {"status": "ok"}
```
Run the API:
```bash
uvicorn sentiment_api:app --host 0.0.0.0 --port 8000
```
Test it:
```python
import requests

# Single text
response = requests.post(
    "http://localhost:8000/analyze",
    json={"text": "This product is absolutely wonderful!"},
)
print(response.json())

# Batch
response = requests.post(
    "http://localhost:8000/analyze/batch",
    json={"texts": [
        "Great product!",
        "Terrible experience.",
        "It was fine.",
    ]},
)
for result in response.json():
    print(f"[{result['label']}] ({result['score']:.3f}) {result['text']}")
```
Production Considerations
Caching. If you see repeated texts, cache the results. A simple dictionary cache or Redis can save inference time.
```python
from functools import lru_cache

# sentiment_model is the Hugging Face pipeline loaded in the API above
@lru_cache(maxsize=10000)
def cached_sentiment(text: str) -> tuple:
    result = sentiment_model(text, truncation=True, max_length=512)[0]
    return result["label"], result["score"]
```
Input validation. Truncate very long texts, handle empty strings, and sanitize input to prevent unexpected behavior.
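Much of this validation can live in the request model itself; a sketch using Pydantic validation (assuming Pydantic v2, which recent FastAPI versions ship with; the limit is illustrative):

```python
from pydantic import BaseModel, Field, field_validator

MAX_CHARS = 5000  # Illustrative cap; the model truncates tokens anyway

class TextInput(BaseModel):
    text: str = Field(min_length=1, max_length=MAX_CHARS)

    @field_validator("text")
    @classmethod
    def not_blank(cls, v: str) -> str:
        # Reject whitespace-only input and strip padding
        if not v.strip():
            raise ValueError("text must not be blank")
        return v.strip()
```

FastAPI turns a failed validation into a 422 response automatically, so the endpoint code never sees bad input.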
Monitoring. Log predictions along with input texts so you can audit the model's performance over time. Track the distribution of predicted labels: a sudden shift might indicate a data quality issue.
Model updates. Periodically evaluate your model on fresh labeled data. Sentiment patterns change over time (new slang, evolving language), and your model may need retraining.
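The retraining decision can be automated with a simple threshold check on a freshly labeled sample; a sketch (the threshold, function name, and dummy labels are illustrative):

```python
from sklearn.metrics import f1_score

F1_THRESHOLD = 0.80  # Illustrative; set relative to your model's baseline

def needs_retraining(y_true, y_pred, threshold=F1_THRESHOLD):
    """Flag the model for retraining when macro-F1 on fresh data drops."""
    f1 = f1_score(y_true, y_pred, average="macro")
    print(f"Macro-F1 on fresh sample: {f1:.3f}")
    return f1 < threshold

# Dummy labels standing in for a freshly annotated sample
y_true = ["positive", "negative", "neutral", "positive", "negative", "neutral"]
y_pred = ["positive", "negative", "neutral", "negative", "negative", "neutral"]
print("Retrain?", needs_retraining(y_true, y_pred))
```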
Summary
Sentiment analysis ranges from simple lexicon-based methods (TextBlob, VADER) to trained machine learning models (scikit-learn) to transformer-based approaches (Hugging Face). TextBlob and VADER require no training data and work instantly but lack accuracy on domain-specific text. Scikit-learn models offer a good balance when you have labeled data. Transformers provide the highest accuracy and work well out of the box for general sentiment, with the option to fine-tune for your domain. For production, wrap your model in a FastAPI service with caching, input validation, and monitoring.