Logistic Regression
Linear classifier with calibrated probabilities
Despite the name, it's a classification algorithm — it models the probability of a class via the sigmoid function on a linear combination of features.
Logistic regression takes the unbounded numeric output of linear regression and squashes it through the sigmoid (logistic) function into a 0–1 probability — say, the probability of "spam". A threshold (often 0.5) turns it into a binary decision. The multinomial version uses softmax to do the same on N classes.
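A minimal sketch of that squashing step (the weights and feature values here are invented for illustration):

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# z is the linear-regression-style score w·x + b (hypothetical numbers)
z = 0.8 * 1.0 + 1.5 * 0.6 - 2.0
p_spam = sigmoid(z)
print(f"{p_spam:.2f}")  # ≈ 0.43: below a 0.5 threshold → classified as "not spam"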
Training maximizes the log-likelihood (equivalently, minimizes cross-entropy) instead of squared error; this keeps the optimization convex and yields well-calibrated probabilities.
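To see why, compare the log-loss (cross-entropy) of confident-right versus confident-wrong predictions; this uses sklearn's log_loss on made-up probabilities:

from sklearn.metrics import log_loss

y_true = [1, 0, 1]
confident_right = [0.9, 0.1, 0.95]  # high probability on the correct class
confident_wrong = [0.1, 0.9, 0.05]  # high probability on the wrong class
print(log_loss(y_true, confident_right))  # ≈ 0.09
print(log_loss(y_true, confident_wrong))  # ≈ 2.53: confident mistakes are punished hard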
Logistic regression is the workhorse of every domain that demands explainability: credit scoring, insurance risk, clinical decision support. Coefficients are interpreted via odds ratios — "a one-unit increase in this feature multiplies the odds of the event by exp(β)."
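As a small worked example of that reading (the coefficient is hypothetical), converting between a coefficient, an odds ratio, and a probability:

import numpy as np

beta = 0.47                    # hypothetical coefficient for one feature
odds_ratio = np.exp(beta)      # ≈ 1.60: +1 unit multiplies the odds by ~1.6
p = 0.20                       # baseline probability of the event
odds = p / (1 - p)             # 0.25
new_odds = odds * odds_ratio   # ≈ 0.40 after a one-unit increase
new_p = new_odds / (1 + new_odds)
print(f"{new_p:.3f}")          # ≈ 0.286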
Like a doctor reasoning "is this patient at high risk of a heart attack?", weighing age, blood pressure, cholesterol, smoking, and family history, each with its own mental weight. They sum them up and produce a probability: below 20% they relax; above it, they order more tests. Logistic regression is the math of that reasoning: it learns the weights from data while keeping the interpretation open to humans.
An e-commerce site wants to send discount emails to cart abandoners. The model output: "probability this user comes back to complete the purchase within 24 hours". Low probability → send the email; high → skip (they'd return anyway, the discount is wasted).
Logistic regression turns 18 features (device, browse depth, cart value, prior orders…) into a probability, and a threshold of 0.35 selects the email cohort. Showing the coefficients to marketing yields lines like "a history of premium purchases multiplies the return odds by 2.4": the model produces insight, not just predictions.
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.metrics import roc_auc_score
import numpy as np
pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("clf", LogisticRegression(max_iter=500, C=1.0)),
])
# X_train/X_test, y_train/y_test and feature_names are assumed prepared upstream
pipe.fit(X_train, y_train)
probs = pipe.predict_proba(X_test)[:, 1]
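# Scenario threshold from above: users below 0.35 return-probability get the email
email_cohort = probs < 0.35  # boolean mask selecting the discount-email cohort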
# ROC-AUC evaluates the model independent of the threshold
print(f"ROC-AUC: {roc_auc_score(y_test, probs):.3f}")
# Odds-ratio interpretation (features are standardized, so each coefficient
# reflects a one-standard-deviation increase in that feature)
coefs = pipe.named_steps["clf"].coef_[0]
for name, c in zip(feature_names, coefs):
    print(f"{name}: coef={c:+.3f}, odds ratio={np.exp(c):.3f}")

When to use
- Fast, explainable baseline for binary classification
- Calibrated probabilities matter, not just the predicted class
- Regulated/corporate environments where decisions must be defended
- Roughly linearly separable data
When to avoid
- The class boundary is clearly nonlinear: try trees or a kernel SVM
- Very high-dimensional data with complex relationships, where raw accuracy is paramount
- Heavily imbalanced classes without class weighting (see the sketch after this list)
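If the data is imbalanced and you still want logistic regression, reweighting the loss is the usual first step; class_weight is a real LogisticRegression parameter, the rest is a sketch:

from sklearn.linear_model import LogisticRegression

# "balanced" reweights each class inversely to its frequency, so the
# minority class is not drowned out during training
clf = LogisticRegression(max_iter=500, class_weight="balanced")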
Skipping feature scaling
Regularized logistic regression is sensitive to feature scale: without scaling, the penalty hits features unevenly and convergence slows down. Always scale, e.g. with a StandardScaler in the pipeline.
Leaving threshold at 0.5
On imbalanced data 0.5 is rarely right. If the positive class is 5%, the optimal threshold is usually much lower. Pick from the ROC and precision-recall curves.
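A sketch of picking the threshold from the precision-recall curve (maximizing F1 here; probs and y_test as in the pipeline above):

import numpy as np
from sklearn.metrics import precision_recall_curve

precision, recall, thresholds = precision_recall_curve(y_test, probs)
f1 = 2 * precision * recall / (precision + recall + 1e-12)  # guard against 0/0
best = np.argmax(f1[:-1])  # the last precision/recall point has no threshold
print(f"best threshold ≈ {thresholds[best]:.3f}, F1 ≈ {f1[best]:.3f}")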
Trusting linear separability
Logistic regression draws a linear boundary in feature space. If the classes intertwine, it fails without additional feature engineering. Always validate the assumption.
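One quick validation: cross-validate logistic regression against a nonlinear baseline and look at the gap. A sketch on a synthetic intertwined dataset (make_moons stands in for your data):

from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_moons(n_samples=1000, noise=0.2, random_state=0)
linear = cross_val_score(LogisticRegression(max_iter=500), X, y, cv=5).mean()
forest = cross_val_score(RandomForestClassifier(random_state=0), X, y, cv=5).mean()
# A large gap suggests the linear-boundary assumption does not hold
print(f"logistic: {linear:.3f}  random forest: {forest:.3f}")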