AI Atlas
Beginner · ~2 min read · #time-series #trend #seasonality

Time Series

Ordered observations over time

A sequence of observations indexed by time — daily sales, hourly temperature, millisecond heartbeats. The order matters and demands special analysis.

[Figure: a time series y plotted over time t with its trend line. A sum of trend, seasonality, and noise.]
Definition

A time series is a sequence of observations recorded at (usually) regular time intervals. Unlike plain tabular data, order is meaningful: the value at time t is likely related to t-1. That structure brings both richness (seasonality, trend, autocorrelation) and constraints (no random splits).
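Autocorrelation is easy to inspect directly. A minimal sketch with statsmodels, assuming the same hypothetical orders.csv used in the code examples below:

import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf

# Hypothetical daily series; replace with your own data
series = pd.read_csv("orders.csv", parse_dates=["date"], index_col="date")["orders"]
plot_acf(series, lags=30)  # spikes at lags 7, 14, 21 would suggest weekly seasonality
plt.savefig("acf.png")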

Standard components: trend (long-term direction), seasonality (recurring weekly/monthly/yearly pattern), cyclical (multi-year business cycles), noise (random fluctuation). Decomposing the series into these is the starting point of analysis.

Time series can be stationary or non-stationary. A stationary series keeps its statistical properties (mean, variance, autocorrelation) constant over time; a non-stationary one has a trend or changing variance. Classical statistical models (ARIMA) require stationarity, often achieved via differencing, log transforms, or removing seasonality.
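A minimal sketch of those transforms with pandas and NumPy, again assuming the hypothetical orders.csv from the code examples:

import numpy as np
import pandas as pd

orders = pd.read_csv("orders.csv", parse_dates=["date"], index_col="date")["orders"]

log_orders = np.log1p(orders)    # log transform tames growing variance
diff1 = log_orders.diff()        # first difference removes a (log-)linear trend
diff7 = log_orders.diff(7)       # seasonal difference removes a weekly pattern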

Modern ML approaches (gradient boosting, deep learning) don't require stationarity. Instead, you build a tabular dataset enriched with lag features, calendar features, and exogenous variables, then treat forecasting as a regression problem (see the lag-and-calendar example below).

Analogy

A heartbeat trace. A single value tells you little; the rhythm over the previous minutes, the wave shape, any irregularity — those tell a doctor a lot. Order carries information. Shuffle the data and information is destroyed. Time-series analysis works the same way.

Real-world example

A shop's three years of daily orders are a time series. Visible patterns:

1. Trend: ~18% annual growth.
2. Yearly seasonality: a December peak (holidays), a January/February trough.
3. Weekly seasonality: Monday/Tuesday slightly down, Wednesday/Thursday normal, Friday-Sunday peaks.
4. Event effects: spikes around Black Friday, religious holidays, Valentine's Day.

Decompose, then model each component: trend with a linear or exponential fit, seasonality with Fourier terms or dummy variables, holidays with indicator variables. Prophet does this automatically; a LightGBM model gets the same information via lag, calendar, and holiday features you build by hand.
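A hedged Prophet sketch, assuming the prophet package and the same hypothetical orders.csv; the US holiday calendar is an illustrative choice:

import pandas as pd
from prophet import Prophet

df = pd.read_csv("orders.csv", parse_dates=["date"])
df = df.rename(columns={"date": "ds", "orders": "y"})  # Prophet expects ds/y columns

m = Prophet(yearly_seasonality=True, weekly_seasonality=True)
m.add_country_holidays(country_name="US")  # assumption: US holiday calendar
m.fit(df)

future = m.make_future_dataframe(periods=90)  # forecast 90 days ahead
forecast = m.predict(future)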

Code examples
Exploration and decomposition (Python)
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose
from statsmodels.tsa.stattools import adfuller
import matplotlib.pyplot as plt

df = pd.read_csv("orders.csv", parse_dates=["date"], index_col="date")

df["orders"].plot(figsize=(12, 4), title="Daily orders")
plt.savefig("series.png")

# Additive decomposition with a weekly period into trend/seasonal/residual
result = seasonal_decompose(df["orders"], model="additive", period=7)
result.plot()
plt.savefig("decomposition.png")

# Stationarity test (ADF)
stat, pvalue, *_ = adfuller(df["orders"].dropna())
print(f"ADF p-value: {pvalue:.4f}")
# p < 0.05 → stationary
# p > 0.05 → non-stationary; differencing may help

df["rolling_mean"] = df["orders"].rolling(30).mean()
df["rolling_std"] = df["orders"].rolling(30).std()
Lag and calendar features (Python)
import pandas as pd

df = pd.read_csv("orders.csv", parse_dates=["date"]).sort_values("date")

# Lagged copies of the target give the model short- and long-term memory
for lag in [1, 7, 14, 30]:
    df[f"lag_{lag}"] = df["orders"].shift(lag)

# Calendar features let the model learn weekly and yearly seasonality
df["dow"] = df["date"].dt.dayofweek
df["month"] = df["date"].dt.month
df["is_weekend"] = df["dow"].isin([5, 6]).astype(int)
df["day_of_year"] = df["date"].dt.dayofyear

# Rolling means summarize the recent level of the series
df["ma_7"] = df["orders"].rolling(7).mean()
df["ma_30"] = df["orders"].rolling(30).mean()
When to use
  • Timestamped, order-meaningful data
  • Demand, finance, sensor telemetry
  • Anomaly detection — deviations from expectation
  • Forecasting backbone
When not to use
  • Tabular data with no temporal dimension
  • Very short series — too little structure
  • Highly irregular timestamps — resample first
Common pitfalls

Random splitting

A random train/test split puts the future in training and the past in validation: inflated scores, then a broken model in production. Always split chronologically.
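A minimal sketch of chronological cross-validation with scikit-learn's TimeSeriesSplit; X and y here are toy stand-ins for your time-ordered features and target:

import numpy as np
from sklearn.model_selection import TimeSeriesSplit

X = np.arange(100).reshape(-1, 1)  # toy stand-in, sorted by time
y = np.arange(100)

tscv = TimeSeriesSplit(n_splits=5)
for train_idx, test_idx in tscv.split(X):
    # every training fold strictly precedes its test fold in time
    X_train, X_test = X[train_idx], X[test_idx]
    y_train, y_test = y[train_idx], y[test_idx]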

Missing dates

Daily data should have one row per day; missing days silently break lag features. Resample to a daily frequency (resample('D') or asfreq('D')) and fill the gaps explicitly.
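A sketch of making the daily index explicit, reusing the date-indexed frame from the exploration example; filling with zero is an assumption that a missing day means no orders:

import pandas as pd

df = pd.read_csv("orders.csv", parse_dates=["date"], index_col="date")
df = df.asfreq("D")  # insert a row for every calendar day
print(df["orders"].isna().sum(), "missing days")
df["orders"] = df["orders"].fillna(0)  # assumption: missing day = zero orders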

One model fits all

With 1000 SKUs, each with its own series, neither one undifferentiated model nor 1000 separate models is usually optimal. Look at hierarchical or global modeling; each has tradeoffs. A sketch of the global approach follows.
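One common middle ground is a single global model that sees every series, with the SKU id as a feature. A hypothetical sketch (the file and column names are assumptions):

import pandas as pd
import lightgbm as lgb

# Hypothetical long-format file: one row per (sku, date) pair
df = pd.read_csv("sku_orders.csv", parse_dates=["date"]).sort_values(["sku", "date"])

# Lags must be computed per series, never across SKU boundaries
for lag in [1, 7]:
    df[f"lag_{lag}"] = df.groupby("sku")["orders"].shift(lag)

df["sku"] = df["sku"].astype("category")  # LightGBM handles category dtype natively
train = df.dropna(subset=["lag_1", "lag_7"])

model = lgb.LGBMRegressor()
model.fit(train[["sku", "lag_1", "lag_7"]], train["orders"])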