HMM Regime Terminal – Project Overview

🏗️ Architecture

The project follows a clean three-layer architecture. Each file has a single responsibility and they compose through simple Python imports.

📡 yfinance → data_loader.py → backtester.py → dashboard.py / scanner.py → 🌐 Browser UI / CLI

data_loader.py

Data Layer

Downloads daily OHLCV data via yf.Ticker.history() (no end date — returns live current bar). Handles timezone normalisation (UTC), indicator warmup (+150 calendar days), and engineers four HMM input features.

backtester.py

Core Engine

Fits the GMM-HMM, decodes regimes, computes 10 technical confirmations, and simulates the strategy. Internally split into _prepare() (HMM fit + indicators) and _run_simulation() (trade loop + metrics). Exposes optimize_params() for grid-search without refitting.

scanner.py

Multi-Index Screener

Scans S&P 500, Nasdaq 100, or Russell 2000 constituents in parallel for Bull Run entries. Ranks candidates by composite entry-quality score. Integrates into the dashboard as a dedicated page.

dashboard.py

UI Layer

Streamlit "Regime Terminal" with two pages: Backtester (live signals, TradingView-style chart, scorecard, equity curve, trade log) and Stock Screener (multi-index Bull entry scanner).

requirements.txt

Dependencies

yfinance, hmmlearn, pandas, numpy, scikit-learn, streamlit, plotly, pandas-ta. Charting uses lightweight-charts v4.2.1 via CDN (no install needed). Install with pip install -r requirements.txt.

🧠 GMM-HMM Engine

The core model is a 7-state Gaussian Mixture Model Hidden Markov Model (hmmlearn.hmm.GMMHMM) with 2 Gaussian mixture components per state and full covariance matrices. Unlike a plain Gaussian HMM (one Gaussian per state), each GMM-HMM state models observations as a weighted blend of 2 Gaussians — capturing multi-modal within-state behaviour such as a Bull Run that sometimes produces moderate gains and sometimes explosive ones.

Parameter	Value	Rationale
`n_components` (states)	7	Rich enough to distinguish micro-regimes without overfitting
`n_mix` (Gaussians / state)	2	Captures multi-modal within-state distributions
`covariance_type`	`"full"`	Full 4×4 covariance matrix — captures feature correlations
`n_iter`	2000	Generous EM budget to ensure convergence
Features fitted on	4 columns	returns, range, vol_change, trend_return

Auto-labelling (Option B)

After fitting, each state's mean return is ranked. The top 3 states by mean return are labelled Bull Run, the bottom 2 are labelled Bear/Crash, and the remaining 2 become Neutral/Transition. No manual tuning required.
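The ranking rule above can be sketched in a few lines. This is a minimal illustration, not the project's code: the function name label_states and the sample mean returns are hypothetical, and only the 3 / 2 / 2 split comes from the description above.

```python
def label_states(mean_returns, n_bull=3, n_bear=2):
    """Map each HMM state index to a regime label by ranking mean returns."""
    # State indices ordered from lowest to highest mean return.
    order = sorted(range(len(mean_returns)), key=lambda s: mean_returns[s])
    bear_states = set(order[:n_bear])    # bottom 2 by mean return
    bull_states = set(order[-n_bull:])   # top 3 by mean return
    labels = {}
    for state in order:
        if state in bull_states:
            labels[state] = "Bull Run"
        elif state in bear_states:
            labels[state] = "Bear/Crash"
        else:
            labels[state] = "Neutral/Transition"
    return labels, bear_states, bull_states


# Hypothetical per-state mean daily returns from a fitted 7-state model.
means = [0.004, -0.021, 0.0001, 0.012, -0.006, 0.0015, -0.0008]
labels, bear_states, bull_states = label_states(means)
print(labels)
```

Returning the two sets alongside the labels mirrors the idea of freezing the Bull and Bear state sets at fit time so later code can reuse them.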

Why regime labels change with lookback period: The HMM is refit from scratch on every lookback window. Different training data → different emission parameters → different state-to-regime mappings. Additionally, the Viterbi decoder is globally optimal over the full sequence, so adding older bars changes the most-likely path and can alter the label on the final bar. Regime labels are in-sample statistical artefacts, not forward predictions.

Bull Run States

3 / 7

Bear/Crash States

2 / 7

Neutral States

2 / 7

Mixture Components

2 / state

🔢 7 Hidden Sub-States NEW

The 7 internal HMM states are ranked by mean return (lowest → highest) and collapsed into three tradeable regimes. The dashboard displays the current sub-state inside the Detected Regime box (e.g. "State 6 · Steady Uptrend") so you can see which specific micro-regime is active within the broader Bull/Bear/Neutral classification.

#	Sub-State Name	Regime	Characteristics
1	Crash / Panic	Bear/Crash	Sharpest drawdowns, highest volatility spike, heavy sell volume. Macro risk-off events, flash crashes, or liquidation cascades.
2	Bear / Sustained Decline	Bear/Crash	Persistent negative returns with elevated but stable volatility. Prolonged downtrend — sellers in control, low buying interest.
3	Bearish Consolidation	Neutral/Transition	Mildly negative to flat returns. Price stalls after a decline; uncertainty dominates. Often precedes a deeper bear move or reversal.
4	Sideways / Range-Bound	Neutral/Transition	Near-zero mean return, low directional volatility. Market lacks conviction — price oscillates within a range, volume subdued.
5	Early Recovery / Accumulation	Bull Run	Modestly positive returns, volatility still below average. Smart money begins accumulating; price base-building after a downtrend.
6	Steady Uptrend	Bull Run	Consistent positive returns with controlled volatility. Broad participation, rising volume, price making higher highs and higher lows.
7	Momentum Bull / Euphoria	Bull Run	Highest mean returns, volatility re-expanding upward. Strong buying pressure, FOMO-driven volume spikes, parabolic price action.

Sub-state Rank Source FIXED

Sub-state ranks are derived directly from engine.bear_states and engine.bull_states — the sets frozen at fit time — rather than re-sorting by mean return at render time. This guarantees the sub-state is always consistent with the Detected Regime label:

Ranks 1–2 → always Bear states (bottom 2 by mean return at fit time)
Ranks 3–4 → always Neutral states (middle 2)
Ranks 5–7 → always Bull states (top 3)

Previously, an independent re-sort could place a Neutral state at rank 6 (labeled "Steady Uptrend") while the Detected Regime box showed "Neutral/Transition" — a contradiction caused by NaN mean-return values for rare states disturbing Python's sort order.
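A sketch of how ranks could be derived from the frozen sets while staying robust to NaN means. The helper substate_ranks and the sample values are hypothetical; the only assumption taken from the text is that the Bear and Bull sets are fixed at fit time and the remaining states are Neutral.

```python
import math


def substate_ranks(mean_returns, bear_states, bull_states):
    """Assign ranks 1..N so Bear states rank lowest and Bull states highest."""
    neutral_states = [s for s in range(len(mean_returns))
                      if s not in bear_states and s not in bull_states]

    def sort_key(state):
        value = mean_returns[state]
        # NaN compares False against every number, which can scramble a plain
        # sort; place NaN means at the front of their own group instead.
        return (0, 0.0) if math.isnan(value) else (1, value)

    # Sort inside each regime group, then concatenate Bear, Neutral, Bull.
    ordered = (sorted(bear_states, key=sort_key)
               + sorted(neutral_states, key=sort_key)
               + sorted(bull_states, key=sort_key))
    return {state: rank for rank, state in enumerate(ordered, start=1)}


# State 2 was never visited, so its mean return is NaN.
means = [0.004, -0.021, float("nan"), 0.012, -0.006, 0.0015, -0.0008]
ranks = substate_ranks(means, bear_states={1, 4}, bull_states={0, 3, 5})
print(ranks)
```

Because sorting happens only within a regime group, a NaN can never push a Neutral state into the Bull rank range.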

Sub-state vs. Confirmations

It is possible to be in a Bear/Crash sub-state while all technical confirmations are met. The HMM evaluates the statistical pattern of a sequence of bars; confirmations are point-in-time momentum signals evaluated on the current bar only. A classic scenario: a strong dead-cat bounce causes all confirmations to pass on a single day, while the HMM still classifies the session as Bear because the surrounding sequence of crash bars dominates the pattern. The regime always takes precedence — the strategy exits on Bear regardless of confirmation count.

📐 Engineered Features

Four features are computed from daily OHLCV data and fed into the GMM-HMM. All features are standardised (zero mean, unit variance) before fitting. Outliers beyond ±5σ are clipped to prevent HMM instability. An extra 150 calendar-day warmup is downloaded beyond the user-selected lookback so that SMA-100 and MACD-26 are valid from bar 1. Warmup rows are trimmed before backtesting.

#	Feature	Formula	Captures
1	`returns`	log(Closeₜ / Closeₜ₋₁)	Daily directional momentum
2	`range`	(High − Low) / Close	Intraday volatility / bar expansion
3	`vol_change`	RollingStd(Volume, 20) / RollingMean(Volume, 20)	Abnormal volume spikes (CoV)
4	`trend_return` (Option C)	log(Closeₜ / Closeₜ₋₂₀)	20-day cumulative trend — prevents uptrends being labelled Neutral
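The four formulas in the table can be written out directly. The sketch below uses only the standard library so it runs anywhere; the function name engineer_features, the dict-based bar layout, and the synthetic bars are assumptions for illustration, and the real loader presumably works on a pandas DataFrame.

```python
import math
from statistics import mean, stdev


def engineer_features(bars, vol_window=20, trend_window=20):
    """Compute the four HMM input features for every bar with enough history.

    `bars` is a list of dicts with keys: high, low, close, volume.
    """
    rows = []
    start = max(vol_window - 1, trend_window, 1)
    for t in range(start, len(bars)):
        bar = bars[t]
        volumes = [b["volume"] for b in bars[t - vol_window + 1: t + 1]]
        rows.append({
            # Daily log return.
            "returns": math.log(bar["close"] / bars[t - 1]["close"]),
            # High-low range as a fraction of the close.
            "range": (bar["high"] - bar["low"]) / bar["close"],
            # Coefficient of variation of volume (sample std / mean).
            "vol_change": stdev(volumes) / mean(volumes),
            # Cumulative log return over the trend window.
            "trend_return": math.log(bar["close"] / bars[t - trend_window]["close"]),
        })
    return rows


# Synthetic bars: close rises 1% per day, volume alternates 1000 / 1100.
bars = []
for i in range(30):
    close = 100 * 1.01 ** i
    bars.append({"high": close * 1.01, "low": close * 0.99,
                 "close": close, "volume": 1000 + 100 * (i % 2)})

features = engineer_features(bars)
print(features[0])
```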

Live Data Pipeline

data_loader.py calls yf.Ticker.history(start=...) with no end argument instead of yf.download. The key difference: yf.download only returns fully-closed sessions, while Ticker.history without an end date always returns the current live bar. Today's date is taken in UTC (datetime.now(timezone.utc).date()) so that a bar labelled with a UTC timestamp is not filtered out when the local date is still one day behind.
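To see why the UTC comparison matters, consider a user in Pacific time on the evening of 10 March, when UTC has already rolled over to 11 March. The helper drop_future_bars below is hypothetical and only illustrates the date comparison; it does not call yfinance.

```python
from datetime import datetime, timedelta, timezone


def drop_future_bars(bar_times, now=None):
    """Keep bars whose UTC date is on or before today's UTC date."""
    now = now or datetime.now(timezone.utc)
    today_utc = now.astimezone(timezone.utc).date()
    return [t for t in bar_times
            if t.astimezone(timezone.utc).date() <= today_utc]


# 18:00 on 10 March at UTC-7 is 01:00 UTC on 11 March.
pacific = timezone(timedelta(hours=-7))
now = datetime(2026, 3, 10, 18, 0, tzinfo=pacific)
bars = [datetime(2026, 3, 10, tzinfo=timezone.utc),
        datetime(2026, 3, 11, tzinfo=timezone.utc)]

kept = drop_future_bars(bars, now=now)
# Comparing against the local date instead would drop the live bar.
local_kept = [t for t in bars if t.date() <= now.date()]
print(len(kept), len(local_kept))
```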

🚦 Market Regimes

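The Viterbi algorithm decodes the most likely sequence of hidden states, assigning one regime label to each daily bar. Entries fire only on the first bar of a new regime; ongoing regime bars produce no new entry signal. The one exception is the Bear exit in Standard Mode, which waits for bear_confirm_days consecutive Bear bars before closing the position.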

Regime	HMM States	Characteristics	Strategy Action
Bull Run	Top 3 by mean return (Sub-states 5, 6, 7)	Strong positive returns, expanding ATR, elevated volume, rising trend_return	Enter Long on transition (if ≥ min confirmations met)
Bear/Crash	Bottom 2 by mean return (Sub-states 1, 2)	Strongly negative returns, panic selling, liquidation cascades	Exit position after `bear_confirm_days` consecutive Bear bars
Neutral/Transition	Remaining 2 (Sub-states 3, 4)	Near-zero returns, sideways / consolidation / regime change	No change — hold if in position, stay flat if not

⚡ Strategy Logic

Transition-Based Entry & Exit

Unlike "always-in" strategies that enter on every Bull bar, this system is transition-triggered: actions fire only on the first bar of a new regime, not on every ongoing bar. The Bear exit requires bear_confirm_days (default 5) consecutive Bear-labeled bars before closing the position — preventing premature exits on single-bar noise. Once Bear is confirmed the position closes immediately (no minimum hold gate).

Transition	Position	Standard Mode	Regime-Only Mode
→ Bull Run	Flat	🟢 Enter Long (if ≥ min_confirms met)	🟢 Enter Long immediately
→ Bull Run	Long	🟢 Hold Long (already in)	🟢 Hold Long
→ Neutral	Long	🟡 Hold Long (no change)	🟡 Hold Long (no change)
→ Neutral	Flat	⚪ Stay flat	🟢 Enter Long (Bear→Neutral transition)
→ Bear/Crash	Long	🔴 Exit after bear_confirm_days consecutive Bear bars	🔴 Exit on first Bear bar
→ Bear/Crash	Flat	🔴 Stay in cash	🔴 Stay in cash
Ongoing Bull (Bull→Bull)	Long	🟢 Hold Long	🟢 Hold Long
Ongoing Bear (Bear→Bear)	Flat	🔴 Stay in cash (already exited)	🔴 Stay in cash (already exited)
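The table can be read as a small state machine. The sketch below is one possible reading of it, not the project's trade loop: trade_signals is a hypothetical name, the first bar is assumed not to count as a transition, and the re-entry cooldown and trailing stop are left out.

```python
BULL, BEAR, NEUTRAL = "Bull Run", "Bear/Crash", "Neutral/Transition"


def trade_signals(regimes, confirms, min_confirms=3, bear_confirm_days=5,
                  regime_only=False):
    """Return (bar_index, action) pairs for a sequence of daily regime labels."""
    signals = []
    in_position = False
    bear_streak = 0
    prev = None
    for i, regime in enumerate(regimes):
        # Count consecutive Bear bars; any other regime resets the streak.
        bear_streak = bear_streak + 1 if regime == BEAR else 0
        transitioned = prev is not None and regime != prev
        if in_position:
            # Regime-Only Mode exits on the very first Bear bar.
            needed = 1 if regime_only else bear_confirm_days
            if regime == BEAR and bear_streak >= needed:
                signals.append((i, "EXIT"))
                in_position = False
        elif transitioned:
            if regime == BULL and (regime_only or confirms[i] >= min_confirms):
                signals.append((i, "ENTER"))
                in_position = True
            elif regime_only and regime == NEUTRAL and prev == BEAR:
                signals.append((i, "ENTER"))
                in_position = True
        prev = regime
    return signals


regimes = [NEUTRAL, BULL, BULL, NEUTRAL, BEAR, BEAR, BULL, BEAR, BEAR, BEAR]
confirms = [0, 4, 4, 2, 1, 1, 5, 0, 0, 0]

standard = trade_signals(regimes, confirms, bear_confirm_days=3)
regime_only = trade_signals(regimes, confirms, regime_only=True)
print(standard)
print(regime_only)
```

In the example, Standard Mode rides through the two-bar Bear dip at bars 4 and 5 because the streak never reaches three, while Regime-Only Mode exits and re-enters.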

Equity Mark-to-Market

At every bar, the equity curve is updated with the position's unrealised P&L (leveraged). This happens before the exit check, so the equity curve accurately reflects the floating value through both Bull and Neutral periods.

if position_open:
    bar_return     = (close - entry_price) / entry_price   # return since entry
    unrealised     = equity * leverage * bar_return        # leveraged floating P&L
    current_equity = equity + unrealised                   # updated every bar

✅ Technical Confirmations (10 Checks)

Entry on a Bull Run transition also requires a minimum number of technical confirmations to be True. The default threshold is 3 out of 10, configurable in the dashboard sidebar.

1 Positive Momentum (14-period)
2 Volatility Expansion (ATR > ATR MA)
3 Volume Above 20-period Average
4 ADX Trending (> 25)
5 Price Above SMA-50
6 MACD Histogram Positive
7 Stochastic %K > %D
8 SMA-20 > SMA-50
9 SMA-50 > SMA-100
10 RSI > 50

Each confirmation can be individually enabled or disabled in the dashboard sidebar. Disabled confirmations are excluded from the count and shown in grey on the scorecard. The entry threshold slider automatically adjusts to the number of enabled confirmations.
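A minimal sketch of the gate described above: disabled checks are dropped before counting, and the threshold is capped at the number of enabled checks. The function name and the short signal keys are hypothetical stand-ins for the ten confirmations.

```python
def confirmation_gate(signals, enabled, min_confirms=3):
    """Count enabled confirmations that are True and compare to the threshold."""
    # Disabled confirmations are excluded from the count entirely.
    active = {name: met for name, met in signals.items() if name in enabled}
    met_count = sum(active.values())
    # The threshold can never exceed the number of enabled checks.
    threshold = min(min_confirms, len(active))
    return met_count, met_count >= threshold


signals = {
    "momentum": True, "atr_expansion": False, "volume": True, "adx": False,
    "above_sma50": True, "macd_hist": False, "stoch": False,
    "sma20_gt_sma50": True, "sma50_gt_sma100": False, "rsi": True,
}
# Two checks switched off in the sidebar.
enabled = set(signals) - {"rsi", "volume"}

print(confirmation_gate(signals, enabled, min_confirms=3))
print(confirmation_gate(signals, enabled, min_confirms=4))
```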

Note: When Regime-Only Mode is active, the entire confirmation gate is bypassed — entry fires immediately on any Bull (or Bear→Neutral) transition. The scorecard remains visible for reference but does not influence trading.

⚡ Regime-Only Mode

Toggling Regime-Only Mode in the sidebar removes the confirmation gate entirely. The strategy trades solely on HMM regime transitions — no technical indicators required. This is useful for benchmarking how much of the edge comes from regime detection alone versus the layered confirmation filters.

Entry Rules (Regime-Only)

Trigger	Action
Any transition into Bull Run	🟢 Enter Long immediately (no confirmation check)
Any transition from Bear/Crash to Neutral	🟢 Enter Long — treats Neutral as a potential recovery

Exit Rules (Regime-Only)

Trigger	Action
First Bear/Crash bar while long	🔴 Exit immediately on the very first Bear bar
Trailing stop fires (if enabled)	🔴 Exit; subject to `min_hold_days` gate

All other sidebar controls (leverage, trailing stop, min hold days, look-back period) remain active in Regime-Only Mode. The confirmation scorecard is still displayed for reference but shown with a note that it has no effect on trading.

🛡️ Risk Management

Parameter	Default	Description
Leverage	1×	Configurable in sidebar: 1× / 2× / 4×. Applied to every position's P&L.
Bear Confirm Days	5	Consecutive Bear-labeled bars required before the strategy exits a long position. Prevents noisy single-bar exits.
Re-entry Cooldown	2 days	After any exit (Bear confirmation or trailing stop), no new entries for 2 calendar days.
Trailing Stop	Disabled	Optional 2% trailing stop-loss. Toggle in dashboard sidebar. The stop tracks the highest close since entry.
Min Hold Days (trailing stop gate)	0	Trailing stop will not fire until the position has been held for at least this many days.
Min Confirmations	3 / 10	Number of enabled technical checks that must be True on a Bull transition bar (Standard Mode only).
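The trailing stop and its hold gate can be sketched as follows. This assumes the hold period is counted in bars since entry and that the stop is evaluated on closes only; the function name and sample prices are hypothetical.

```python
def trailing_stop_exit(closes, entry_index, trail_pct=0.02, min_hold_days=0):
    """Return the index of the bar where the trailing stop fires, or None."""
    highest = closes[entry_index]
    for i in range(entry_index + 1, len(closes)):
        # The stop level follows the highest close seen since entry.
        highest = max(highest, closes[i])
        held = i - entry_index
        if held >= min_hold_days and closes[i] <= highest * (1 - trail_pct):
            return i
    return None


closes = [100, 103, 105, 102.5, 104, 101, 99]
print(trailing_stop_exit(closes, entry_index=0))
print(trailing_stop_exit(closes, entry_index=0, min_hold_days=5))
```

With no hold gate the stop fires at bar 3, the first close at least 2% below the 105 peak; with a 5-bar gate the same stop is deferred to bar 5.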

🔍 Parameter Optimizer

Different assets have different volatility profiles. The Parameter Optimizer grid-searches the three most impactful strategy parameters on the asset's own historical data and surfaces the best combination — maximising Total Return.

Parameters Searched

Parameter	Search Grid	Description
`bear_confirm_days`	1, 2, 3, 4, 5, 6, 7	Consecutive Bear bars before exiting
`min_confirms`	1 → N active confirmations	Minimum technical checks required for entry
`min_hold_days`	0, 3, 5, 7, 10, 14	Minimum days held before trailing stop can fire

Up to 7 × 10 × 6 = 420 combinations are evaluated. Because the HMM is fitted only once (via _prepare()) and reused across all combinations, the full grid-search typically completes in seconds.
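A sketch of the grid enumeration, assuming a callable that replays the trade loop on already-prepared data and returns Total Return. Both grid_search and toy_simulation are hypothetical; the toy objective exists only so the example runs without market data.

```python
from itertools import product


def grid_search(run_simulation, n_active_confirms=10):
    """Evaluate every parameter combination and sort by total return."""
    grid = product(range(1, 8),                       # bear_confirm_days 1..7
                   range(1, n_active_confirms + 1),   # min_confirms 1..N
                   (0, 3, 5, 7, 10, 14))              # min_hold_days
    results = []
    for bear_confirm_days, min_confirms, min_hold_days in grid:
        params = {"bear_confirm_days": bear_confirm_days,
                  "min_confirms": min_confirms,
                  "min_hold_days": min_hold_days}
        # No refit here: only the cheap simulation step is repeated.
        results.append((run_simulation(**params), params))
    results.sort(key=lambda item: item[0], reverse=True)
    return results


def toy_simulation(bear_confirm_days, min_confirms, min_hold_days):
    """Stand-in objective that peaks at (3, 4, 5)."""
    return -(abs(bear_confirm_days - 3) + abs(min_confirms - 4)
             + abs(min_hold_days - 5))


results = grid_search(toy_simulation)
best_return, best_params = results[0]
print(len(results), best_params)
```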

Max Grid Size

420

HMM Fits

1

Objective

Total Return %

Cache TTL

1 hour

🔮 10-Day Kernel Regression Forecast NEW

Instead of predicting a single price target, the forecast asks: "Given today's feature vector, what distribution of 10-day returns did similar historical situations produce?" A Gaussian kernel regression over past bars yields both an expected path and a ±1σ uncertainty band — more useful for position sizing than a point prediction.

Features (per bar)

Feature	Formula	Captures
`vol_surge`	volume / 20-day avg volume	Abnormal volume activity
`atr_ratio`	ATR(14) / close	Normalised realised volatility
`pct_from_high`	(close − 52-week high) / 52-week high	Distance from peak (drawdown)
`momentum_5d`	5-day return	Short-term directional momentum
`regime_num`	+1 Bull / 0 Neutral / −1 Bear	Current HMM regime context

Algorithm

Features are z-score normalised using the training portion of the data (all bars except the last 10). For today's feature vector, a Gaussian kernel weight exp(−d² / (2h²)) is computed against every historical bar, where d is the distance between the two feature vectors and h is the median pairwise distance (bandwidth heuristic). The kernel-weighted mean and standard deviation of the 10-day forward return paths give:

Expected path — kernel-weighted mean of historical 10-day forward returns, projected as price levels
+1σ band — upper confidence bound
−1σ band — lower confidence bound
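The weighting scheme can be sketched with the standard library alone. Several choices here are assumptions rather than facts about the project: Euclidean distance, simple (not log) forward returns when projecting prices, the function name kernel_forecast, and the tiny two-feature dataset with a 2-step horizon in place of 10.

```python
import math
from statistics import median


def kernel_forecast(features, forward_returns, query):
    """Kernel-weighted mean and std of historical forward return paths.

    features:        z-scored feature vectors, one per historical bar
    forward_returns: forward return path for each bar (equal lengths)
    query:           today's z-scored feature vector
    """
    # Bandwidth heuristic: median pairwise distance between historical bars.
    pair_dists = [math.dist(a, b)
                  for i, a in enumerate(features) for b in features[i + 1:]]
    h = median(pair_dists)
    # Gaussian kernel weight of every historical bar relative to today.
    weights = [math.exp(-math.dist(x, query) ** 2 / (2 * h ** 2))
               for x in features]
    total = sum(weights)
    mean_path, std_path = [], []
    for step in range(len(forward_returns[0])):
        values = [path[step] for path in forward_returns]
        mu = sum(w * v for w, v in zip(weights, values)) / total
        var = sum(w * (v - mu) ** 2 for w, v in zip(weights, values)) / total
        mean_path.append(mu)
        std_path.append(math.sqrt(var))
    return mean_path, std_path


# Two clusters of history: one followed by gains, one by losses.
features = [[0.0, 0.0], [0.1, 0.0], [3.0, 3.0], [3.1, 3.0]]
forward_returns = [[0.01, 0.02], [0.01, 0.02], [-0.01, -0.02], [-0.01, -0.02]]
mean_path, std_path = kernel_forecast(features, forward_returns, query=[0.0, 0.0])

# Project the return paths onto price levels anchored at the last close.
last_close = 100.0
expected = [last_close * (1 + m) for m in mean_path]
upper = [last_close * (1 + m + s) for m, s in zip(mean_path, std_path)]
lower = [last_close * (1 + m - s) for m, s in zip(mean_path, std_path)]
print(expected, upper, lower)
```

Because the query sits inside the first cluster, the bars followed by gains receive more weight and the expected path tilts upward.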

Future dates are computed as business days for equities and calendar days for crypto tickers (detected by suffix: -USD, etc.). On the first bar (no prior history), the anchor falls back to using the open price.

Chart Rendering

All three paths are rendered as dashed amber lines anchored to today's close, extending 10 bars into the future. The expected return percentage is shown in the chart legend (e.g. "10-Day Forecast +3.2%"). The ±1σ bands are rendered at reduced opacity (rgba(255,215,64,0.4)) so they don't obscure the price action.

Forecast Horizon

10 days

Features

5

Kernel

Gaussian RBF

Cache TTL

5 min

🔍 Stock Screener UPDATED

scanner.py runs the same GMM-HMM pipeline on every constituent of the selected index universe and surfaces the stocks that have most recently entered a Bull Run regime, ranked by a composite entry-quality score.

Supported Index Universes

Universe	Tickers	Source
S&P 500	~500	Wikipedia table scrape
Nasdaq 100	~100	Wikipedia table scrape
Russell 2000	~2000	iShares IWM holdings CSV

Entry-Quality Score (0–100)

Component	Weight	Description
HMM Bull Probability	40 pts	Posterior probability mass across all Bull states for the most recent bar
Confirmation Signals Met	35 pts	Fraction of all 10 active technical signals satisfied at entry
Freshness (exponential decay)	25 pts	25 × e^{−0.2 × (bars_in_bull − 1)}: day 1 = 25 pts, day 3 ≈ 16.8 pts, day 5 ≈ 11.2 pts
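The three components add up as shown below. The weights and the decay constant come from the table; the function name, argument names, and the zero-division guard are assumptions.

```python
import math


def entry_quality_score(bull_prob, confirms_met, confirms_active, bars_in_bull):
    """Composite 0-100 score: HMM probability + confirmations + freshness."""
    hmm_pts = 40 * bull_prob
    # Guard against every confirmation being disabled.
    confirm_pts = 35 * confirms_met / confirms_active if confirms_active else 0.0
    # Full 25 points on the first Bull bar, decaying with each further bar.
    freshness_pts = 25 * math.exp(-0.2 * (bars_in_bull - 1))
    return hmm_pts + confirm_pts + freshness_pts


# A fresh entry with high Bull probability and 7 of 10 confirmations met.
score = entry_quality_score(bull_prob=0.9, confirms_met=7,
                            confirms_active=10, bars_in_bull=1)
print(round(score, 1))
```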

Dashboard integration

The screener is accessible via the 📡 Backtester / 🔍 Stock Screener radio selector at the top of the sidebar. The index universe is selected from a dropdown (S&P 500, Nasdaq 100, Russell 2000). Results are cached for 2 hours per unique parameter combination.

📡 Dashboard (Streamlit) UPDATED

Running streamlit run dashboard.py launches the full "Regime Terminal" UI. The dashboard caches data for 5 minutes (300s TTL). Any sidebar change triggers an immediate re-run.

🏢 Company Profile Card NEW

Displays just below the page title: sector, industry, market cap, employee count, website link, and a business summary (truncated to 200 words). Fetched via yf.Ticker.info (24-hour cache). Hidden automatically for tickers with no available info (e.g. crypto).

🚦 Current Signal Panel

Shows detected regime badge with sub-state name (e.g. "State 6 · Steady Uptrend") in regime colour, recommended action (colour-coded), live asset price with a Pacific Time (PST/PDT) timestamp, leverage, and confirmation count badge.

💰 Live Quote

Price is fetched via yf.Ticker.fast_info.last_price (60-second cache) rather than the last bar close. The "As of" timestamp reflects the current time in Pacific Time (PST/PDT). The last candle on the chart is also patched with live OHLCV (day_high, day_low, last_price).

📊 State Probabilities

Horizontal bar chart showing the probability of each of the 7 hidden states for the most recent bar. Bars are colour-coded by regime label.

✅ Confirmation Scorecard

All 10 confirmations listed in canonical order with ✅ Met / ❌ Not Met / ⏸ Disabled status badges. Disabled signals are shown in grey.

📈 Price Chart (TradingView-style) UPDATED

lightweight-charts v4.2.1 via CDN. Features: candlestick, regime background bands (solid wash, original green/red/amber at 0.35 opacity), no grid lines, SMA-50 overlay, volume histogram, buy/sell markers (sell shows P&L %), crosshair tooltip with prev-close day change (not open→close), and 10-day kernel regression forecast (dashed amber lines + ±1σ band).

📖 Regime Descriptions

Three regime cards (Bull/Bear/Neutral) plus a 7-state sub-state table ranked from most bearish (State 1 · Crash/Panic) to most bullish (State 7 · Momentum Bull).

💹 Equity Curve

Strategy equity vs. Buy & Hold comparison. Starting capital reference line. Filled area under the strategy curve for visual clarity.

📋 Performance Metrics

Total Return, Buy & Hold Return, Alpha, Max Drawdown, Win Rate, Sharpe Ratio with B&H Sharpe Ratio delta, Final Equity.

🗒️ Trade Log

Full table of every trade: Entry/Exit dates, prices, PnL (% and $), confirmations met at entry, regime transition label. Summary row + CSV download.

🎯 Parameter Optimizer Panel

Grid-searches bear_confirm_days, min_confirms, and min_hold_days to find the combination that maximises Total Return. Shows best-param cards, delta vs. current settings, and a top-10 results table.

Sidebar Controls

Control	Options / Range	Notes
Ticker Symbol	Any Yahoo Finance symbol — always displayed in ALL CAPS	Changing symbol triggers full re-run; ticker is auto-saved to watchlist
Watchlist Dropdown NEW	Saved tickers selectbox (alphabetically sorted)	Selecting a ticker populates the input and reruns. Auto-saves any typed ticker. ➖ Remove from List button. Persisted to `watchlist.json`.
🔄 Refresh Data & Re-run	Button — directly below Ticker Symbol	Clears data cache and re-downloads from yfinance
Look-back Period	365 – 1825 days (slider, step 30, default 365)	Default changed from 730 to 365
Leverage Factor	1× / 2× / 4× (radio)
Trailing Stop	Toggle on/off (2% trail)
Bear Confirm Days	1 – 10 (slider, default 5)	Consecutive Bear bars before position is closed
Min Hold Days	0 – 14 (slider, default 0)	Gates the trailing stop only
Starting Capital	$1,000 – $10,000,000 (number input)
⚡ Regime-Only Mode	Toggle on/off	Bypasses all confirmation gates
Confirmation Filters	10 individual checkboxes + Select All / Deselect All	Disabled in Regime-Only Mode
Min Confirmations Required	Slider 1 → N_enabled	Disabled in Regime-Only Mode
🔍 Optimize for this Asset	Button	Runs grid-search; cached 1 hour per ticker

Chart Cache Architecture

The TradingView chart function _build_tv_chart_html() uses @st.cache_data(ttl=300). Because Streamlit excludes underscore-prefixed arguments from the cache key, the cache key is explicitly formed from four non-underscore parameters:

ticker — busts cache on symbol change
live_key — busts cache when live price changes (e.g. "83412.5")
regime_key — busts cache when last-bar regime changes (e.g. "2026-03-10_Bull Run")
prediction_json — serialised kernel forecast; busts cache when forecast changes

This ensures the chart always reflects the current regime from backtester.df, eliminating stale-chart / fresh-signal mismatches.

Day-Change Tooltip Fix FIXED

The OHLC crosshair tooltip previously computed daily change as close − open. It now computes close − previous bar's close, matching the convention used by most charting platforms. A prevCloseMap (built from CANDLE_DATA at chart init time) provides O(1) lookup per bar.
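The lookup idea is shown here in Python for consistency with the rest of the project, although the dashboard's map presumably lives in the chart's JavaScript. The candle layout, the function names, and the fallback to the open price on the first bar are assumptions.

```python
def build_prev_close_map(candles):
    """Map each bar's time to the close of the bar before it."""
    prev_close = {}
    for previous, current in zip(candles, candles[1:]):
        prev_close[current["time"]] = previous["close"]
    return prev_close


def day_change(candle, prev_close):
    """Return (absolute, percent) change versus the previous bar's close."""
    # The first bar has no predecessor, so fall back to its own open.
    base = prev_close.get(candle["time"], candle["open"])
    change = candle["close"] - base
    return change, 100 * change / base


candles = [
    {"time": "2026-03-09", "open": 100.0, "close": 102.0},
    {"time": "2026-03-10", "open": 103.0, "close": 101.0},
]
prev_close = build_prev_close_map(candles)
print(day_change(candles[1], prev_close))
```

For the second bar the change is −1.0 against the prior close of 102.0, whereas the old open-to-close calculation would have reported −2.0.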

🚀 Setup & Run

1 — Install dependencies

pip install -r requirements.txt

2 — (Optional) Run standalone data check

python data_loader.py

3 — (Optional) Run backtest in terminal

python backtester.py

4 — Launch the full dashboard

streamlit run dashboard.py

The dashboard will open in your default browser at http://localhost:8501. On first load it downloads 365 calendar days of daily candles (plus the 150-day warmup) and fits the GMM-HMM (~10–30 seconds). Subsequent loads use the 5-minute cache.

Requirements

yfinance
hmmlearn
pandas
numpy
scikit-learn
streamlit
plotly
pandas-ta

The price chart uses lightweight-charts v4.2.1 loaded from the unpkg CDN at runtime — no additional pip install required.