# The Investor's Indicator Catalog: A Reference for Signals Across Macro, Markets, Equities, Fixed Income, Commodities, and Crypto

## TL;DR
- This is a breadth-first catalog of the economic and financial indicators that matter for portfolio construction, organized by category. Each entry gives what the indicator measures, its directional signal across asset classes, release/source details, and a candid assessment of reliability. The single most important organizing principle is that **most individual indicators are noisy and regime-dependent, so their value comes from combining them (diffusion indices, z-scores, multi-signal confirmation) rather than trusting any one in isolation.**
- The highest-conviction, best-documented signals are: the **yield curve** (10y–3m and 10y–2y) and **Conference Board LEI** for recession lead; **credit spreads (HY OAS)** and **financial conditions indices (NFCI)** for risk regime; the **Sahm rule** for real-time recession onset; **CAPE/valuation** for long-horizon (10y) equity returns but *not* short-term timing; and the academically robust **factor premia** (value, momentum, quality/profitability, low-vol), each with documented failure modes.
- For the quant/data-mining workflow, the dominant pitfalls are **look-ahead bias** (use point-in-time/vintage data — ALFRED, not revised FRED), **survivorship bias**, **regime change/non-stationarity**, and **multiple-testing / the "factor zoo"**; indicators should be transformed into stationary, comparable features (YoY changes, z-scores, percentile ranks, diffusion indices) before being fed to models like XGBoost.

## Key Findings

**Macro indicators split cleanly into leading, coincident, and lagging.** Leading: yield curve, LEI, building permits, ISM new orders, initial jobless claims, stock prices, consumer expectations. Coincident: nonfarm payrolls, industrial production, personal income, manufacturing/trade sales (these define the business cycle and NBER recession dating). Lagging: unemployment rate, average duration of unemployment, CPI (services CPI in particular). A core trap is treating lagging indicators (unemployment, inflation) as forward signals.

**Market-based indicators usually lead survey/government data because they aggregate forward-looking capital.** Credit spreads, the yield curve, the VIX/MOVE, and financial conditions indices reflect real-time repricing and typically move months before official statistics confirm a turn. This is why they dominate "nowcasting" and regime models.

**Equity factors are real but cyclical.** Value, size, momentum, quality/profitability, and low-volatility have decades of academic support and exist across geographies and asset classes — but each suffers multi-year droughts (value 2010s; momentum crashes in sharp reversals like 2009 and March 2020). The Fama-French 5-factor model (2015) added profitability (RMW) and investment (CMA) and rendered value (HML) statistically redundant in the original sample; profitability is the most robust factor, size the weakest.

**Crypto on-chain metrics are genuinely informative for cycle extremes but immature and noisy.** MVRV, realized cap, SOPR, and NVT have identified Bitcoin cycle tops/bottoms historically, but with short histories (few cycles), high noise, definitional drift, and vulnerability to exchange/custody artifacts. Treat them as cycle context, not precise timing.

## Details

### 1. MACROECONOMIC INDICATORS

#### Growth / Output
- **GDP & GDP growth rate** — Broadest measure of output. Quarterly (US BEA), released with three vintages (advance, second, third) plus annual revisions. Coincident-to-lagging; heavily revised, so the *advance* estimate is noisy. Rising real GDP growth is broadly equity-positive and (if it stokes inflation) bond-negative; accelerating growth tends to steepen the curve. Limitation: backward-looking and revised substantially.
- **GNP** — GDP plus net income from abroad; largely superseded by GDP/GNI for most market work.
- **Industrial production & capacity utilization** — Monthly (Federal Reserve G.17). Coincident; IP is a CEI component. Capacity utilization above ~80% historically flags building inflationary/capex pressure. Cyclical sectors (industrials, materials, energy) are most sensitive.
- **Retail sales** — Monthly (US Census). Coincident with a slight lead on consumption; "control group" feeds GDP nowcasts. Volatile, revised; watch real (inflation-adjusted) vs nominal. Strong sales support consumer discretionary and can pressure bonds via growth/inflation.
- **Durable goods orders** — Monthly (Census). Leading for capex; "core" (nondefense capital goods ex-aircraft) is the cleaner signal and is an LEI input. Extremely volatile month to month (aircraft orders).

#### Labor Market
- **Nonfarm payrolls (NFP)** — Monthly (BLS Employment Situation, first Friday). The single most market-moving release; coincident but the headline drives rate expectations. Subject to large revisions and annual benchmark revisions (e.g., sizable downward benchmark revisions have repeatedly surprised markets). Strong NFP → higher yields, stronger dollar, often initially equity-positive unless it forces Fed hawkishness.
- **Unemployment rate (U3)** — Monthly (BLS, household survey). Lagging. Its *change* is the key: see Sahm rule.
- **Initial jobless claims** — Weekly (DOL). One of the timeliest leading labor indicators; an LEI component (inverted). A sustained rise toward/above ~250–300k historically flags labor deterioration. Noisy weekly; use 4-week MA.
- **Labor force participation rate** — Monthly. Structural/demographic; contextualizes the unemployment rate.
- **JOLTS (job openings)** — Monthly (BLS), released with a ~6-week lag. Openings/unemployed ratio is a labor-tightness gauge the Fed watches; the vacancy-rate angle underlies the Michaillat-Saez "Michez" two-sided recession rule.
- **Average hourly earnings / wage growth** — Monthly (NFP report) and the quarterly Employment Cost Index (cleaner, composition-adjusted). Wage acceleration is an inflation lead and bond-negative.

#### Inflation
- **CPI / Core CPI** — Monthly (BLS). The most market-moving inflation print; core excludes food & energy. Hot CPI → higher yields, hawkish Fed, dollar up, equities pressured (especially long-duration/growth). Limitation: shelter component lags market rents by ~12 months.
- **PCE / Core PCE** — Monthly (BEA). The Fed's *preferred* gauge (2% target is on headline PCE); broader, chain-weighted, lower-weighted on shelter than CPI. Core PCE is the policy-relevant series.
- **PPI** — Monthly (BLS). Pipeline/upstream inflation; can lead CPI for goods and feeds PCE components.
- **GDP deflator** — Quarterly; broadest economy-wide price measure.
- **Breakeven inflation (5y, 10y)** — Market-implied: **breakeven = nominal Treasury yield − TIPS real yield** (FRED T10YIE = DGS10 − DFII10). Daily, real-time inflation expectations. Rising breakevens signal rising inflation expectations (commodity-positive, bond-negative).
- **5y5y forward inflation swap / University of Michigan inflation expectations** — Forward-looking expectation gauges; the Fed weights "well-anchored" long-run expectations heavily. UMich is survey-based and can be volatile/partisan.
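The breakeven identity above is simple arithmetic, and a small sketch makes the units explicit. The yields below are made-up illustrative inputs, not current readings, and `forward_5y5y_approx` is a hypothetical helper that applies a linear approximation to breakevens (it ignores compounding and is not the swap-market quote itself).

```python
def breakeven(nominal_yield_pct: float, tips_real_yield_pct: float) -> float:
    """Breakeven inflation in percentage points: nominal yield minus TIPS real yield."""
    return nominal_yield_pct - tips_real_yield_pct


def forward_5y5y_approx(be_5y_pct: float, be_10y_pct: float) -> float:
    """Linear approximation of the 5y5y forward breakeven (assumption: no compounding)."""
    return 2.0 * be_10y_pct - be_5y_pct


# Hypothetical yields, in percent.
be_5y = breakeven(4.10, 1.70)
be_10y = breakeven(4.25, 1.95)
fwd_5y5y = forward_5y5y_approx(be_5y, be_10y)
print(f"5y BE {be_5y:.2f}%, 10y BE {be_10y:.2f}%, approx 5y5y {fwd_5y5y:.2f}%")
```

A 10y breakeven below the 5y breakeven, as in this toy input, implies the market prices lower inflation in years 6 to 10 than in years 1 to 5.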

#### Monetary Policy & Liquidity
- **Federal funds rate / policy rates** — The risk-free anchor. Cuts are generally risk-asset-positive (unless recessionary), hikes the reverse. Forward guidance and the "dot plot" matter as much as the level.
- **Money supply M1/M2** — Monthly (Fed, FRED M2SL). Per the St. Louis Fed ("The Rise and Fall of M2," May 2023), M2 YoY growth hit **26.9% in February 2021**, a rate that "easily exceeds the rates of growth during either the quantitative easing programs of 2008-15 or the inflations of the 1970s and 1980s" and the highest since modern record-keeping began in 1959. Headline PCE inflation peaked "almost 18 months after the peak of M2 growth," and M2 YoY growth then turned negative in late 2022 for the first time in decades. The Fed does not target M2, and the money→inflation link is loose outside extreme episodes (the pandemic combined record money growth with severe supply shortages).
- **Fed balance sheet / QE-QT** — Weekly (H.4.1 release, FRED WALCL). The balance sheet grew from ~$0.9T (2007) to ~$9T (April 2022 peak), then contracted under QT. QE works largely by compressing the term premium and pushing investors into risk assets (portfolio-balance channel); QT reverses it. "Net liquidity" proxies (WALCL − TGA − ON RRP) are widely tracked by traders.
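The "net liquidity" proxy mentioned above is a plain subtraction, sketched below. The weekly figures are hypothetical and assumed to be in the same unit (USD billions); published series may use different units, so convert before subtracting.

```python
def net_liquidity(fed_assets_bn: float, tga_bn: float, on_rrp_bn: float) -> float:
    """Fed balance sheet minus Treasury General Account minus ON RRP, all in USD bn."""
    return fed_assets_bn - tga_bn - on_rrp_bn


# Hypothetical weekly observations: (Fed assets, TGA, ON RRP), USD bn.
weekly = [
    (7500.0, 700.0, 900.0),
    (7480.0, 750.0, 850.0),
]
levels = [net_liquidity(*row) for row in weekly]
change = levels[-1] - levels[0]
print(levels, change)
```

Traders typically watch the change rather than the level, since a falling proxy is read as liquidity draining from risk assets.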

#### Leading / Coincident / Lagging Composites
- **Conference Board LEI** — Monthly. Ten components (avg weekly manufacturing hours; initial claims; consumer-goods new orders; ISM new orders; core capital-goods orders; building permits; S&P 500; Leading Credit Index; 10y–fed funds spread; consumer expectations). Per The Conference Board, the LEI "is a predictive variable that anticipates (or 'leads') turning points in the business cycle by around seven months"; its "3Ds rule" triggers when the 6-month diffusion index ≤50 and the annualized 6-month growth rate falls below −4.3%. Strong recession-signaling record but prone to false positives (its prolonged 2022–2023 decline did not produce an NBER recession on the historically implied schedule).
- **Yield curve (10y–2y, 10y–3m)** — The premier recession lead. The 10y–3m spread is the academic favorite (NY Fed model); inversion has preceded every US recession since the 1960s with 6–24 month lead, though with notable false-signal risk (the Oct-2022 inversion had not produced a recession 2.5 years later as of early 2025). Distorted by a low/negative term premium.
- **ISM/PMI manufacturing & services** — Monthly (ISM, first business days; S&P Global flash mid-month). Diffusion indices centered at 50 (>50 expansion). New-orders subindex leads. Per the ISM Manufacturing PMI® Report: "A Manufacturing PMI® above 42.3 percent, over a period of time, indicates that the overall economy, or gross domestic product (GDP), is generally expanding; below 42.3 percent, it is generally declining" — i.e., manufacturing can contract (sub-50) while the broader economy still grows. ISM also publishes a specific GDP read each month (e.g., December 2025's 47.9% "corresponds to a 1.6-percent increase in real GDP on an annualized basis"). A beat is generally equity-positive/bond-negative; falling below 50 for several months flags slowdown.
- **Consumer confidence / sentiment** — Conference Board Consumer Confidence (more labor-weighted) and University of Michigan Consumer Sentiment (more inflation/finances-weighted). Soft data; weak short-term market predictor but useful for consumption turning points.
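The LEI trigger described above (6-month diffusion at or below 50 together with annualized 6-month growth below −4.3%) can be sketched as follows. This is a simplified illustration, not The Conference Board's published procedure: the compounding used to annualize, the half-credit treatment of flat components, the `tol` band, and all input numbers are assumptions.

```python
def six_month_annualized_growth(index_levels: list[float]) -> float:
    """Annualized % change over the last six months (assumption: simple compounding)."""
    now, six_ago = index_levels[-1], index_levels[-7]
    return ((now / six_ago) ** 2 - 1.0) * 100.0


def diffusion_index(component_changes: list[float], tol: float = 0.05) -> float:
    """Share of components rising, 0-100; near-flat components count as one half."""
    score = 0.0
    for chg in component_changes:
        if chg > tol:
            score += 1.0
        elif chg >= -tol:
            score += 0.5
    return 100.0 * score / len(component_changes)


def lei_signal(index_levels, component_changes, growth_floor=-4.3) -> bool:
    """True when both the diffusion and the depth condition are met."""
    return (diffusion_index(component_changes) <= 50.0
            and six_month_annualized_growth(index_levels) < growth_floor)


# Hypothetical data: seven monthly index levels and ten 6-month component changes (%).
levels = [105.0, 104.5, 104.0, 103.4, 102.9, 102.3, 101.8]
changes = [0.4, 0.2, 0.1, -0.3, -0.5, -0.2, -0.8, -0.1, -0.6, -0.4]
growth = six_month_annualized_growth(levels)
diffusion = diffusion_index(changes)
signal = lei_signal(levels, changes)
print(round(growth, 2), diffusion, signal)
```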

#### Housing
- **Housing starts & building permits** — Monthly (Census). Permits lead starts and are an LEI component; rate-sensitive, so they lead the cycle.
- **Existing vs new home sales** — Existing home sales (NAR, ~90% of market) are lagging because they count closings; new home sales (Census) are leading because they count signings.
- **Case-Shiller home price index** — Lagging; ~2-month publication lag, computed as a 3-month MA.
- **NAHB Housing Market Index** — Monthly homebuilder sentiment; leading.

#### Trade / External & Fiscal
- **Trade balance / current account** — Monthly/quarterly. Persistent deficits can pressure a currency over time; capital-flow data (TIC) shows foreign demand for US assets.
- **Terms of trade** — Matter most for commodity-exporter currencies (AUD, CAD, BRL).
- **Government debt/deficits & fiscal impulse** — The *change* in the structural deficit (fiscal impulse) is the growth-relevant signal; large issuance affects the term premium and long yields.

### 2. MARKET-BASED & FINANCIAL INDICATORS

#### Rates & the Yield Curve
- **Treasury yields across maturities** — The backbone. Level, slope, and curvature each carry information.
- **Real yields (TIPS)** — FRED DFII10 (10y TIPS). Rising real yields compress valuations of long-duration/growth equities (higher discount rate on distant cash flows) and are strongly negative for gold (Erb & Harvey estimated the real-yield/gold correlation near −0.82; gold pays no yield so rising real rates raise its opportunity cost). Falling/negative real yields fueled the 2019–2021 gold and growth-stock rallies.
- **Term premium** — The NY Fed **ACM model** (Adrian, Crump, Moench) decomposes yields into expected-rate-path + term premium, defined verbatim as "the compensation that investors require for bearing the risk that short-term Treasury yields do not evolve as they expected." It has averaged ~1.6% since 1961 with an all-time peak of 5.15% in May 1984; it "turned negative for the first time in the recorded series in 2014 — and stayed negative or near zero for most of the decade that followed." A depressed term premium "confounds" the yield-curve recession signal (BIS), because it can invert the curve for non-recessionary reasons.

#### Credit Markets
- **Credit spreads (IG & HY OAS)** — FRED BAMLH0A0HYM2 (ICE BofA US HY OAS), BAMLC0A0CM (IG). The most reliable early warning for recessions/risk-off; HY spreads widen months before equities peak. Rough framework from historical data: <300bps complacency, 300–450 normal, 450–600 caution, >600 historically preceded recession within 12–18 months ~85% of the time; sustained crossing of 800bps coincided with NBER recessions in 5 of 6 episodes 1997–2025. HY widening while IG holds = idiosyncratic stress; both widening = systemic. Caveat: index credit-quality composition shifts over time complicate cross-cycle comparisons.
- **TED spread** — 3m LIBOR − 3m T-bill; classic bank-funding-stress gauge, now largely supplanted by SOFR-based spreads post-LIBOR.
- **CDS spreads** — Issuer/sovereign default-risk pricing; CDX/iTraxx indices for the market.
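The HY OAS threshold framework above maps naturally onto a small classifier. In this sketch the "stress" and "crisis" labels, the boundary handling at exactly 450/600/800, and the widening thresholds in `stress_type` are illustrative assumptions. FRED publishes these OAS series in percent, so multiply by 100 to get basis points first.

```python
def hy_oas_regime(oas_bps: float) -> str:
    """Bucket a high-yield OAS reading using the 300/450/600/800 bps framework."""
    if oas_bps < 300:
        return "complacency"
    if oas_bps < 450:
        return "normal"
    if oas_bps <= 600:
        return "caution"
    if oas_bps <= 800:
        return "stress"   # hypothetical label for the >600 bps zone
    return "crisis"       # hypothetical label for the >800 bps zone


def stress_type(hy_change_bps: float, ig_change_bps: float,
                hy_widen_bps: float = 50.0, ig_widen_bps: float = 15.0) -> str:
    """HY widening alone = idiosyncratic; HY and IG widening together = systemic."""
    hy_wide = hy_change_bps >= hy_widen_bps
    ig_wide = ig_change_bps >= ig_widen_bps
    if hy_wide and ig_wide:
        return "systemic"
    if hy_wide:
        return "idiosyncratic"
    return "none"


print(hy_oas_regime(4.10 * 100))   # a 4.10% reading converted to bps
print(stress_type(80.0, 5.0))
```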

#### Volatility
- **VIX** — 30-day implied vol of S&P 500 from SPX options ("fear gauge"). Spikes in risk-off; mean-reverting. Note implied typically exceeds subsequent realized vol (variance risk premium). A low VIX = complacency, not safety.
- **MOVE index** — The "VIX for Treasuries"; 1-month implied vol across the curve. Highly sensitive to central-bank/policy uncertainty and convexity hedging. Rising MOVE tightens financial conditions broadly.
- **Realized vs implied vol** — The gap (variance risk premium) is itself a signal; a wide VIX-minus-realized gap means options markets price risk the cash market hasn't acknowledged.
- **VVIX** — Vol-of-vol (implied vol of VIX options). Extremely high VVIX (≥~125) has historically preceded VIX declines (mean reversion); VVIX spikes with a flat VIX are a divergence/hedging-demand signal.
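The implied-minus-realized gap discussed above can be computed from a price history and an implied-vol quote. This sketch works in volatility points (the strict variance risk premium is defined on variances), assumes 252 trading days per year, and uses made-up prices.

```python
import math
import statistics


def realized_vol_pct(daily_closes: list[float], trading_days: int = 252) -> float:
    """Annualized realized volatility (%) from the sample stdev of daily log returns."""
    log_rets = [math.log(b / a) for a, b in zip(daily_closes, daily_closes[1:])]
    return statistics.stdev(log_rets) * math.sqrt(trading_days) * 100.0


def implied_minus_realized(implied_vol_pct: float, realized_pct: float) -> float:
    """Positive values mean options price more risk than the cash market delivered."""
    return implied_vol_pct - realized_pct


# Hypothetical closes and a hypothetical VIX-style quote of 18.
closes = [100.0, 101.0, 100.2, 100.9, 99.8, 100.5, 101.3, 100.7]
rv = realized_vol_pct(closes)
gap = implied_minus_realized(18.0, rv)
print(f"realized {rv:.1f}%, gap {gap:.1f} vol points")
```

In practice the realized window (for example 21 trading days) should match the horizon of the implied measure.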

#### Liquidity / Funding & Financial Conditions
- **SOFR / repo rates** — Secured overnight funding; spikes (e.g., Sept 2019 repo) signal reserve scarcity. Watch alongside the Fed's reserve/ON-RRP balances.
- **Financial conditions indices** — **Chicago Fed NFCI** (105 indicators across money, debt, equity, and shadow-banking markets; mean 0, SD 1; positive = tighter than average; weekly). The adjusted ANFCI strips out the business-cycle component. **GS FCI** and **Bloomberg FCI** are proprietary alternatives. Tightening FCIs precede growth slowdowns and risk-asset weakness.

#### Currency
- **DXY dollar index** — ICE-published geometric basket: euro 57.6%, yen 13.6%, pound 11.9%, CAD 9.1%, krona 4.2%, franc 3.6%. Effectively a EUR/USD proxy; contains no yuan or EM currencies. A strong dollar pressures US multinational earnings (per Apollo's Torsten Sløk, "41% of revenues in the S&P 500 come from abroad"), tightens global financial conditions, pressures EM (dollar-debt servicing) and commodities/gold (priced in USD). The **"dollar smile"** (Stephen Jen): USD strengthens in both global risk-off (safe-haven Treasury demand) and US outperformance, weakening in calm synchronized global growth.
- **Real effective exchange rate (REER), carry, PPP** — REER (BIS) for trade-weighted valuation; carry (rate differentials) drives FX returns but crashes in risk-off; PPP for long-run fair value.
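The geometric-basket construction above can be reproduced from six FX quotes. The exponents are the weights listed above; the scaling constant 50.14348112 and the quote conventions (EUR and GBP quoted as USD per unit, hence negative exponents) follow the commonly published index formula and should be checked against ICE's methodology. The FX rates are hypothetical.

```python
DXY_CONSTANT = 50.14348112
DXY_EXPONENTS = {
    "EURUSD": -0.576,  # USD per EUR, so a stronger euro lowers the index
    "USDJPY": 0.136,
    "GBPUSD": -0.119,  # USD per GBP
    "USDCAD": 0.091,
    "USDSEK": 0.042,
    "USDCHF": 0.036,
}


def dollar_index(rates: dict[str, float]) -> float:
    """Weighted geometric product of the six FX rates times the scaling constant."""
    value = DXY_CONSTANT
    for pair, exponent in DXY_EXPONENTS.items():
        value *= rates[pair] ** exponent
    return value


# Hypothetical quotes.
quotes = {"EURUSD": 1.08, "USDJPY": 150.0, "GBPUSD": 1.27,
          "USDCAD": 1.36, "USDSEK": 10.5, "USDCHF": 0.88}
base = dollar_index(quotes)
stronger_euro = dollar_index({**quotes, "EURUSD": 1.12})
print(round(base, 2), round(stronger_euro, 2))
```

Because the euro exponent is more than four times any other, a move in EUR/USD dominates the index, which is why it behaves like a EUR/USD proxy.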

#### Sentiment / Positioning
- **AAII sentiment** — Weekly retail survey (bull/bear/neutral). Contrarian: extreme bearishness (>60% bears, as in April 2025) has preceded above-average forward returns; extreme bullishness precedes mediocre returns. Modest standalone power.
- **Put/call ratios** — CBOE equity-only is the cleaner sentiment read. Contrarian at extremes; best with a 21-day MA (McMillan). Not a precise timing tool.
- **CFTC Commitment of Traders (COT)** — Weekly futures positioning (commercials/hedgers vs large speculators vs small). Extreme spec positioning is a contrarian/medium-term signal across commodities, FX, and rates.
- **Fund flows, margin debt, short interest** — Flows and margin debt as "dumb money"/leverage gauges; high short interest is a contrarian/squeeze setup. All best as extremes/contrarian context.

### 3. EQUITY-SPECIFIC INDICATORS & FACTORS

#### Valuation
- **P/E (trailing & forward)** — Forward P/E is the standard street multiple; depends on (often optimistic) estimates. **Shiller CAPE** (price / 10-year inflation-adjusted average earnings) is the premier *long-horizon* valuation gauge: it has explained a large share of the variation in subsequent 10-year real returns (R² often cited ~0.5–0.8 in post-war samples), but it is useless for short-term timing, has trended structurally higher (accounting changes, lower rates), and critics note it persistently underestimated returns post-2009. **P/B, P/S, EV/EBITDA** for cross-sectional/relative value; EV/EBITDA is capital-structure-neutral. **Dividend yield & earnings yield** (E/P). **Equity risk premium** (earnings yield − real bond yield) and the **"Fed model"** (E/P vs 10y yield) — the Fed model is widely criticized for comparing a real (E/P) to a nominal yield.
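The CAPE and equity-risk-premium definitions above reduce to a few lines. This sketch simplifies by averaging ten annual real-earnings figures (the standard series averages monthly data), uses 1/CAPE as the earnings yield as an illustrative choice, and runs on hypothetical inputs.

```python
def cape(real_price: float, real_annual_earnings: list[float]) -> float:
    """Price divided by the average of the last ten years of inflation-adjusted EPS."""
    last_ten = real_annual_earnings[-10:]
    if len(last_ten) < 10:
        raise ValueError("need at least ten years of real earnings")
    return real_price / (sum(last_ten) / 10.0)


def equity_risk_premium(earnings_yield_pct: float, real_bond_yield_pct: float) -> float:
    """Earnings yield minus real bond yield, in percentage points."""
    return earnings_yield_pct - real_bond_yield_pct


# Hypothetical inputs: index at 3000, flat real EPS of 100, real 10y yield of 2%.
ratio = cape(3000.0, [100.0] * 10)
erp = equity_risk_premium(100.0 / ratio, 2.0)
print(ratio, round(erp, 2))
```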

#### Quality / Profitability & Solvency
- **ROE, ROA, ROIC** — Profitability/efficiency; ROIC vs WACC is the value-creation test. **Profit margins, FCF yield** — FCF yield is a cash-based quality/value hybrid. **Debt/equity, interest coverage** — leverage/solvency.
- **Piotroski F-score** (0–9, nine binary tests across profitability, leverage/liquidity, efficiency) — designed to separate strong from weak *value* stocks; high-F-score value stocks historically outperformed. **Altman Z-score** — bankruptcy-risk model (working capital, retained earnings, EBIT, market value, sales); ~72% accurate two years out in original 1968 test, ~80–90% one year out in later samples, with meaningful false-positive rates. Criticized as descriptive rather than truly predictive.
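The five Altman inputs named above combine into a single score. The sketch below uses the coefficients and cutoffs of the original 1968 model for public manufacturers (each input scaled by total assets, except market value of equity, which is scaled by total liabilities); the balance-sheet figures are hypothetical.

```python
def altman_z(working_capital: float, retained_earnings: float, ebit: float,
             market_value_equity: float, sales: float,
             total_assets: float, total_liabilities: float) -> float:
    """Original 1968 Z-score for publicly traded manufacturers."""
    return (1.2 * working_capital / total_assets
            + 1.4 * retained_earnings / total_assets
            + 3.3 * ebit / total_assets
            + 0.6 * market_value_equity / total_liabilities
            + 1.0 * sales / total_assets)


def z_zone(z: float) -> str:
    """Conventional cutoffs: above 2.99 safe, below 1.81 distress, grey in between."""
    if z > 2.99:
        return "safe"
    if z < 1.81:
        return "distress"
    return "grey"


# Hypothetical firm, all figures in the same currency unit.
z = altman_z(working_capital=200.0, retained_earnings=300.0, ebit=150.0,
             market_value_equity=800.0, sales=1200.0,
             total_assets=1000.0, total_liabilities=500.0)
print(round(z, 3), z_zone(z))
```

Later variants of the model re-estimate the coefficients for private and non-manufacturing firms, so the cutoffs here should not be applied outside the original setting.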

#### Growth & Estimates
- **Earnings/revenue growth, earnings revisions, analyst estimates** — Earnings-revision breadth (up vs down revisions) is a documented momentum-adjacent signal; estimate dispersion proxies uncertainty.

#### Factor Investing
- **Value (HML)** — Cheap (high book-to-market) beats expensive long-run; deep drawdown through the 2010s. Modern critiques argue book value is impaired for intangible-heavy firms.
- **Size (SMB)** — Small beats large; the weakest/least robust factor, possibly subsumed by quality (small junk underperforms).
- **Momentum (UMD/WML)** — 12-1 month (skip the most recent month to avoid short-term reversal); Jegadeesh-Titman (1993), "Returns to Buying Winners and Selling Losers," found the winner-minus-loser portfolio "generated average returns of about 1.31% per month," with a long-short top-minus-bottom-decile portfolio earning ~12%/yr over the 1965–1989 sample; replicated in 40+ markets and across asset classes (Asness-Moskowitz-Pedersen "Value and Momentum Everywhere"). Known failure mode: **momentum crashes** in sharp market reversals (1932, 2009, March 2020); volatility-scaling mitigates. Not explained by CAPM or FF3.
- **Quality / Profitability (RMW)** — Robust-minus-weak operating profitability; the most robust of the FF5 additions (cited ~4.7%/yr in the original study). **Investment (CMA)** — conservative (low asset-growth) beats aggressive.
- **Low-volatility / low-beta** — Low-risk stocks earn higher risk-adjusted returns (the "low-vol anomaly"), contradicting CAPM; not in FF5, a noted shortcoming.
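- Measurement note: factors are built as long-short, sorted portfolios (e.g., Fama-French HML is the average of the two high book-to-market portfolios minus the average of the two low book-to-market portfolios, from 2×3 size/book-to-market sorts with 30th/70th percentile breakpoints); factor returns are available from the Kenneth French Data Library.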
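The 12-1 momentum signal described above skips the most recent month. A minimal sketch, assuming a list of month-end prices with the latest observation last; the tickers and prices are hypothetical.

```python
def momentum_12_1(month_end_prices: list[float]) -> float:
    """Return from 12 months ago to 1 month ago, skipping the most recent month."""
    if len(month_end_prices) < 13:
        raise ValueError("need at least 13 month-end prices")
    return month_end_prices[-2] / month_end_prices[-13] - 1.0


def rank_by_momentum(price_histories: dict[str, list[float]]) -> list[str]:
    """Tickers sorted from strongest to weakest 12-1 momentum."""
    scores = {t: momentum_12_1(p) for t, p in price_histories.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical histories: 13 month-end prices each.
histories = {
    "WINR": [100.0] + [110.0] * 10 + [120.0, 90.0],   # strong until last month
    "LOSR": [100.0] + [95.0] * 10 + [85.0, 105.0],    # weak until last month
}
ranking = rank_by_momentum(histories)
print(ranking, round(momentum_12_1(histories["WINR"]), 2))
```

The final-month moves (a drop for WINR, a bounce for LOSR) do not affect the ranking, which is the point of the skip.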

#### Technical Indicators
- **Moving averages & crossovers** — 50/200-day; the "golden cross" (50 above 200) and "death cross." Trend/lagging; the 200-day MA is a widely used long-term regime filter with some evidence of drawdown reduction (time-series momentum). Whipsaws in choppy markets.
- **RSI** — Momentum oscillator (overbought >70 / oversold <30). **MACD** — moving-average convergence/divergence; explicitly lagging, so crossovers can buy late. **Bollinger Bands** — volatility envelopes (mean reversion / squeeze). **ADX** — trend strength (not direction). **OBV** — volume-based accumulation/distribution.
- **Breadth indicators** — **Advance/decline line**, **% of stocks above 200-day MA**, **new highs−new lows**, **McClellan oscillator** (19-day EMA − 39-day EMA of net advances) and Summation Index. Breadth divergences (index rising on narrowing participation) are classic warning signs.
- Evidence note: academic findings on technical rules are mixed and regime/market dependent. Some studies find optimized MA/MACD rules beat buy-and-hold in certain (often less efficient) markets, but profitability tends to decay after publication and after transaction costs; survivorship and data-snooping inflate reported results. Treat technicals as risk-management/timing overlays and confirmation, not standalone alpha.
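As one concrete instance of the breadth measures above, the McClellan oscillator is the 19-day EMA minus the 39-day EMA of net advances. In this sketch the EMA uses the common smoothing factor 2/(span+1) and is seeded with the first observation, both of which are assumptions; the breadth data are synthetic.

```python
def ema(values: list[float], span: int) -> list[float]:
    """Exponential moving average with alpha = 2 / (span + 1), seeded at values[0]."""
    alpha = 2.0 / (span + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(alpha * v + (1.0 - alpha) * out[-1])
    return out


def mcclellan_oscillator(advances: list[int], declines: list[int]) -> list[float]:
    """19-day EMA minus 39-day EMA of daily net advances (advances - declines)."""
    net = [float(a - d) for a, d in zip(advances, declines)]
    fast, slow = ema(net, 19), ema(net, 39)
    return [f - s for f, s in zip(fast, slow)]


# Synthetic data: participation improves steadily over 60 sessions.
adv = [1500 + 10 * i for i in range(60)]
dec = [1500 - 10 * i for i in range(60)]
osc = mcclellan_oscillator(adv, dec)
print(round(osc[-1], 1))
```

A positive reading means short-run breadth is running ahead of its longer-run average; a divergence shows up when the index makes new highs while the oscillator trends lower.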

#### Earnings / Corporate
- **Earnings surprises & guidance** — Post-earnings-announcement drift (PEAD) is a documented anomaly: positive surprises tend to be followed by continued drift. Guidance often moves stocks more than the print. **Buybacks** (shareholder yield), **insider transactions** (cluster buying is a mild bullish signal).

### 4. FIXED INCOME INDICATORS
- **Duration & convexity** — Duration = price sensitivity to yield (interest-rate risk); convexity = the curvature/second-order correction (positive convexity benefits bondholders in large moves; MBS have negative convexity). **Yield to maturity** — the standard total-return-if-held proxy.
- **Real vs nominal yields, breakevens** — As above (TIPS, T10YIE). Real yields drive cross-asset valuation.
- **Credit quality & rating migration** — Rating transitions/downgrade cycles; "fallen angels" (IG→HY) and rising-star dynamics. Downgrade waves are credit-cycle markers.
- **Curve shape as regime signal** — Inversion (recession lead), bull/bear steepening/flattening: bull steepening (short rates falling faster — Fed easing) is typically early-recovery/risk-on; bear flattening (short rates rising — Fed tightening) is late-cycle. Disinversion (re-steepening) after inversion often immediately precedes the recession.
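The bull/bear and steepening/flattening vocabulary above can be made mechanical. This sketch classifies a move from the changes in the 2-year and 10-year yields; using the average of the two changes to decide bull versus bear is a simplifying assumption, and mixed-direction moves are labeled as twists.

```python
def classify_curve_move(d2y_bps: float, d10y_bps: float) -> str:
    """Label a curve move from changes (bps) in the 2y and 10y yields."""
    slope_change = d10y_bps - d2y_bps
    level_change = (d2y_bps + d10y_bps) / 2.0
    if slope_change == 0:
        return "parallel shift"
    direction = "steepening" if slope_change > 0 else "flattening"
    if level_change < 0:
        tone = "bull"    # yields falling overall, bond prices rising
    elif level_change > 0:
        tone = "bear"    # yields rising overall
    else:
        tone = "twist"   # legs moved in opposite directions by equal amounts
    return f"{tone} {direction}"


def is_inverted(short_yield_pct: float, long_yield_pct: float) -> bool:
    """True when the short-maturity yield exceeds the long-maturity yield."""
    return short_yield_pct > long_yield_pct


print(classify_curve_move(-30.0, -10.0))   # front end falls faster
print(classify_curve_move(30.0, 10.0))     # front end rises faster
```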

### 5. COMMODITY INDICATORS
- **Supply/demand fundamentals & inventories** — EIA weekly crude/product inventories (Wednesdays) and natural-gas storage (Thursdays) are major price catalysts; builds are bearish, draws bullish. OPEC production, rig counts (Baker Hughes), and USDA WASDE for ags.
- **Backwardation vs contango & roll yield** — Backwardation (spot > futures) = tight physical market, positive roll yield for long futures holders; contango = ample supply/storage, negative roll yield that erodes futures-based ETF returns. The curve shape is a real-time supply/demand signal.
- **Dr. Copper** — Copper as a global-growth barometer given its construction/electrical/EV ubiquity; copper backwardation signals industrial tightness. The **copper/gold ratio** tracks the 10-year yield and risk appetite (rising = pro-growth/higher yields). The **gold/oil ratio** is a recession/stress gauge (spikes when oil collapses or gold spikes in crises).
- **Broad indices** — Bloomberg Commodity Index (BCOM) and the older CRB/Refinitiv index for diversified commodity beta.
- **China demand signals** — China PMIs, credit (total social financing), and property data are leading for industrial-metals and bulk-commodity demand.

### 6. CRYPTO-SPECIFIC INDICATORS
- **On-chain valuation** — **Realized cap** (each coin valued at its last on-chain transaction price = aggregate cost basis) and **MVRV** (market cap / realized cap). MVRV above ~3.5 is the commonly cited BTC cycle-top zone, but the cited peaks differ by cycle (2017 ~4.0, 2021 ~3.2), so treat the level as approximate rather than a fixed trigger; <1 = capitulation/undervaluation. **MVRV Z-score** has historically pinned major tops to within ~2 weeks. **NVT** (network value / transaction volume) = a P/E-like valuation gauge; high NVT = price detached from on-chain activity.
- **SOPR** (Spent Output Profit Ratio) — >1 coins moving at a profit, <1 at a loss; the break of 1 is a sentiment/regime marker (STH-SOPR especially watched). **Active addresses, hash rate, exchange flows** — net inflows to exchanges = potential sell pressure; sustained outflows to cold storage = accumulation.
- **Sentiment/positioning** — **Funding rates** (perpetual futures; persistently high positive = crowded longs, squeeze risk), **open interest** (leverage build-up), **Fear & Greed index** (contrarian at extremes).
- **Market structure** — **Stablecoin supply/flows** (dry powder/liquidity), **BTC dominance** (rotation between BTC and alts).
- Reliability caveat: these metrics have only a few cycles of history, high noise, definitional drift across providers (Glassnode, CryptoQuant, Coin Metrics, Santiment), and are corruptible by exchange/custody movements and ETF-era changes in on-chain behavior. Best used as confluence for cycle positioning, not precise trade timing.
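The realized cap, MVRV, and SOPR definitions above are ratios over coin-level cost bases. The sketch below uses toy tuples in place of real UTXO data; providers apply their own filters and adjustments, so it illustrates the arithmetic rather than any vendor's methodology.

```python
def realized_cap(utxos: list[tuple[float, float]]) -> float:
    """Sum of coin amounts valued at the price when each coin last moved."""
    return sum(amount * last_move_price for amount, last_move_price in utxos)


def mvrv(spot_price: float, utxos: list[tuple[float, float]]) -> float:
    """Market cap divided by realized cap."""
    supply = sum(amount for amount, _ in utxos)
    return (spot_price * supply) / realized_cap(utxos)


def sopr(spent: list[tuple[float, float, float]]) -> float:
    """Value of spent outputs at spend time divided by their value at creation."""
    value_at_spend = sum(amt * p_spent for amt, _, p_spent in spent)
    value_at_creation = sum(amt * p_created for amt, p_created, _ in spent)
    return value_at_spend / value_at_creation


# Toy data: (amount, price at last move) and (amount, price created, price spent).
coins = [(10.0, 20000.0), (5.0, 40000.0), (5.0, 60000.0)]
spends = [(1.0, 30000.0, 60000.0), (2.0, 50000.0, 40000.0)]
ratio = mvrv(70000.0, coins)
spent_ratio = sopr(spends)
print(ratio, round(spent_ratio, 3))
```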

### 7. CROSS-ASSET & REGIME INDICATORS
- **Recession probability models** — NY Fed yield-curve probit (10y–3m); the **Sahm rule** (recession signal when the 3-month MA of U3 rises ≥0.50pp above the lowest 3-month average of the prior 12 months), published on FRED (SAHMREALTIME/SAHMCURRENT), with only two false positives since 1959; it signals onset (coincident-early, not a long lead). The Michaillat-Saez two-sided variant adds vacancies. Also tracked: the St. Louis Fed smoothed recession probabilities and the GDP-based recession indicator.
- **Risk-on/risk-off** — Composite reads from credit spreads, VIX/MOVE, the dollar, cyclicals-vs-defensives, copper/gold, and high-beta-vs-low-beta. FCIs synthesize these.
- **Intermarket relationships** — Stocks vs bonds (the stock-bond correlation flipped positive in the 2022 inflation regime, breaking the diversification that held during the disinflation era); dollar vs commodities (inverse); credit vs equity (credit usually leads equity at turns).
- **Business-cycle / regime frameworks** — Sector/asset rotation across the cycle (e.g., the four-phase "investment clock": early-cycle favors cyclicals/credit/small caps; mid-cycle equities/commodities; late-cycle energy/materials/value and inflation hedges; recession favors Treasuries, quality, defensives, cash). Regimes can be identified via growth+inflation quadrants (reflation/goldilocks/stagflation/deflation), Markov-switching models, or simple rules over LEI/PMI/credit.
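The Sahm rule above is easy to compute from a monthly unemployment series. This sketch compares the latest 3-month average with the lowest 3-month average over the previous 12 months; some implementations include the current month in that window, so treat the exact windowing as an assumption. The input series is hypothetical.

```python
def sahm_indicator(unemployment_rates: list[float]) -> float:
    """Latest 3-month average of U3 minus the lowest 3-month average of the prior 12 months."""
    u = unemployment_rates
    if len(u) < 15:
        raise ValueError("need at least 15 monthly observations")
    ma3 = [sum(u[i - 2:i + 1]) / 3.0 for i in range(2, len(u))]
    prior_low = min(ma3[-13:-1])   # previous 12 months, excluding the current one
    return ma3[-1] - prior_low


def sahm_triggered(unemployment_rates: list[float], threshold: float = 0.50) -> bool:
    return sahm_indicator(unemployment_rates) >= threshold


# Hypothetical U3 path (%): flat at 3.5, then a quick rise.
u3 = [3.5] * 14 + [3.6, 3.9, 4.3, 4.6]
gap = sahm_indicator(u3)
print(round(gap, 2), sahm_triggered(u3))
```

For backtests, the real-time series matters: revised unemployment data can shift the trigger date relative to what was observable at the time.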

### 8. DATA-MINING / QUANT ANGLE

**Data sources & APIs**
- **FRED / ALFRED** (St. Louis Fed) — macro/rates/credit series; ALFRED provides *vintage (point-in-time)* data essential to avoid look-ahead bias. Common clients are `fredapi` (Python) and `quantmod` (R).
- **Nasdaq Data Link** (formerly Quandl), **Bloomberg/Refinitiv** (institutional), **yfinance / tidyquant / quantmod** (prices), **EIA API** (energy), **CFTC** (COT), **Kenneth French Data Library** (factor returns), **OECD/IMF/World Bank/BIS** (global macro), **Glassnode / CryptoQuant / Coin Metrics / Santiment** (on-chain), **CBOE** (vol/put-call), **Conference Board** and **ISM** (LEI/PMI, often licensed).

**Pitfalls**
- **Look-ahead bias** — Using data not yet available (revised macro prints, restated financials, same-day signal-and-trade, full-sample normalization). Fix: vintage/point-in-time data, lag releases to their actual publication date, roll all normalizations.
- **Data revisions** — Macro series (GDP, payrolls) are heavily revised; backtest on the *first-release* vintage.
- **Survivorship bias** — Excluding delisted/bankrupt names inflates equity backtests; use survivorship-bias-free databases (e.g., CRSP).
- **Regime change / non-stationarity ("concept drift")** — Relationships shift (stock-bond correlation, factor leadership); favor walk-forward/out-of-sample testing and regime-aware models.
- **Overfitting & multiple testing / "factor zoo"** — Hundreds of published factors; mining many candidates guarantees false positives. Apply deflated Sharpe ratios, higher t-stat hurdles (Harvey-Liu-Zhu argue ~3.0+), cross-validation, and economic priors. SHAP can explain a model but does not cure data-snooping.
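The look-ahead fix above (use only what had been published by the decision date) amounts to an as-of lookup on publication dates. A minimal sketch with the standard library; the release records are hypothetical, and `value_known_as_of` is an illustrative helper rather than part of any data vendor's API.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical vintages of one quarterly figure: (publication date, estimate label, value).
releases = [
    (date(2024, 4, 25), "advance", 2.0),
    (date(2024, 5, 30), "second", 1.7),
    (date(2024, 6, 27), "third", 1.8),
]


def value_known_as_of(records, as_of: date):
    """Latest record published on or before `as_of`; None if nothing was out yet."""
    pub_dates = [r[0] for r in records]   # records must be sorted by publication date
    idx = bisect_right(pub_dates, as_of)
    return records[idx - 1] if idx else None


print(value_known_as_of(releases, date(2024, 5, 1)))    # only the advance estimate existed
print(value_known_as_of(releases, date(2024, 4, 1)))    # nothing published yet
```

A stricter variant would use `bisect_left` so that a figure published on the decision date is not tradable until the next day, which addresses the same-day signal-and-trade problem.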

**Feature transformations**
- **YoY / MoM changes** and log-differences to induce stationarity; **z-scores** (rolling) for cross-sectional and time-series standardization; **percentile/rank** transforms for robustness to outliers and non-linearity; **diffusion indices** (share of series rising) for breadth; **spreads/ratios** (curve slopes, copper/gold); **surprise** measures (actual − consensus, standardized); **moving-average and momentum** transforms; **volatility scaling** for risk parity and momentum-crash mitigation. Tree models (XGBoost) handle monotone non-linearities and interactions well but still require leakage-free, point-in-time feature construction and careful cross-validation (purged/embargoed k-fold for time series).
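Several of the transformations listed above can be written so that each value depends only on data available at that point. The sketch below uses the standard library on toy series; window lengths are illustrative, and production code would more likely use vectorized tools.

```python
import statistics


def yoy_change(series: list[float], periods: int = 12) -> list:
    """Fractional change versus `periods` observations earlier; None until enough history."""
    return [None] * periods + [series[i] / series[i - periods] - 1.0
                               for i in range(periods, len(series))]


def rolling_zscore(series: list[float], window: int) -> list:
    """z-score of each point against the trailing window ending at that point."""
    out = []
    for i in range(len(series)):
        if i + 1 < window:
            out.append(None)
            continue
        hist = series[i - window + 1:i + 1]
        sd = statistics.stdev(hist)
        out.append((series[i] - statistics.mean(hist)) / sd if sd > 0 else 0.0)
    return out


def trailing_percentile_rank(series: list[float], window: int) -> list:
    """Share of the trailing window at or below the current value (0 to 1)."""
    out = []
    for i in range(len(series)):
        if i + 1 < window:
            out.append(None)
            continue
        hist = series[i - window + 1:i + 1]
        out.append(sum(1 for x in hist if x <= series[i]) / window)
    return out


def diffusion_share(changes: list[float]) -> float:
    """Percent of series with a positive change."""
    return 100.0 * sum(1 for c in changes if c > 0) / len(changes)


print(rolling_zscore([1.0, 2.0, 3.0], 3), diffusion_share([0.1, -0.2, 0.3, 0.0]))
```

Because every window ends at the current observation, none of these functions uses full-sample statistics, which is the normalization form of look-ahead bias.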

## Recommendations

**Stage 1 — Build the regime dashboard first (highest signal-to-noise).** Stand up a weekly/daily regime monitor combining: (a) yield curve (10y–3m and 10y–2y), (b) HY OAS with the threshold framework (300/450/600/800bps), (c) Chicago Fed NFCI/ANFCI, (d) Sahm rule, (e) Conference Board LEI 6-month diffusion, (f) ISM new orders, (g) VIX/MOVE, (h) copper/gold and the dollar. This set has the best-documented lead/coincident properties and frames everything else. **Threshold to act:** when ≥3 of {curve inversion + re-steepening, HY OAS >600bps and widening, NFCI positive and rising, Sahm ≥0.5, LEI 6-month diffusion ≤50 with 6-month annualized growth below −4.3%} align, shift toward defensive positioning and net-short tilts; when they reverse, add risk.
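The "at least 3 of 5" rule above is a simple vote count. In this sketch the signal names, the way each flag is derived from raw inputs, and the "defensive"/"neutral" labels are illustrative assumptions; the curve and LEI flags are passed in as precomputed booleans.

```python
def build_signals(hy_oas_bps: float, hy_oas_prev_bps: float,
                  nfci: float, nfci_prev: float, sahm: float,
                  curve_inverted_then_resteepening: bool,
                  lei_condition_met: bool) -> dict[str, bool]:
    """Translate raw readings into the five yes/no regime signals."""
    return {
        "curve": curve_inverted_then_resteepening,
        "hy_oas": hy_oas_bps > 600 and hy_oas_bps > hy_oas_prev_bps,
        "nfci": nfci > 0 and nfci > nfci_prev,
        "sahm": sahm >= 0.5,
        "lei": lei_condition_met,
    }


def regime_call(signals: dict[str, bool], min_votes: int = 3) -> str:
    """Defensive when at least `min_votes` signals align, otherwise neutral."""
    votes = sum(signals.values())
    return "defensive" if votes >= min_votes else "neutral"


# Hypothetical readings.
signals = build_signals(hy_oas_bps=640.0, hy_oas_prev_bps=590.0,
                        nfci=0.15, nfci_prev=0.05, sahm=0.37,
                        curve_inverted_then_resteepening=True,
                        lei_condition_met=False)
print(signals, regime_call(signals))
```

Requiring several independent signals to agree is the multi-signal confirmation principle from the TL;DR applied directly.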

**Stage 2 — Layer cross-sectional equity factors for the long/short books.** Use value, momentum, quality/profitability, and low-vol as the core sleeves, each as standardized cross-sectional z-scores, with **explicit crash controls on momentum** (volatility scaling; cut momentum exposure when market volatility spikes and after deep drawdowns, the 2009/2020 reversal pattern). Long the high-composite-score names, short the low-score names. **Benchmark to change the mix:** if value's 12-month factor return is in a deep drawdown while momentum and quality are strong, do not abandon value but cap its weight; rebalance factor weights toward what the regime favors (early-cycle → value/size/cyclicals; late-cycle/recession → quality/low-vol/defensives).
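The Stage 2 mechanics (cross-sectional z-scores, a composite, long/short selection, and volatility scaling on momentum) can be sketched as follows. Equal factor weights, the population standard deviation, the leverage cap, and all tickers and numbers are illustrative assumptions. Inputs are oriented so that higher is better (for low-vol, pass the negative of volatility).

```python
import statistics


def cross_sectional_z(values: dict[str, float]) -> dict[str, float]:
    """Standardize one factor across tickers (population standard deviation)."""
    mu = statistics.mean(values.values())
    sd = statistics.pstdev(values.values())
    return {t: (v - mu) / sd if sd > 0 else 0.0 for t, v in values.items()}


def composite_score(factors: dict[str, dict[str, float]]) -> dict[str, float]:
    """Equal-weight average of per-factor z-scores for each ticker."""
    zs = [cross_sectional_z(v) for v in factors.values()]
    tickers = list(next(iter(factors.values())))
    return {t: sum(z[t] for z in zs) / len(zs) for t in tickers}


def long_short(scores: dict[str, float], n: int) -> tuple[list[str], list[str]]:
    """Top-n names to go long and bottom-n names to short."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n], ranked[-n:]


def vol_scaled_weight(target_vol: float, realized_vol: float, cap: float = 1.0) -> float:
    """Scale exposure down when realized volatility exceeds the target."""
    return min(target_vol / realized_vol, cap)


factors = {
    "value": {"AAA": 3.0, "BBB": 2.0, "CCC": 1.0},
    "momentum": {"AAA": 0.30, "BBB": 0.10, "CCC": -0.20},
}
scores = composite_score(factors)
longs, shorts = long_short(scores, 1)
print(longs, shorts, vol_scaled_weight(0.10, 0.20))
```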

**Stage 3 — Asset-class overlays.** Bonds: trade duration off real yields, breakevens, and curve shape; watch rating-migration/downgrade waves for credit. Commodities: prioritize curve shape (backwardation/contango, roll yield) and inventories over spot; use Dr. Copper/China data for industrial demand. Crypto: use MVRV/MVRV-Z, SOPR, exchange flows, and funding rates as *cycle-extreme confluence* and size positions small given noise. Options: use VIX term structure, VVIX, skew, and the variance risk premium to time vol selling/buying; remember implied > realized on average.

**Stage 4 — Engineer the data pipeline for honesty.** Source vintage data (ALFRED) for all macro features, lag every release to actual publication, use survivorship-bias-free equity data, transform to stationary features (YoY, rolling z-scores, ranks, diffusion), and validate with purged/embargoed walk-forward CV. Apply multiple-testing discipline (elevated t-stat hurdles, deflated Sharpe) before trusting any mined signal. **Threshold to deploy:** require out-of-sample/walk-forward robustness and economic rationale, not just in-sample fit.
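Walk-forward validation with a gap between training and test windows can be sketched with index ranges alone. This is a rolling-window splitter with an embargo, a simpler relative of purged k-fold and not that procedure itself; the window sizes are illustrative.

```python
def walk_forward_splits(n_obs: int, train_size: int, test_size: int,
                        embargo: int = 0) -> list[tuple[range, range]]:
    """Rolling train/test index ranges with `embargo` observations skipped in between."""
    splits = []
    start = 0
    while start + train_size + embargo + test_size <= n_obs:
        train = range(start, start + train_size)
        test_start = start + train_size + embargo
        test = range(test_start, test_start + test_size)
        splits.append((train, test))
        start += test_size   # slide forward by one test block
    return splits


splits = walk_forward_splits(n_obs=20, train_size=10, test_size=3, embargo=2)
for train, test in splits:
    print(f"train {train.start}-{train.stop - 1}, test {test.start}-{test.stop - 1}")
```

The embargo should be at least as long as the label horizon (for example, 12 observations for a 12-month-forward return) so that training labels do not overlap the test window.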

## Caveats
- **Predictive value is debated for many of these.** Technical signals (MA/MACD/RSI) show mixed, decaying, market-dependent evidence and are best as overlays, not standalone alpha. The Fed model is theoretically flawed (real vs nominal). CAPE works at 10-year horizons but not for timing and has a structural-drift problem. Factor premia suffer multi-year droughts (value 2010s) and crashes (momentum).
- **Recession tools can mislead.** The yield curve gave a long false signal in 2022–2024; the LEI's 2022–2023 decline did not produce an NBER recession on the historically implied schedule; the Sahm rule is a real-time heuristic its creator cautions can be distorted by unusual labor-supply shifts.
- **Regime dependence is pervasive.** The stock-bond correlation, the dollar's safe-haven behavior ("smile" weakening per some 2024 analyst views — an opinion, not established fact), and factor leadership all shift with the macro regime. Relationships estimated in one era can reverse.
- **Crypto on-chain metrics are immature** — few cycles, provider-dependent definitions, and vulnerability to ETF-era and exchange/custody distortions.
- **Numbers age.** Specific spread/valuation levels and thresholds cited here are historical reference points, not current readings; refresh against live data before acting. Several threshold "accuracy" statistics (e.g., yield-curve and OAS hit rates) come from secondary analyses and vary with the sample window and recession-dating choices.