Appendix B: Indicator Council Deliberation

Purpose. Which indicators from the strategy.md catalog are genuinely useful for predicting the best-performing investments in this project — and how should they be used? The same six domain experts convened in Phase 1 (Advisory Council) debated the question against the Phase-2 data snapshot. This appendix preserves that deliberation, including the disagreements that did not resolve and a blind spot that changed the data pipeline.

The question:

Which indicators (yield curve & HY OAS, equity factors, VIX & financial conditions, crypto on-chain) are genuinely useful for predicting the best-performing investments here — and is the answer a single signal or a combination?

The council: Quantitative Analyst (empiricist), Political Scientist (measurement/construct validity), IR Theorist (regime change), Macro / International-Business Economist (transmission channels), Market-Efficiency Contrarian (devil’s advocate), Risk Manager (survivability).

B.1 Opening Positions

B.1.1 🎭 The Quantitative Analyst

Position: Only the market-based regime signals (curve, HY OAS, NFCI, VIX), used as one walk-forward-validated, equal-weighted classifier, survive scrutiny — the textbook cross-sectional factors (value/momentum/quality) are unbuildable on a 7-ETF-plus-1-stock universe and should be dropped, not faked.

Reasoning: The binding constraint is data, not theory: there is no single-name cross-section to sort into deciles, so importing French-Library factor returns onto a 21-instrument tactical book is a look-ahead Frankenstein. What can be built is a regime model from series that lead survey data.

Favors: HY OAS, yield-curve slope and disinversion, NFCI/ANFCI, VIX (as a vol-scaling input), and a combined diffusion/z-score score. Distrusts: cross-sectional factors, single-name fundamentals, crypto MVRV, CAPE-for-timing, and lagging series (UNRATE/CPI) read as leads.

Key Risk: 2022–24 is fatal to single-signal or naively stacked models: the curve inverted, yet no NBER recession arrived. With ~3–4 genuine regimes in 25 years, the effective sample size is in the single digits; in-sample Sharpes are mirages.

Surprising Insight: The catalog’s “combine everything” maxim is itself an overfitting risk at this n — combine for robustness (small, fixed, pre-registered equal-weight vote), not fit. Constrained ignorance beats estimated precision when n is tiny.

B.1.2 🎭 The Political Scientist

Position: These indicators are valid measurements of market consensus, not predictors; the only tradable alpha in a conflict-aware mandate is the surprise component (realized − anticipated), which the catalog never constructs.

Reasoning: Event-study methodology is explicit — only the unanticipated part of an event moves prices. Credit spreads, the curve, VIX, NFCI are price-derived: high construct validity as a regime thermometer, near-zero edge as a predictor, because you cannot front-run the market’s own vote.

Favors: standardized surprise/divergence measures; HY OAS & curve as the anticipated baseline; NFCI as a low-noise breadth aggregate; commodity curve shape & gold/oil. Distrusts: MVRV (the construct itself changed mid-sample), fixed-threshold regimes, VIX level as a directional signal.

Key Risk: Endogeneity/reflexivity — the market indicators are partly made of the prices you’re forecasting, so a regime model stacked on them regresses returns on a noisy transform of returns.

Surprising Insight: A geopolitical edge comes from measuring the anticipation structure of conflict — markets under-react to slow-burn escalation and over-react to discrete shocks. Build a conflict surprise index and trade the gap, not the event.
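As a minimal sketch of the "realized − anticipated" construction, the snippet below standardizes a surprise by the dispersion of past surprises. The function name and all inputs are hypothetical: the snapshot contains no consensus or anticipation feed, so this only illustrates the arithmetic, not a series the project can build today.

```python
import statistics


def standardized_surprise(realized, anticipated, past_surprises):
    """Return (realized - anticipated) scaled by the sample std of past surprises.

    All three inputs are assumptions for illustration: 'anticipated' stands in
    for whatever consensus proxy one might eventually obtain.
    """
    scale = statistics.stdev(past_surprises)
    if scale == 0:
        raise ValueError("past surprises have zero dispersion; cannot standardize")
    return (realized - anticipated) / scale


# Illustrative numbers only: an escalation score of 3 against an expected 1.
history = [-1.0, 1.0, -1.0, 1.0]
z = standardized_surprise(realized=3.0, anticipated=1.0, past_surprises=history)
```

A fully anticipated event scores zero under this construction, which is the Political Scientist's point: only the gap is a candidate signal.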

B.1.3 🎭 The IR Theorist

Position: Trust market-priced stress gauges as a combined regime model — never a single signal — and treat factors and especially MVRV as regime-contingent relationships the geopolitical era can break without warning.

Reasoning: Relationships hold until a structural shift rewrites them. The curve’s 2022–24 false signal is exactly the unfalsifiable-after-the-fact pattern to distrust; what survived was HY OAS and NFCI, which reprice forward-looking capital continuously.

Favors: HY OAS, NFCI/ANFCI, VIX/MOVE as a filter, the curve’s re-steepening/disinversion (the dynamic, not the level), copper/gold. Distrusts: standalone curve inversion, MVRV, fixed factor leadership, LEI, Sahm-as-forward.

Key Risk: Non-stationarity of meaning — BTC flipped from “risk-on tech” (2021) to “capital-flight/sanctions” asset (2022); the stock-bond correlation flipped positive in 2022. The same shock hits the same ticker through opposite channels by regime, invisible to in-sample CV.

Surprising Insight: For a conflict book, the most predictive indicator has the shortest memory, not the longest lead — you cannot lead a surprise, only price it faster. The catalog’s prestige hierarchy (7-month-lead tools) is inverted here.

B.1.4 🎭 The Macro / International-Business Economist

Position: Trust the market-priced regime signals because they are the transmission channel repricing in real time; use them to set the risk budget and tilt the conflict-sensitive sleeves — never as standalone alpha — and retire what the project cannot obtain.

Reasoning: HY OAS is the cleanest read on whether a geopolitical shock (oil spike, sanctions, freight blowout) is becoming a credit-cycle event; the curve carries a separate, lower-frequency policy signal that can throw multi-year false positives.

Favors: HY OAS (thresholds at 300/450/600/800 bps), NFCI/ANFCI, breakevens (T10YIE) + real yields (DFII10) to sort shocks into reflation vs stagflation, copper/gold + dollar, commodity spot-vs-proxy pairs. Distrusts: MVRV, LEI (unobtainable), factor premia, standalone curve, VIX as more than a confirmer.

Key Risk: Using lagging/unobtainable indicators as forward signals, and non-stationarity in the transmission itself.

Surprising Insight: The most useful “indicator” isn’t on the list — it’s the shape of the commodity transmission (spot/curve vs equity proxy, conditioned on breakevens vs real yields). The same oil spike is a buy in a reflation quadrant and a sell in stagflation. You’re not predicting assets; you’re classifying the regime that decides which channel a shock flows through.
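One way the quadrant idea could be sketched is a sign test on the change in breakevens (T10YIE) and real yields (DFII10) over a lookback window. The mapping below is an assumption rather than the council's specification: rising breakevens with rising real yields is labelled reflation, rising breakevens with flat or falling real yields is labelled stagflation, and the two remaining quadrants get neutral placeholder names because the text does not name them.

```python
def change_over(series, lookback):
    """Change between the latest observation and the one `lookback` steps earlier."""
    return series[-1] - series[-1 - lookback]


def classify_quadrant(breakeven_change, real_yield_change):
    """Map the two changes to a quadrant label (labels are illustrative assumptions)."""
    if breakeven_change > 0 and real_yield_change > 0:
        return "reflation"
    if breakeven_change > 0:
        return "stagflation"
    if real_yield_change > 0:
        return "breakevens_down_real_up"
    return "breakevens_down_real_down"


# Made-up levels in percent, not values from the snapshot.
t10yie = [2.20, 2.25, 2.30, 2.40]
dfii10 = [1.50, 1.45, 1.40, 1.30]
quadrant = classify_quadrant(change_over(t10yie, 3), change_over(dfii10, 3))
```

The output is a routing label for the conflict-sensitive sleeve, not a return forecast, which matches the Economist's framing of classifying the regime rather than predicting assets.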

B.1.5 🎭 The Market-Efficiency Contrarian

Position: Almost none of these give a retail investor tradable alpha net of costs and decay; their only honest use is a slow, regime-aware risk-budget dial — and the project must prove it against dumb buy-and-hold before believing any of it.

Reasoning: Every signal is public, free, and watched by everyone with a Bloomberg terminal, so its content is impounded before a retail investor on publication-lagged FRED data can act. The author’s own caveats are the EMH critique: the curve’s false 2022–24 signal, momentum crashes, value’s lost decade.

Favors: HY OAS (to scale gross exposure, not time entries), NFCI/ANFCI, VIX as contrarian context, and CAPE — precisely because its 10-year horizon is one institutions won’t arbitrage away. Distrusts: MVRV, factors (post-publication decay), curve-timing, technicals, Sahm/LEI.

Key Risk: Combining many weak, correlated, public signals doesn’t manufacture an edge — it manufactures false confidence, multiplying researcher degrees of freedom and justifying oversized bets right before the regime shifts.

Surprising Insight: The catalog’s most valuable line isn’t an indicator — it’s the admission that the only durable retail edge is patience and time horizon. Keep the signals that pay off only at horizons institutions can’t hold; everything daily/weekly is the game retail can’t win.

B.1.6 🎭 The Risk Manager

Position: Indicators earn their place only if they survive position-sizing, costs, taxes, and drift; the only defensible use is a small combined regime overlay that governs gross exposure and drawdown limits — never single-signal, never the factor/MVRV machinery the data can’t support.

Reasoning: The curve’s 2022–24 false signal is the whole case against single-signal use — a curve-only short would have bled carry and whipsaw for two-plus years, a drawdown a personal account may not survive even if eventually “right.”

Favors: HY OAS (de-risk early and cheaply), the yield curve as a low-turnover exposure throttle, VIX for volatility-scaling, NFCI/ANFCI as a pre-combined gauge. Distrusts: MVRV, CAPE-timing, momentum without vol-scaling, lagging series as signals, any factor long/short.

Key Risk: The gap between a signal and net P&L — publication lag, transaction costs, bid/ask on thin ETFs (BDRY), and short-term-capital-gains tax on every regime flip can turn a positive-Sharpe paper signal into negative after-tax alpha.

Surprising Insight: The regime model’s real output is a position-size multiplier, not a return forecast — volatility-scaling and exposure-throttling improve Sharpe through risk reduction in bad regimes, so you can profit from the curve and VIX without ever predicting a recession date.
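The position-size-multiplier idea can be illustrated with a plain volatility-targeting rule: scale exposure by target volatility over realized volatility and cap it. The 10% target, the cap of 1.0 (no leverage), and the 252-day annualization are assumptions chosen for the sketch, not parameters from the project.

```python
import math
import statistics


def realized_vol(daily_returns, periods_per_year=252):
    """Annualized sample standard deviation of daily returns."""
    return statistics.stdev(daily_returns) * math.sqrt(periods_per_year)


def exposure_multiplier(daily_returns, target_vol=0.10, max_exposure=1.0):
    """Exposure scale = target / realized vol, capped at max_exposure (assumed values)."""
    vol = realized_vol(daily_returns)
    if vol == 0:
        return max_exposure
    return min(max_exposure, target_vol / vol)


# Synthetic return paths for illustration only.
calm = [0.001, -0.001] * 10      # roughly 1.6% annualized vol
stressed = [0.03, -0.03] * 10    # roughly 49% annualized vol
calm_scale = exposure_multiplier(calm)
stressed_scale = exposure_multiplier(stressed)
```

Nothing here predicts direction: the multiplier shrinks only because measured risk rose, which is the sense in which the overlay can help without forecasting a recession date.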

B.2 The Rebuttal Round

Quant → Risk Manager: “Your premise is wrong — HY_OAS, ANFCI, breakevens, real yields, INDPRO, copper are all on disk now; the real defect is that HY_OAS starts 2023-06-05 with zero recessions, which kills the >600 bps threshold you wanted to lean on.”
Risk Manager → Quant: “A deflated-Sharpe walk-forward on a credit series with 2.5 years and not one recession is meaningless — you can’t embargo your way out of having no out-of-sample regime.”
Political Scientist → Macro/IR: “You worship HY OAS and NFCI as forward-looking, but they’re consensus thermometers — regressing returns on them is regressing returns on a transform of returns; without a real surprise series (absent from the snapshot) your regime model is reflexive.”
IR Theorist → Quant: “Your backtest certifies a pattern held in 2000–2023, but BTC’s meaning inverted and the stock-bond correlation flipped positive in 2022 — invisible to any in-sample CV.”
Contrarian → everyone favoring a diffusion model: “Stacking ~6 correlated public series over 3–4 genuine regimes manufactures false confidence, not an edge — and 2008 isn’t even in the HY_OAS file.”
Macro → Political Scientist: “‘Measure the surprise’ is right in theory but unbuildable here — no consensus feed exists in the snapshot. The implementable version is the reflation-vs-stagflation quadrant from breakevens (T10YIE) vs real yields (DFII10), which is on disk.”

B.3 Synthesis

Points of convergence

All six reject single-signal trading and any standalone yield-curve trigger (the 2022–24 false inversion) — converging on strategy.md’s core claim that value comes from combining signals.
Five of six hold that market-priced regime gauges (credit spreads, NFCI/ANFCI, curve, VIX) are the most defensible inputs because they reprice forward-looking capital faster than survey data.
Five of six distrust crypto on-chain MVRV for this project (few cycles, provider drift, ETF distortion — and it isn’t in the free snapshot).
Four of six call cross-sectional equity factors a category error on this universe (no survivorship-free single-name cross-section).
Four of six name non-stationarity / regime drift the dominant risk; a clean backtest is necessary but nowhere near sufficient.
Strong convergence that the honest output is a slow-moving risk-budget / gross-exposure dial and a volatility-scaling input — not a return forecast or buy/sell trigger.

Core Tension: Backtestable-stationarity vs. theory-grounded regime-change. The Quant holds that only what survives purged walk-forward CV is real; the IR Theorist, Political Scientist, and Macro economist counter that the dominant risk is exactly what CV is blind to — a signal’s meaning silently inverting. It won’t resolve because the data can’t adjudicate it: with ~3–4 regimes in 25 years there is no out-of-sample large enough to settle theory empirically. The data forces humility on the quant and unfalsifiability on the theorists.

The Blind Spot: No expert had inspected the snapshot before pronouncing on it. The ICE BofA credit spreads (HY_OAS/IG_OAS) that five of them ranked highest exist only from ~2023 on the free tier — zero past recessions in-sample — so the entire threshold framework cannot be calibrated on data the project owns. The most-trusted signal is also the shortest and recession-free, which inverts the priority ranking.

Recommended Path (three layers, respecting the tension):

1. Fix the data first (baseline-first): get a credit spread series that spans real recessions; pull NFCI and Sahm.
2. Build only the regime layer the data supports, as a risk dial: a small, fixed, equal-weighted, pre-registered diffusion vote over economically motivated thresholds {curve disinversion, credit widening, ANFCI positive-and-rising, VIX regime-shift}, with no fitted weights, whose only outputs are a gross-exposure multiplier and a volatility-scaling input. Add the reflation-vs-stagflation quadrant (T10YIE vs DFII10) to route the conflict-sensitive sleeve.
3. Formally defer what the data can’t support (cross-sectional factors and crypto on-chain) until a survivorship-free cross-section and an on-chain feed are funded.

Validate everything against dumb buy-and-hold SPY and 60/40, with ALFRED vintages, transaction costs, BDRY illiquidity, and after-tax turnover.
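The equal-weighted diffusion vote might look something like the sketch below. The four votes follow the set named in the recommended path, but every numeric threshold (a 100 bps widening, a VIX level of 25, an exposure floor of 0.25) and the linear mapping from score to multiplier are placeholder assumptions; the council's point is that such values should be fixed and pre-registered, not fitted.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    curve_slope_prev: float  # 10y-2y slope at the start of the lookback, pct points
    curve_slope_now: float
    hy_oas_prev: float       # high-yield OAS in bps
    hy_oas_now: float
    anfci_prev: float
    anfci_now: float
    vix_now: float


def stress_votes(s, oas_widening_bps=100.0, vix_level=25.0):
    """One boolean per indicator; thresholds are illustrative assumptions."""
    return {
        "curve_disinversion": s.curve_slope_prev < 0 and s.curve_slope_now > s.curve_slope_prev,
        "credit_widening": s.hy_oas_now - s.hy_oas_prev >= oas_widening_bps,
        "anfci_positive_and_rising": s.anfci_now > 0 and s.anfci_now > s.anfci_prev,
        "vix_regime_shift": s.vix_now >= vix_level,
    }


def gross_exposure_multiplier(votes, floor=0.25):
    """Equal-weight the votes; scale exposure linearly from 1.0 down to `floor`."""
    score = sum(votes.values()) / len(votes)
    return 1.0 - score * (1.0 - floor)


# Made-up readings, not snapshot data.
calm = Snapshot(0.5, 0.6, 350.0, 360.0, -0.4, -0.5, 15.0)
stressed = Snapshot(-0.5, -0.1, 400.0, 550.0, 0.1, 0.3, 32.0)
calm_mult = gross_exposure_multiplier(stress_votes(calm))
stressed_mult = gross_exposure_multiplier(stress_votes(stressed))
```

Because the weights are fixed at one vote each, the only researcher degrees of freedom are the thresholds themselves, which is where pre-registration would apply.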

Confidence Level: Medium. Strong convergence on method and discipline; genuine, data-unresolvable divergence on whether any edge survives regime change.

One Question to Sit With: If your single highest-conviction signal (HY OAS credit spreads) only exists in your project back to mid-2023 and has therefore never once witnessed a recession, are you measuring credit-cycle risk, or your own confidence in a threshold imported from someone else’s recessions? And what in this 25-year dataset, with its ~3–4 regimes, could ever tell you you’re wrong before real capital does?

B.5 Follow-up deliberation — ranking the highest-value indicators (and the target question)

Purpose. With the data layer fixed, the project asked a sharper question: which indicators are the highest value for a model meant to predict “which assets will rise the most and should be bought”? The same six experts reconvened. The debate immediately split on a premise hidden in the wording — “rise the most” (raw return) vs. the project’s Sharpe objective (risk-adjusted return) — and on whether anything in this universe ranks winners rather than merely gating risk.

B.5.1 🎭 The Quantitative Analyst

Position: The regime gauges only ever told you how much to be invested; the one signal that ranks which asset rises — and that survives even an ~11-name universe — is cross-sectional & time-series momentum (12-1m, vol-scaled). It is the highest-value return feature; everything else is the exposure dial. Key Risk: Momentum crashes (2009, March 2020) — unscaled, it maximizes both forward return and ruin. Surprising Insight: You don’t need a wide cross-section for momentum — ranking the 11 tradable proxies against each other recovers most of the documented premium without a French-Library Frankenstein.
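A rough sketch of 12-1 momentum with volatility scaling and a cross-sectional rank follows. It assumes month-end prices with at least 13 observations per asset, measures the return from 12 months ago to one month ago (skipping the latest month), and divides by the sample standard deviation of monthly returns over that same window. The window convention and the tickers are assumptions for illustration.

```python
import statistics


def momentum_12_1(prices):
    """Return from 12 months ago to 1 month ago, skipping the most recent month."""
    if len(prices) < 13:
        raise ValueError("need at least 13 month-end prices")
    return prices[-2] / prices[-13] - 1.0


def vol_scaled_momentum(prices):
    """12-1 momentum divided by the std of monthly returns in the formation window."""
    if len(prices) < 13:
        raise ValueError("need at least 13 month-end prices")
    window = prices[-13:-1]  # 12 prices: 12 months ago through 1 month ago
    rets = [b / a - 1.0 for a, b in zip(window, window[1:])]
    vol = statistics.stdev(rets)
    return momentum_12_1(prices) / vol if vol > 0 else 0.0


def rank_assets(price_history):
    """Tickers sorted from highest to lowest vol-scaled momentum."""
    scores = {ticker: vol_scaled_momentum(p) for ticker, p in price_history.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

With synthetic paths, a steady riser can outrank a choppier asset that has the higher raw 12-1 return, which is the behavior the vol-scaling is meant to produce.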

B.5.2 🎭 The Political Scientist

Position: “Rise the most” is a raw-return target with poor construct validity here; what you can measure is risk-adjusted, regime-conditioned forward return. Ranking by nominal gain just re-labels the noisiest, highest-vol asset (crypto) as “best.” Key Risk: Optimizing the wrong label contaminates every downstream feature-importance claim. Surprising Insight: The honest “highest value” indicator set is the one that predicts the denominator (risk) at least as well as the numerator (return) — because that is the target Chapter 1 actually committed to.

B.5.3 🎭 The IR Theorist

Position: Momentum is real but regime-contingent — the regime layer decides whose momentum to trust (energy in a supply-shock reflation; gold/dollar in stagflation). Rank within the regime, not across it. Key Risk: Momentum’s meaning inverts at turning points exactly when ranking matters most. Surprising Insight: The new free leading-indicator block earns its place here not as alpha but as an early-warning on when momentum is about to stop working.

B.5.4 🎭 The Macro / International-Business Economist

Position: Highest value = a two-layer build: the reflation/stagflation quadrant (T10YIE vs DFII10) and copper/gold route capital to the right channel; momentum ranks inside it; the new claims/permits/orders/sentiment series finally let a diffusion-LEI proxy substitute for the licensed Conference-Board index. Key Risk: Treating the diffusion-LEI as a precise lead rather than a slow context gauge. Surprising Insight: 7 of the 10 LEI components are free — the licensed index was never the moat; the diffusion construction is.
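A diffusion-style proxy of the kind described can be sketched as the share of component series that improved over a lookback window. The sign convention is an assumption: initial claims (ICSA) are treated as higher-is-worse and inverted, every other component as higher-is-better. The six-month default lookback is likewise a placeholder.

```python
HIGHER_IS_WORSE = {"ICSA"}  # assumed: rising jobless claims count as deterioration


def lei_diffusion(series_by_id, lookback=6):
    """Fraction of components that improved over `lookback` observations (0.0 to 1.0)."""
    improving = 0
    for series_id, values in series_by_id.items():
        change = values[-1] - values[-1 - lookback]
        if series_id in HIGHER_IS_WORSE:
            change = -change
        if change > 0:
            improving += 1
    return improving / len(series_by_id)


# Made-up readings for illustration, not snapshot values.
sample = {
    "ICSA": [220.0, 250.0],      # claims rising: deteriorating
    "PERMIT": [1400.0, 1450.0],  # improving
    "NEWORDER": [70.0, 69.0],    # deteriorating
    "UMCSENT": [65.0, 70.0],     # improving
}
breadth = lei_diffusion(sample, lookback=1)
```

A reading near 0.5 says roughly half the components are improving; consistent with the stated Key Risk, it is a slow breadth gauge and not a precise lead.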

B.5.5 🎭 The Market-Efficiency Contrarian

Position: “Predict the largest increases” is the return-chasing the EMH punishes; the only indicators worth ranking highly are the ones that keep you out of trouble. Momentum is the lone admissible return signal precisely because it is a documented anomaly, not a forecast. Key Risk: A raw-return target plus a momentum feature is a machine for buying tops. Surprising Insight: The “highest-value indicator” for a retail account is the gross-exposure dial, not any predictor — losing less in 2008/2022 dominates picking the 2021 winner.

B.5.6 🎭 The Risk Manager

Position: Rank indicators by their contribution to risk-adjusted P&L: vol-scaled momentum, the equal-weighted regime score, and VIX-driven volatility-scaling top the list; raw price-momentum and any raw-return label are disqualified. Key Risk: The target itself — a raw-return label makes position-sizing incoherent. Surprising Insight: Defining the label as forward return ÷ trailing vol quietly solves the “largest increases vs. Sharpe” fight: the model still ranks winners, but in units you can size.
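The proposed label, forward return ÷ trailing volatility, can be sketched as below. The three-period horizon and twelve-period volatility window are assumptions, since the text leaves the horizon open. The trailing window ends at the label date `t`, so the denominator uses no future data.

```python
import statistics


def risk_adjusted_label(prices, t, horizon=3, vol_window=12):
    """Forward return over `horizon` periods divided by trailing return volatility.

    Volatility is computed from the `vol_window` returns ending at index t,
    so only information available at t enters the denominator.
    """
    if t - vol_window < 0 or t + horizon >= len(prices):
        raise ValueError("not enough history or future data around index t")
    forward_return = prices[t + horizon] / prices[t] - 1.0
    trailing = prices[t - vol_window : t + 1]
    rets = [b / a - 1.0 for a, b in zip(trailing, trailing[1:])]
    vol = statistics.stdev(rets)
    if vol == 0:
        raise ValueError("zero trailing volatility; label undefined")
    return forward_return / vol
```

Two assets with the same forward return then receive different labels if one was five times as volatile beforehand, so the model still ranks winners but in units that map onto position size.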

B.5.7 Synthesis (follow-up)

Convergence: (1) Highest value is a two-layer structure — a regime/exposure gate (credit, curve, NFCI, Sahm, VIX → equal-weighted score) plus a vol-scaled momentum ranking layer over the tradable proxies. (2) The target must be risk-adjusted forward return, not raw “largest increases.” (3) The free leading-indicator block (claims, permits, starts, core-capex orders, sentiment, hours) is worth adding — to build a diffusion-LEI proxy and as early-warning context.

Core Tension: Raw “largest increases” vs. risk-adjusted return. Momentum + high-beta crypto maximize predicted increases but breach the drawdown ceiling; the regime complex maximizes risk-adjusted return but rarely names the single biggest winner. Resolved — not erased — by defining the label as forward return ÷ trailing volatility, which lets momentum rank winners in units you can size.

The Blind Spot: The target/label definition itself. Everyone debated features; nobody had pinned down horizon, ranked-vs-threshold, or excess-over-SPY. The indicators you elevate are downstream of that choice, which is named explicitly in Data Understanding §5.1 and built in Data Preparation.

Confidence Level: Medium-High on the ranking and structure; Medium on the target framing, which only the project owner can finally settle.

One Question to Sit With: If your best “what-to-buy” signal (momentum) is also your biggest drawdown engine, are you building a model that predicts increases — or one that predicts the increases you can actually survive holding?

B.5.8 Editor’s note — what this deliberation changed

This follow-up drove three concrete changes: (1) the universe gained seven free FRED leading indicators (ICSA, PAYEMS, AWHAEMAN, PERMIT, HOUST, NEWORDER, UMCSENT) so a diffusion-LEI proxy is buildable; (2) Data Understanding §5.1 now ranks the highest-value indicators and reframes “largest increases” as risk-adjusted forward return; and (3) the Data Preparation phase constructs the agreed features — volatility-scaled momentum + cross-sectional rank, an equal-weighted regime_score, the reflation/stagflation quadrant, copper_gold_z, and the lei_diffusion proxy — with the label defined as forward return ÷ trailing volatility.