2 Business Understanding

CRISP-DM Phase 1. Understand the project objectives and requirements from a business perspective, then convert this knowledge into a data-mining problem definition and a preliminary plan. See The CRISP-DM Process for the methodology overview.

This chapter documents the business logic of PortfolioLens. Because the “business” here is a personal investment portfolio, the “business logic” is personal investment logic — the explicit, written reasoning that governs how capital is allocated, what edge we believe we have, what we will not do, and how we will know whether any of it is actually working.

The reasoning in this chapter was stress-tested by a deliberate, adversarial debate among six domain perspectives (quantitative finance, political science, international-relations theory, international-business macroeconomics, market-efficiency skepticism, and risk management). That deliberation is preserved in full in the Advisory Council Deliberation appendix; this chapter is its distilled, decision-ready output.

2.1 Context & Background

The problem. Discretionary, narrative-driven personal investing is hard to evaluate and invites self-deception: without written rules and honest testing, a lucky call is indistinguishable from a real edge. PortfolioLens reframes personal investing as an empirical, hypothesis-driven research project: identify patterns in objective data that precede market moves, translate them into rules, and validate those rules honestly before risking capital.

The thesis. Major geopolitical events — interstate conflict, invasions, sanctions, escalation — produce economic consequences that propagate through identifiable channels (energy, commodities, currencies, defense procurement, shipping, capital flows). If those consequences can be anticipated even slightly better than consensus, they create risk-adjusted opportunity. The chain we are betting on:

Geopolitical shock  →  economic transmission channel  →  asset re-pricing  →  opportunity
   (the event)          (energy/FX/commodities/defense)     (specific tickers)   (entry/exit)

A crucial refinement, drawn directly from the council: we do not trade the headline. The market reacts to the revision of expectations (the surprise), and the durable, retail-tradable signal lives in the transmission mechanism and its second-order effects, which can play out over weeks and months — not in the first-second reaction, which institutions own.

Scope. Personal portfolio, personal use, personal capital. The models built here are decision-support tools for one investor, not a product, fund, or service for others.

Investor profile. (Recommended defaults below — confirm and adjust to your situation.)

Attribute	Working assumption (confirm)
Account type	Personal taxable + any tax-advantaged accounts
Capital base	Risk capital the investor can afford to lose without lifestyle impact
Execution	Manual or semi-automated; not latency-competitive with institutions
Skill/tooling	Python-centric data science; this Quarto book as the research log
Time available	Part-time research; strategy must survive infrequent attention

2.2 Business Objectives

In plain language, the personal-investing goals — in priority order:

1. Don’t go broke. Preserve capital; avoid ruin and catastrophic drawdowns. This outranks return maximization.
2. Beat a passive benchmark on a risk-adjusted basis. Earn more per unit of risk than a simple buy-and-hold of the same asset mix would.
3. Capture geopolitical opportunity if it is real. Determine whether geopolitical signals add genuine, repeatable value over a conventional model, and exploit them only to the extent the evidence supports.
4. Build durable, reusable infrastructure. A documented, repeatable research and decision process that improves over time, rather than a one-off lucky call.

2.3 Business Success Criteria

Success is defined primarily in risk-adjusted terms, not raw return. Targets below are recommended starting points to be confirmed by the investor.

Criterion	Target (confirm)	Why
Primary — Sharpe ratio	Out-of-sample Sharpe ≥ 1.0, and meaningfully above the benchmark’s	Rewards return per unit of risk; resists “lucky high-volatility” outcomes
Max drawdown ceiling	Peak-to-trough ≤ 20–25%	Survivability and behavioral tolerance; a strategy you can actually hold
Benchmark to beat	Risk-adjusted return of a passive blend (e.g., a stock index + a fixed BTC/ETH sleeve matching the strategy’s average exposure)	Honest comparison: are we adding value over doing nothing clever?
Geopolitical lift	Geopolitical features improve OOS Sharpe by a statistically and economically meaningful margin over the baseline	The project’s central, falsifiable question
Capital-preservation rule	No single thesis risks more than a pre-set fraction of capital	Prevents one confident, wrong bet from undoing a year of gains
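
The Sharpe and drawdown criteria in the table above can be made concrete with a few lines of Python. This is a minimal sketch, not the project's evaluation code: the function names, the 252-period annualization, and the reading of "meaningfully above the benchmark" as a plain strict comparison are all assumptions made for illustration.

```python
import math
from statistics import mean, stdev


def annualized_sharpe(returns, risk_free_per_period=0.0, periods_per_year=252):
    """Annualized Sharpe ratio from per-period simple returns.

    periods_per_year=252 assumes daily data (an assumption, not a project decision).
    """
    excess = [r - risk_free_per_period for r in returns]
    vol = stdev(excess)  # sample standard deviation
    if vol == 0:
        return float("nan")  # undefined when returns never vary
    return mean(excess) / vol * math.sqrt(periods_per_year)


def max_drawdown(returns):
    """Largest peak-to-trough decline of the compounded equity curve (positive fraction)."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst


def meets_criteria(strategy, benchmark, sharpe_floor=1.0, dd_ceiling=0.25):
    """Check the primary criteria: Sharpe floor, Sharpe above benchmark, drawdown ceiling."""
    s = annualized_sharpe(strategy)
    return (
        s >= sharpe_floor
        and s > annualized_sharpe(benchmark)
        and max_drawdown(strategy) <= dd_ceiling
    )
```

Both return series should be out-of-sample and net of costs for the check to mean anything; the defaults mirror the targets in the table and should change if the investor confirms different numbers.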

A “no” is a valid success. If rigorous testing shows geopolitical signals don’t add durable value, discovering that — before betting real money — is a successful Phase-5 outcome, not a failure.

2.4 The Investment Thesis / Business Logic

This is the heart of the chapter: why we believe an edge could exist and where to look. Conflict does not move “the market” uniformly — it moves specific instruments through specific channels. The business logic is to map channels to tradable assets, then test whether anticipating the channel beats consensus.

Transmission channel	What moves	Candidate instruments	Direction logic
Energy	Oil & gas supply/route risk	Energy producers, oil/gas futures & ETFs	Supply threat → price up; producers benefit
Agricultural commodities	Grain/fertilizer exporters disrupted	Wheat/corn, ag & fertilizer equities	Export disruption → price up
Defense procurement	Sustained re-armament spending	Defense primes & ETFs	Conflict → multi-year capex re-rating
Safe havens & FX	Flight to safety	USD, gold, Treasuries	Risk-off → safe assets bid
Shipping & insurance	Chokepoint/route risk (Hormuz, Suez, Black Sea, Taiwan Strait)	Tanker/shipping, freight rates	Route risk → rates & insurance up
Crypto (regime-dependent)	Flips by regime	BTC, ETH	Risk-on “tech beta” or capital-flight/sanctions hedge — context decides

Three principles the council insisted on:

Trade the surprise, not the event. The tradable moment is usually the escalation phase — when probabilities shift but consensus hasn’t repriced — or the second-order consequence, not the moment of invasion.
Trade the mechanism, not the narrative. “There’s a war” is not a trade. “Black Sea grain exports are blocked, so wheat and fertilizer names re-rate” is a trade.
Respect regime-dependence. The same event can hit crypto through opposite channels depending on context. A static, one-size model will be wrong; regime awareness is part of the design.
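
One way to keep the channel map explicit and testable is to store it as data rather than prose. The sketch below encodes the table above under stated assumptions: the `Channel` class, the key names, and the two crypto regime labels are hypothetical names chosen for illustration, and no specific tickers are implied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Channel:
    driver: str          # what moves
    instruments: tuple   # candidate instrument categories (not tickers)
    direction: str       # "up", "safe_haven_bid", or "regime_dependent"


# Entries mirror the transmission-channel table; keys are illustrative.
CHANNELS = {
    "energy": Channel("Oil & gas supply/route risk",
                      ("energy producers", "oil/gas futures & ETFs"), "up"),
    "agriculture": Channel("Grain/fertilizer exporters disrupted",
                           ("wheat/corn", "ag & fertilizer equities"), "up"),
    "defense": Channel("Sustained re-armament spending",
                       ("defense primes & ETFs",), "up"),
    "safe_havens": Channel("Flight to safety",
                           ("USD", "gold", "Treasuries"), "safe_haven_bid"),
    "shipping": Channel("Chokepoint/route risk",
                        ("tanker/shipping", "freight rates"), "up"),
    "crypto": Channel("Flips by regime", ("BTC", "ETH"), "regime_dependent"),
}

# Hypothetical regime labels for the crypto channel.
CRYPTO_REGIME_BEHAVIOR = {
    "risk_on_tech_beta": "trades_like_risk_asset",
    "capital_flight_hedge": "trades_like_hedge",
}


def expected_direction(channel, regime=None):
    """Return the direction logic for a channel; crypto requires an explicit regime."""
    ch = CHANNELS[channel]
    if ch.direction != "regime_dependent":
        return ch.direction
    if regime is None:
        raise ValueError("regime-dependent channel needs an explicit regime")
    return CRYPTO_REGIME_BEHAVIOR[regime]
```

Forcing the caller to pass a regime for crypto is a deliberate design choice: it makes the third principle impossible to skip silently.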

2.5 Strategic Approach — Baseline-First

Per the confirmed project decision, the geopolitical edge is treated as a hypothesis tested against a baseline, not an assumption:

Stage 1 — Baseline. Build a conventional, multi-horizon model on the conflict-sensitive sectors + major crypto universe using price/volume, fundamentals, and standard macro features. Establish its honest out-of-sample, cost-aware performance. This is the bar to beat.
Stage 2 — Geopolitical layer. Add quantified geopolitical/event features (e.g., GDELT / ACLED / ICEWS escalation signals) and measure the marginal lift over the baseline. Keep the features only if they add value that survives validation.
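
The Stage 1 versus Stage 2 comparison can be sketched as a paired bootstrap on the difference in Sharpe ratios. This is an illustrative approach, not a method fixed by this chapter: the function names are hypothetical, the Sharpe here is per-period (not annualized), and resampling individual periods independently ignores serial correlation, so a block bootstrap may be the better choice for real return series.

```python
import random
from statistics import mean, stdev


def sharpe(returns):
    """Per-period Sharpe ratio; returns 0.0 when the series has no variation."""
    vol = stdev(returns)
    return mean(returns) / vol if vol > 0 else 0.0


def sharpe_lift_bootstrap(baseline, with_geo, n_boot=2000, seed=0):
    """Observed Sharpe lift and a 95% percentile interval from a paired bootstrap.

    Both inputs are out-of-sample, net-of-cost returns over the same periods.
    """
    if len(baseline) != len(with_geo):
        raise ValueError("return series must cover the same periods")
    rng = random.Random(seed)
    n = len(baseline)
    observed = sharpe(with_geo) - sharpe(baseline)
    diffs = []
    for _ in range(n_boot):
        # Resample the same period indices for both models (paired design).
        idx = [rng.randrange(n) for _ in range(n)]
        diffs.append(
            sharpe([with_geo[i] for i in idx]) - sharpe([baseline[i] for i in idx])
        )
    diffs.sort()
    lower = diffs[int(0.025 * n_boot)]
    upper = diffs[int(0.975 * n_boot) - 1]
    return observed, lower, upper
```

Under this sketch, the geopolitical features would be kept only if the lower bound of the interval stays above zero and the observed lift is large enough to matter after costs.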

Horizon. A multi-horizon blend is targeted: short event-driven reactions (days–weeks), medium-term swing positioning on transmission channels (weeks–months), and long-term strategic tilts (months–years). The medium horizon is expected to be the most realistic source of retail edge; short-horizon claims face the highest skepticism and friction.

2.6 Situation Assessment

Resources

Market data: see Financial Data Sources. yfinance/Tiingo for prototyping, SEC EDGAR for fundamentals, CCXT/CoinGecko for crypto, FRED for macro; point-in-time sources (e.g., Norgate) are noted for rigorous backtests.
Geopolitical data: GDELT, ACLED, ICEWS (event/escalation streams); Correlates of War, Polity for structural context.
Tooling: Python data-science stack; this Quarto book as the documented research log; Git for version control.
Human: one part-time investor/researcher.

Assumptions

Geopolitical consequences propagate through identifiable, persistent channels.
A patient, medium-term horizon is reachable for a retail investor; the sub-second game is not.
Quality, point-in-time data is obtainable at acceptable cost.

Constraints

No latency/co-location edge; no institutional information access.
Limited research time; the strategy must tolerate infrequent attention.
Real frictions: spreads, slippage, and (notably) short-term capital-gains tax.

Preliminary cost/benefit

Costs: data subscriptions, research time, and the very real risk of losses if the edge is illusory or poorly executed.
Benefits: improved risk-adjusted returns if an edge exists and, either way, a disciplined, documented process that replaces gut-feel investing with evidence.

2.7 Risks & Mitigations

Distilled from the council deliberation:

Risk	Description	Mitigation
Already priced in (EMH)	Macro geopolitics is among the most closely watched arenas in markets; the move may be over before a retail investor can act	Target the slow transmission channel and escalation phase, not the headline; lean on the patience/time-horizon edge
Non-stationarity / regime change	The relationship found may be era-specific and simply stop working	Regime-aware design; rolling re-validation; theory-guided (not purely mined) features
Overfitting / data-snooping	Test enough hypotheses and a spurious edge will appear by chance	Walk-forward, purged/embargoed, leakage-free validation; baseline-first; out-of-sample discipline
Fat-tail / gap risk	A predicted conflict de-escalates overnight and the position gaps against you	Position sizing first; defined-risk structures; no oversized “obvious” bets
Execution friction	Costs, slippage, and short-term taxes can erase a paper edge	Transaction-cost-aware backtests; prefer longer holds; model taxes explicitly
Behavioral discipline	The operator overrides the model at the worst moment	Pre-committed rules; the model’s job is to constrain the human, documented here
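
The execution-friction mitigation can be illustrated with a small cost adjustment for a single instrument. This is a sketch under assumptions: the function name, the 10 bps default, and the proportional-cost model are illustrative, and taxes are deliberately left out because their treatment depends on the investor's jurisdiction and account type.

```python
def net_returns(asset_returns, weights, cost_bps=10.0):
    """Per-period strategy returns after proportional trading costs.

    asset_returns[t] is the instrument's return over period t.
    weights[t] is the position weight held during period t.
    Cost is charged on the absolute change in weight (turnover);
    the starting position is assumed to be flat (weight 0).
    """
    if len(asset_returns) != len(weights):
        raise ValueError("returns and weights must have the same length")
    cost_rate = cost_bps / 10_000.0
    net, prev_weight = [], 0.0
    for r, w in zip(asset_returns, weights):
        turnover = abs(w - prev_weight)
        net.append(w * r - turnover * cost_rate)
        prev_weight = w
    return net
```

Because the cost scales with turnover, a strategy that holds positions longer pays it less often, which is the arithmetic behind the preference for longer holds.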

2.8 Ethical & Personal Investment Policy

This project explicitly seeks to profit from events that include war and human suffering. That deserves a conscious, written stance rather than silent avoidance (the council flagged it as the easiest thing to ignore).

Recommended default policy (adjust to your values):

Trade broad economic consequences (energy, commodities, FX, indices, crypto) and conventional, widely-held defense exposure.
Treat any personal exclusions (specific sectors or companies the investor is unwilling to hold) as a hard constraint in the investable universe — documented here, enforced in code during Data Preparation.
Profiting from anticipated macro consequences is distinct from causing or influencing events; this strategy does neither.

Action: confirm your exclusion list (if any). It becomes a filter on the asset universe in Data Preparation.
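
As a sketch of how the exclusion list could act as a hard constraint, the filter below drops excluded assets before anything downstream sees them. The function name, the dictionary keys, and the placeholder tickers are hypothetical; the real universe schema is defined in Data Preparation.

```python
def apply_exclusions(universe, excluded_tickers=frozenset(), excluded_sectors=frozenset()):
    """Return only the assets that pass the investor's personal exclusions.

    Each asset is a dict with at least "ticker" and "sector" keys
    (an assumed schema for illustration).
    """
    return [
        asset
        for asset in universe
        if asset["ticker"] not in excluded_tickers
        and asset["sector"] not in excluded_sectors
    ]


# Placeholder tickers and sectors, not real instruments.
universe = [
    {"ticker": "AAA", "sector": "energy"},
    {"ticker": "BBB", "sector": "defense"},
    {"ticker": "CCC", "sector": "shipping"},
]
investable = apply_exclusions(universe, excluded_tickers={"CCC"}, excluded_sectors={"defense"})
```

Applying the filter once, at the point where the universe is built, keeps excluded names out of features, backtests, and signals alike.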

2.9 Data-Mining Goals

Translating the business goals into technical targets the later chapters will pursue:

Business goal	Data-mining goal	Success metric
Beat benchmark on risk-adjusted basis	Forecast N-day-forward risk-adjusted return (or regime) per instrument across horizons	Out-of-sample Sharpe vs. benchmark
Determine if geopolitics adds value	Measure marginal lift of geopolitical features over the baseline model	OOS Sharpe lift; feature-importance robustness; statistical significance
Time entries/exits sensibly	Classify regime / direction with calibrated probabilities	AUC / hit rate; calibration; cost-aware backtest P&L
Don’t go broke	Risk model: drawdown, volatility, and position-sizing outputs	Max drawdown within ceiling; Kelly-fraction-bounded sizing
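
"Kelly-fraction-bounded sizing" in the last row can be read as: compute a Kelly weight, scale it down, and cap it. The sketch below uses the common continuous-return approximation, where the full Kelly weight is mean excess return divided by variance. The function name, the 0.25 multiplier, and the 5% cap are illustrative assumptions, not values set by this chapter, and the sketch is long-only.

```python
def bounded_kelly_weight(mean_excess, variance, kelly_multiplier=0.25, max_weight=0.05):
    """Position weight: fractional Kelly, floored at 0 and capped at max_weight.

    mean_excess and variance are per-period estimates for one thesis.
    max_weight plays the role of the per-thesis capital-preservation cap.
    """
    if variance <= 0:
        raise ValueError("variance must be positive")
    full_kelly = mean_excess / variance
    return min(max_weight, max(0.0, kelly_multiplier * full_kelly))
```

Full Kelly is very sensitive to errors in the estimated mean, which is why the sketch applies both a fractional multiplier and a hard cap rather than trusting the raw figure.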

Validation discipline (non-negotiable): strict walk-forward with purge/embargo, no look-ahead (event features timestamped by availability, fundamentals by filing date), and all backtests net of realistic transaction costs.
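
The walk-forward requirement can be sketched as an index generator. In a strictly forward-moving split the training window never follows the test window, so in this sketch purge and embargo both reduce to a gap between the end of training and the start of testing. The function name and parameters are hypothetical, and the rolling (fixed-length) training window is one choice among several.

```python
def walk_forward_splits(n_samples, train_size, test_size, purge=0, embargo=0):
    """Yield (train_indices, test_indices) for a rolling walk-forward scheme.

    purge:   samples dropped just before the test window, sized to the label
             horizon so no training label overlaps the test period.
    embargo: extra buffer samples added on top of the purge.
    """
    if train_size <= 0 or test_size <= 0:
        raise ValueError("train_size and test_size must be positive")
    gap = purge + embargo
    start = 0
    while start + train_size + gap + test_size <= n_samples:
        train_end = start + train_size
        test_start = train_end + gap
        yield range(start, train_end), range(test_start, test_start + test_size)
        start += test_size  # roll forward by one test window


# Example: 20 samples, 8 for training, 4 for testing, 3-sample gap.
for train_idx, test_idx in walk_forward_splits(20, 8, 4, purge=2, embargo=1):
    print(list(train_idx), list(test_idx))
```

Index splits only prevent overlap in time; the no-look-ahead rule still depends on each feature row being stamped with the date it became available rather than the date it describes.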

2.10 Project Plan

Mapping the work onto the remaining CRISP-DM chapters:

Phase	Chapter	Phase-1 hand-off
2. Data Understanding	02_data_understanding	Acquire & profile market + geopolitical data; verify quality; first hypotheses
3. Data Preparation	03_data_preparation	Build point-in-time dataset; engineer baseline + geopolitical features; apply ethical universe filter
4. Modeling	04_modeling	Baseline model → add geopolitical layer; multi-horizon; walk-forward test design
5. Evaluation	05_evaluation	Judge against the Sharpe/drawdown success criteria; decide if geopolitical lift is real
6. Deployment	06_deployment	Repeatable signal/decision process; monitoring for regime change

Tooling & cadence: Python research notebooks rendered into this Quarto book; Git-tracked; iterative loops back to earlier phases as understanding deepens (CRISP-DM is not linear).