# Four-Factor Macro Tail-Risk Model

Companion white paper for the dashboard

Version: v0.7

Date: 2026-05-26

Dashboard path: `index.html`

Machine-readable summary payload: `model_summary.json`

Full dashboard payload: `dashboard_payload.json`

## Abstract

The model began as a liquidity-impulse thesis: financial regimes can become self-reinforcing when earnings growth, credit creation, buying power, asset prices, and margin capacity feed one another. The empirical work narrowed that idea.

The primary model is now a decile rank model. This is more legible than a named-regime model because each decile contains roughly the same number of historical months. The question becomes simple:

```text
Where are we in the historical distribution of macro tail-risk pressure?
```

Most of the distribution is noisy. The useful signal is concentrated in the upper tail, especially the ninth and tenth deciles.

## Epistemology And Scope

This model is not trying to forecast the next recession catalyst. It is trying to measure whether the equity market is being priced and financed in a way that leaves it fragile if a catalyst arrives.

That distinction matters. The [Federal Reserve's financial-stability framework](https://www.federalreserve.gov/publications/2022-may-financial-stability-report-framework.htm) separates shocks from vulnerabilities: shocks are often surprises and hard to predict, while vulnerabilities build over time and determine how badly the system can respond under stress. That is also the working distinction in this dashboard. COVID, an AI capex cycle turn, a geopolitical oil shock, or a bank run can be the visible trigger. The model is aimed at the pre-existing risk-compensation and leverage state, not the narrative form of the trigger.

The model therefore makes a narrower claim:

```text
We usually cannot predict the shock.
We may be able to measure whether the market is priced and levered in a way
that makes disappointment more dangerous.
```

This is why the dashboard is framed as a tail-risk and beta-budgeting tool, not a general return forecast. In most months, the model should not be expected to have strong signal. The historical evidence is concentrated in upper-tail states where several partially distinct causal pressures are elevated at the same time.

The practical use case is therefore:

```text
How much long beta do I want?
How much long-volatility / put exposure do I want?
Is the current setup historically ordinary, elevated, or extreme?
```

It is not:

```text
Can this model forecast next month's return?
Can this model identify the exact exogenous shock?
Can this model replace security-level analysis?
```

## Causal Intuition

The original hypothesis was a liquidity-cycle mechanism:

```text
earnings growth and optimism
-> credit creation and financing capacity
-> buying power
-> asset-price appreciation
-> higher collateral and margin capacity
-> more buying power
```

That loop can be rational for a long time. If growth expectations are genuinely improving, low risk premiums and rapid capital formation may be justified. The danger comes when the same loop pushes risk compensation low and leverage high enough that a modest disappointment can have an outsized effect.

The mechanism is close to the leverage-cycle literature: [Adrian and Shin](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2245451) emphasize that credit availability can vary through intermediary leverage, while [Baron and Xiong](https://academic.oup.com/qje/article/132/2/713/30637569) argue that credit expansions can coincide with neglected crash risk. The final model keeps that intuition but stops pretending that "liquidity" is one measurable scalar. The literature and the local backtests both argued against that simplification. Credit impulse, public liquidity, bank balance-sheet capacity, securities leverage, spreads, and valuation all move through related but different channels.

The model therefore uses four causal legs:

| Leg | Dashboard Factor | Causal Role |
| --- | --- | --- |
| Equity risk compensation | Synthetic ERP tightness | Are equity holders being paid enough for risk, after rates and expected growth? |
| Credit risk compensation | Baa spread tightness | Are credit investors accepting unusually little compensation for default and liquidity risk? |
| Balance-sheet saturation | Broad stock saturation | Is broad non-broker debt/leverage stock stretched relative to history? |
| Demand-side securities leverage | Broker-customer leverage saturation | Is securities financing already heavily used by customers and leveraged buyers? |

The dashboard is intentionally about the interaction between these legs. A single elevated factor can be noisy. Several elevated factors at once are more concerning because they describe a market with lower compensation, tighter credit pricing, and less unused balance-sheet elasticity.

## Why We Do Not Test Every Plausible Factor

Macro backtests are a small-sample problem. The post-1960 monthly dataset sounds large, but truly independent crisis episodes are scarce. Twelve-month forward outcomes overlap. Many macro series revise. If we test dozens of plausible indicators, then reweight the winners, then add interactions, then tune thresholds, we will almost certainly create a model that explains history better than it will help in real time.

The research policy is:

```text
1. Start with a causal mechanism before seeing the result.
2. Prefer public, refreshable data.
3. Prefer simple primitives over fitted composites.
4. Normalize with the same rolling z-score discipline where appropriate.
5. Use live prior thresholds for historical flags.
6. Compare against simple benchmarks.
7. Check overlap so breadth is not just double-counting one latent variable.
8. Reject or quarantine candidates that only work after repeated tuning.
```

This is not anti-empirical. Data should change priors. The rule is that data can discipline a first-principles hypothesis; it should not let us manufacture a new hypothesis after seeing every table. That is why the final dashboard keeps a small number of pre-specified causal factors and uses a simple 75/25 score-plus-breadth calibration instead of a fitted regression.

The local research notes preserve rejected or diagnostic work instead of silently deleting it. Growth variables, forecast revisions, earnings-expectation proxies, NFCI/yield-curve variants, and pure liquidity impulse screens were researched. Some were directionally interesting. They were not promoted because they were too noisy, too overlapping with ERP, too coincident with already-visible stress, too short-history, or too dependent on ex-post variable selection.

## Signal, Explanation, And Calibration Layers

The dashboard separates three concepts that are easy to confuse.

```text
Explanation layer:
  the economic story behind the factors and historical episodes

Signal layer:
  the four normalized factor scores

Calibration layer:
  historical deciles and forward-outcome base rates
```

The explanation layer is necessary because the model should not be a black box. The signal layer is deliberately simple: four comparable z-scores, equal-weighted. The calibration layer asks what happened historically when the 75/25 score-plus-breadth rank was in a similar zone.

The old named-regime taxonomy was useful for research but too verbose as a user-facing primary output. The decile model is cleaner:

```text
D1-D7: mostly noisy
D8: elevated watch zone
D9: elevated tail-risk zone
D10: historically ugly zone
```

That language is less satisfying than a precise forecast, but it is more honest.

## Current Read

Current model date: `2026-04-30`

```text
Primary decile:       D9
75/25 risk rank:      85.4%
4-factor score:       +0.64
Score rank:           83.1%
Breadth rank:         92.1%
Breadth:              2/4 active factors
```

Current factor states:

| Factor | Score | Live Flag Threshold | Flagged |
| --- | ---: | ---: | --- |
| Synthetic ERP tightness | +1.32 | +1.16 | Yes |
| Baa spread tightness | +0.85 | +1.14 | No |
| Broad stock saturation | -0.89 | +0.58 | No |
| Broker-customer leverage | +1.28 | +0.68 | Yes |

The current state is elevated but not extreme. At 85.4%, it is inside D9, which spans 80-90%. It remains below D10, where historical outcomes become most clearly ugly.
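
As a quick arithmetic check, the published read can be reproduced from the factor table and the two component ranks. This is only an illustrative sketch: the variable names are hypothetical, and the score and breadth ranks are taken as given from the block above rather than recomputed from history.

```python
# Factor scores and live flag thresholds, copied from the factor table above.
factors = {
    "synthetic_erp_tightness": (1.32, 1.16),
    "baa_spread_tightness": (0.85, 1.14),
    "broad_stock_saturation": (-0.89, 0.58),
    "broker_customer_leverage": (1.28, 0.68),
}

# Equal-weighted four-factor score.
four_factor_score = sum(score for score, _ in factors.values()) / len(factors)

# Breadth counts the factors at or above their live prior threshold.
breadth = sum(score >= threshold for score, threshold in factors.values())

# Component percentile ranks as published (not recomputed here).
score_rank = 83.1
breadth_rank = 92.1
primary_risk_rank = 0.75 * score_rank + 0.25 * breadth_rank

print(f"4-factor score:  {four_factor_score:+.2f}")
print(f"Breadth:         {breadth}/4 active factors")
print(f"75/25 risk rank: {primary_risk_rank:.2f}%")
```

The blend works out to roughly 85.35%, which the dashboard reports as 85.4% and which sits inside the 80-90% band.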

## Current Interpretive Overlay

This section is deliberately separated from the model definition. It records the current Bayesian / qualitative overlay on top of the statistical read. The model output is the model output; this section is the human interpretation of why today's D9 may or may not deserve a more aggressive risk-management response.

The strict model read is:

```text
Current state: elevated, not extreme.
Current rank: 85.4%, inside D9.
Empirical signal: nonlinear, with the strongest historical evidence in D10.
```

The working interpretation is that D9 is a caution zone, not an automatic de-risking zone. The historical tables suggest that the first seven or eight deciles are mostly noisy, D9 starts to show higher volatility and drawdown risk, and D10 is where the model's empirical signal becomes much sharper. In risk-budgeting language:

```text
D9:
  stay awake, avoid complacency, consider efficient convexity

D10 / high D10:
  stronger case for reducing beta or increasing hedge intensity
```

This matters because the current D9 is not a full-stack leverage signal. The active pressure is mostly market-side:

| Channel | Current State | Interpretation |
| --- | --- | --- |
| Synthetic ERP tightness | Flagged | Equity risk compensation is thin. |
| Baa spread tightness | Elevated, below flag | Credit is somewhat complacently priced, but not fully flashing. |
| Broad stock saturation | Not flagged | Broad non-broker debt/leverage stock is not the current risk source. |
| Broker-customer leverage | Flagged | Securities financing / market leverage is hot. |

The broad-stock leg being benign does not mechanically prevent D10. Because the model is 75% magnitude and 25% breadth, large upward moves in ERP tightness, Baa tightness, and broker-customer leverage can push the average score into the top historical decile even if broad stock saturation remains low.

Example:

```text
ERP tightness:              +2.5z
Baa tightness:              +2.0z
Broad stock saturation:     -0.8z
Broker-customer leverage:   +2.5z

Equal score = (+2.5 + 2.0 - 0.8 + 2.5) / 4 = +1.55z
```

That would likely be a high historical score. With three of four factors flagged, the breadth leg would also contribute. So the model can describe a market-side danger regime without requiring the whole real economy to be overlevered.
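
The example can be checked in a few lines. The sketch below assumes, purely for illustration, that the live flag thresholds stay at the values shown in the Current Read factor table; in practice they would drift as history accumulates. All names are hypothetical.

```python
# Hypothetical stress scenario from the example above (z-scores).
scenario = {
    "synthetic_erp_tightness": 2.5,
    "baa_spread_tightness": 2.0,
    "broad_stock_saturation": -0.8,
    "broker_customer_leverage": 2.5,
}

# Assumption: thresholds frozen at the Current Read values.
thresholds = {
    "synthetic_erp_tightness": 1.16,
    "baa_spread_tightness": 1.14,
    "broad_stock_saturation": 0.58,
    "broker_customer_leverage": 0.68,
}

equal_score = sum(scenario.values()) / len(scenario)
flagged = [name for name, z in scenario.items() if z >= thresholds[name]]

print(f"Equal score: {equal_score:+.2f}z")
print(f"Flagged factors ({len(flagged)}/4): {flagged}")
```

Under that assumption the three market-side legs are flagged while broad stock saturation is not, which is the configuration the paragraph describes.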

The main speculative overlay is that the current AI cycle may allow an elevated regime to persist or intensify before it breaks. The closest historical rhyme is the dot-com period: a transformative technology narrative, major capex build-out, public-market speculation, tight risk compensation, and eventual uncertainty over realized ROIC. The current AI regime differs in at least three important ways:

1. Broad non-broker balance-sheet saturation is not currently as stretched as it was around the dot-com/GFC-style episodes.
2. The perceived total addressable market and macro impact of AI may be much larger than the internet narrative appeared at comparable stages.
3. The market-side speculative channel can be strong even when broad debt/GDP is not the binding constraint.

This overlay does not say that AI will succeed, that the current market is not a bubble, or that D9 is harmless. It says that if AI remains a powerful capex, productivity, earnings-expectation, and animal-spirits driver, then a D9 regime can plausibly continue and potentially migrate toward D10 before the strongest risk-off signal appears.

Near-term narrative catalysts can also matter. As of May 26, 2026, public reporting pointed to active Iran war/deal negotiations, a large AI mega-cap IPO pipeline, and partial but unresolved Ukraine ceasefire dynamics. Those are not model inputs, and they are unstable. They belong in the overlay layer, not the signal layer.

Current qualitative risk posture:

```text
Model-only:
  elevated tail-risk, not screaming danger

Human overlay:
  AI / IPO / animal-spirit regime may have room to run

Risk-management implication:
  do not treat D9 as automatic risk-off
  consider efficient convexity if attractively priced
  escalate hedging if the rank moves toward 95%+, breadth rises,
  or Baa tightness / ERP / broker-customer leverage blow out further
```

## Primary Model

The dashboard's primary scalar is:

```text
primary risk rank =
  0.75 * four-factor score percentile
+ 0.25 * breadth percentile
```

Then that rank is mapped into equal-sized historical deciles.

This is not a pure flag model. The continuous score carries most of the weight. Breadth is included because the empirical work suggested that multiple partially independent causal pressures firing together is more informative than one large z-score in isolation.

The 75/25 split is a deliberately modest concession to breadth. A pure score model can hide the difference between one very extreme factor and several moderately elevated factors. A pure flag-count model throws away too much magnitude information. The current structure keeps magnitude dominant while still recognizing that breadth is part of the causal story:

```text
75% = how elevated is the average causal pressure?
25% = how many distinct causal pressures are active?
```

This is not meant to be the mathematically optimal weighting. It is meant to be a disciplined, interpretable weighting that does not require fitting coefficients to a small number of historical crisis episodes.

The four-factor score is:

```text
4-factor score =
  average(
    synthetic ERP tightness,
    Baa spread tightness,
    broad stock saturation,
    broker-customer leverage saturation
  )
```

## How The Calculation Works

The dashboard updates once the input data are refreshed. It cuts the model at the latest completed month, not the current partial month. That avoids treating an incomplete month as if it were a true month-end observation.
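
A minimal sketch of that cut-off rule follows. It assumes a month counts as completed only once the calendar has rolled into the next month, and it ignores publication lags in the underlying series; the function name is hypothetical.

```python
from datetime import date, timedelta


def latest_completed_month_end(as_of: date) -> date:
    """Return the last calendar day of the month before `as_of`."""
    first_of_current_month = as_of.replace(day=1)
    return first_of_current_month - timedelta(days=1)


# The paper is dated 2026-05-26, so the model date is the end of April.
print(latest_completed_month_end(date(2026, 5, 26)))  # 2026-04-30
```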

The model turns raw macro and market series into comparable z-scores. A z-score means:

```text
z-score = (current value - trailing average) / trailing standard deviation
```

The trailing window is 120 months where possible, with at least 36 months required. Scores are clipped at +/-3 so that one extreme historical observation cannot dominate the model.
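
The normalization can be sketched in plain Python. Several details here are assumptions rather than statements about the dashboard's code: the trailing window is taken to include the current month, the standard deviation is the sample version, and a perfectly flat history is scored as 0.0.

```python
from statistics import mean, stdev


def rolling_z(values, window=120, min_periods=36, clip=3.0):
    """Trailing z-score per observation; None until enough history exists."""
    scores = []
    for i in range(len(values)):
        # Assumption: the trailing window ends at, and includes, month i.
        history = values[max(0, i - window + 1): i + 1]
        if len(history) < min_periods:
            scores.append(None)
            continue
        spread = stdev(history)  # assumption: sample standard deviation
        if spread == 0:
            scores.append(0.0)  # assumption: flat history reads as "normal"
            continue
        z = (values[i] - mean(history)) / spread
        # Clip so a single extreme observation cannot dominate.
        scores.append(max(-clip, min(clip, z)))
    return scores


# Illustrative series: 40 quiet months followed by one extreme month.
series = [0.0, 1.0] * 20 + [100.0]
print(rolling_z(series)[-1])  # the spike is clipped to 3.0
```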

Example interpretation:

```text
+1.0z = about one trailing standard deviation above normal
 0.0z = around trailing normal
-1.0z = about one trailing standard deviation below normal
```

The sign is adjusted so that higher always means more tail-risk pressure:

```text
low ERP            -> high ERP tightness score
tight Baa spread   -> high Baa tightness score
high broad stock saturation -> high saturation score
high broker leverage -> high broker-customer leverage score
```

Each month therefore has four comparable factor scores. The dashboard averages them into a four-factor score.

The dashboard also creates four factor flags. A factor is flagged when it is at or above its own live prior 80th percentile. "Live prior" matters: the threshold at a historical date only uses information that would have existed before that date.

```text
factor flag = current factor >= prior 80th percentile
breadth = count of active factor flags
```
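
The "live prior" idea can be sketched as an expanding threshold that only looks backward. The interpolation method for the percentile and the 36-month warm-up are assumptions made for this illustration, and the helper names are hypothetical.

```python
def prior_percentile(history, pct):
    """Linearly interpolated percentile of `history` (pct in 0..100)."""
    ordered = sorted(history)
    position = (len(ordered) - 1) * pct / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (position - lower)


def live_prior_flags(scores, pct=80.0, min_history=36):
    """Flag month i using only scores observed strictly before month i."""
    flags = []
    for i, score in enumerate(scores):
        history = [s for s in scores[:i] if s is not None]
        if score is None or len(history) < min_history:
            flags.append(None)  # assumption: no flag until enough history
        else:
            flags.append(score >= prior_percentile(history, pct))
    return flags


def breadth(flags_for_month):
    """Count active flags across the four factors for one month."""
    return sum(bool(flag) for flag in flags_for_month)


# Current state from the factor table: ERP and broker-customer flagged.
print(breadth([True, False, False, True]))  # 2
```

Because the threshold at month `i` never sees month `i` or anything after it, a historical flag cannot benefit from hindsight about later extremes.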

Then the model converts the continuous score and breadth into percentile ranks:

```text
score rank   = where today's four-factor score sits vs prior history
breadth rank = where today's breadth count sits vs prior history
```
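
One way to compute such a rank is the share of prior observations at or below today's value. The tie-handling rule is an assumption: it matters mostly for breadth, which takes only five possible values, and the dashboard may treat ties differently.

```python
def percentile_rank(value, prior_history):
    """Percent of prior observations at or below `value` (0 to 100)."""
    if not prior_history:
        raise ValueError("percentile rank needs at least one prior observation")
    # Assumption: ties count as "at or below".
    at_or_below = sum(1 for past in prior_history if past <= value)
    return 100.0 * at_or_below / len(prior_history)


# Illustrative breadth history (made-up counts, not model data).
past_breadth = [0, 0, 1, 1, 2, 3, 0, 1, 2, 4]
print(percentile_rank(2, past_breadth))  # 80.0
```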

The final rank is:

```text
primary risk rank =
  0.75 * score rank
+ 0.25 * breadth rank
```

That final rank is mapped into deciles:

```text
D1  = 0-10%
D2  = 10-20%
...
D9  = 80-90%
D10 = 90-100%
```
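
The band lookup is simple enough to write down directly. The edge handling is an assumption: each band is treated as including its lower bound, and a rank of exactly 100% is folded into D10.

```python
import math


def decile_label(rank_pct):
    """Map a 0-100 percentile rank to a label from 'D1' to 'D10'."""
    if not 0.0 <= rank_pct <= 100.0:
        raise ValueError("rank must be between 0 and 100")
    # Assumption: lower edge inclusive, so 80.0 is D9 and 90.0 is D10.
    return f"D{min(10, math.floor(rank_pct / 10.0) + 1)}"


print(decile_label(85.4))  # D9
```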

This is the main dashboard output. The decile tells the user where the current macro risk setup sits in the historical distribution.

There are two related but different ranking modes in the dashboard:

```text
Validation rank:
  Uses live prior history and strict filters.
  This is the main evidence object.

Historical chart rank:
  Uses the frozen current formula to show older episodes on one common scale.
  This is a visualization and event-study aid.
```

That distinction avoids a common dashboard error: treating a chart drawn with today's formula as if it were out-of-sample evidence. The historical chart is meant to make the drawdown map legible; the validation tables are the place to judge predictive signal.

## The Four Inputs

### 1. Synthetic ERP Tightness

```text
synthetic ERP tightness = -rolling_z(synthetic ERP)
```

Lower ERP means investors are receiving less compensation for bearing equity risk. ERP is preferred to CAPE as the primary valuation variable because it directly incorporates rates and a growth assumption. CAPE remains a benchmark diagnostic.

Raw concept:

```text
equity risk premium = implied expected equity return - risk-free rate
```

The model wants to know whether equity investors are being paid enough for risk. When ERP is low, future equity returns have less cushion against disappointment. That is why the sign is inverted:

```text
lower ERP -> richer valuation -> higher risk score
```

The official monthly Damodaran ERP history is short, so the dashboard uses a synthetic Damodaran-style monthly backcast. The backcast uses annual Damodaran ERP anchors, monthly S&P price movement, monthly Treasury rates, and a growth/cash-flow reconstruction. It is not a naive interpolation; monthly price and rate moves affect the signal.

This choice is partly empirical and partly first-principles. CAPE has a deeper literature and a longer clean history, but it is mechanically backward-looking and does not explicitly include the risk-free rate or forward growth. Implied ERP is theoretically closer to the object we care about: the compensation investors demand after incorporating rates and expected cash-flow growth. The synthetic backcast is labeled clearly because it is model-derived, but it is preferred because the underlying economic object is better matched to the question.

### 2. Baa Spread Tightness

```text
Baa tightness = -rolling_z(Baa spread)
```

Tight spreads mean credit risk is priced cheaply. The model treats this as low risk compensation, not as immediate stress.

Raw concept:

```text
Baa spread = Moody's Baa corporate yield - 10-year Treasury yield
```

Wide spreads usually mean credit stress is already visible. This model is using the opposite condition: unusually tight spreads. Tight spreads can be benign, but at extremes they indicate that credit investors are accepting little compensation for default/liquidity risk.

The sign is inverted:

```text
lower Baa spread -> tighter credit pricing -> higher risk score
```

This factor is not saying "credit is currently breaking." It is saying "credit risk is priced with less cushion."

This is intentionally the opposite of a standard credit-stress screen. Wide spreads often mean stress is already visible. Tight spreads can be a late-cycle risk-compensation signal: credit is calm, capital is available, and investors are accepting low compensation. That can support markets for a while, but it also means less cushion if defaults, funding costs, or growth expectations disappoint.

### 3. Broad Stock Saturation

```text
broad stock saturation =
  0.294 * broad debt / GDP z
+ 0.235 * nonfinancial business debt / GDP z
+ 0.235 * NFCI leverage subindex z
+ 0.235 * NFCI nonfinancial leverage subindex z

broad stock saturation score = rolling_z(broad stock saturation)
```
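
The weighted sum can be sketched as below. The dictionary keys are hypothetical names for the four component z-scores, and the input values in the usage line are made up for illustration. Note that the published weights are rounded and sum to 0.999 rather than exactly 1.

```python
# Weights as published; keys are hypothetical component names.
BROAD_STOCK_WEIGHTS = {
    "broad_debt_to_gdp_z": 0.294,
    "nonfin_business_debt_to_gdp_z": 0.235,
    "nfci_leverage_z": 0.235,
    "nfci_nonfin_leverage_z": 0.235,
}


def broad_stock_composite(component_z):
    """Weighted sum of the four component z-scores for one month."""
    missing = set(BROAD_STOCK_WEIGHTS) - set(component_z)
    if missing:
        raise KeyError(f"missing components: {sorted(missing)}")
    return sum(w * component_z[name] for name, w in BROAD_STOCK_WEIGHTS.items())


# Illustrative inputs only (not model data).
example = {
    "broad_debt_to_gdp_z": 1.0,
    "nonfin_business_debt_to_gdp_z": 0.5,
    "nfci_leverage_z": -0.2,
    "nfci_nonfin_leverage_z": 0.1,
}
print(round(broad_stock_composite(example), 3))  # 0.388
```

The composite is then passed through the rolling z-score once more to produce the factor score, as in the second formula above.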

This is the broad non-broker balance-sheet saturation leg. It deliberately excludes FINRA margin debt, margin utilization, broker-dealer customer receivables, and broker-customer leverage growth. Those securities-financing inputs belong to the fourth factor.

The causal object is:

```text
Is the broad economy / financial system debt stock stretched,
excluding the broker-customer securities-financing channel?
```

Its components are:

```text
broad debt / GDP:
  all-sector debt securities and loans relative to nominal GDP

nonfinancial business debt / GDP:
  corporate and business debt securities and loans relative to nominal GDP

NFCI leverage:
  Chicago Fed financial-condition leverage subindex

NFCI nonfinancial leverage:
  Chicago Fed nonfinancial leverage subindex
```

The weights are coarse causal priors, not fitted coefficients. They are close to equal weight. Broad debt/GDP receives a small overweight because it is the broadest total-system stock variable. The other three inputs split the remaining weight across business leverage, financial-condition leverage, and nonfinancial leverage.

The important methodological change in v0.7 is the input firewall:

```text
Factor 3 owns broad non-broker stock saturation.
Factor 4 owns broker-customer securities leverage.
The same primitive should not vote in both places.
```

Prior easing was tested as both a 50/50 state/path composite and as a separate fifth factor. It remains causally plausible, but it was not additive enough to promote into the primary score. It is better kept as a diagnostic note than smuggled into broad saturation with a compromise weight.

### 4. Broker-Customer Leverage Saturation, Composite B

Composite B measures broker-customer securities financing saturation:

```text
leverage impulse anchor =
  average(
    stitched FINRA-like margin debt 12m growth z,
    Z.1 broker-dealer customer receivables 12m growth z
  )

broker-customer stock =
  Z.1 broker-dealer customer receivables / GDP z

Composite B =
  average(leverage impulse anchor, broker-customer stock)
```
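
Written as code, the nested averages look like this. The function and argument names are hypothetical; only the structure comes from the formula above.

```python
def average(values):
    """Plain arithmetic mean."""
    return sum(values) / len(values)


def composite_b(margin_debt_growth_z, receivables_growth_z, receivables_to_gdp_z):
    """Average of the impulse anchor (two growth series) and the stock block."""
    impulse_anchor = average([margin_debt_growth_z, receivables_growth_z])
    broker_customer_stock = receivables_to_gdp_z
    return average([impulse_anchor, broker_customer_stock])


# Illustrative z-scores only (not model data).
print(composite_b(2.0, 1.0, 0.5))  # 1.0
```

A consequence of the nesting is that each growth series carries an effective weight of 25% and the stock series 50%, so growth and level contribute equally overall.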

Before direct FINRA margin-debt history is available, the model bridges with Z.1 broker-dealer customer receivables using the FINRA/Z.1 overlap ratio. This is a z-score timing bridge, not a precise historical dollar-level margin-debt claim.

Raw concept:

```text
broker-customer financing = securities credit and receivables tied to customer leveraged exposure
```

This factor is meant to capture the demand-side leverage channel: when customers and leveraged investors have already expanded financing aggressively, the market has less unused balance-sheet elasticity. That matters for the original "rubber band" intuition. A stretched financing stock can support upside while the process continues, but it can also make the system more sensitive when price or funding conditions reverse.

Composite B has two blocks:

```text
flow / impulse:
  is broker-customer leverage growing unusually fast?

stock / saturation:
  is broker-customer leverage high relative to GDP?
```

The model averages these two blocks so that growth and level each get one vote, rather than letting every underlying series act as a separate hidden vote. The conceptual split is simply:

```text
how fast is leverage expanding?
how stretched is the stock?
```

Composite B was added only after a separate read-only evaluation showed that it was both causally motivated and reasonably independent from the three-factor stack. The important methodological point is not that margin data is magical. It is that broker-customer leverage measures a different part of the system from ERP or credit spreads: actual securities-financing demand and saturation.

## Why Candidate Factors Were Rejected Or Kept Diagnostic

The dashboard is allowed to have supporting diagnostics without promoting them into the primary model. That is how the research process avoids both underfitting and overfitting.

### Pure Liquidity Impulse

The original liquidity-impulse thesis was directionally plausible, especially through the [Biggs, Mayer, and Pick](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1595980) distinction between the stock of credit and the flow of new credit. The problem was empirical robustness. Pure liquidity impulse screens were suggestive but not strong enough as standalone forward predictors. Stress-confirmed versions worked better, but those often included market stress that was already visible.

So liquidity impulse remains intellectually important and diagnostically useful, but it is not the headline score.

### Growth And Growth Expectations

Growth variables were researched because they are the main possible justification for high valuations, tight spreads, and high leverage. In principle, a market can look stretched because the future really is unusually good. This is related to the broader [growth-at-risk](https://www.aeaweb.org/articles?id=10.1257%2Faer.20161923) literature, which treats financial conditions as affecting the distribution of future growth rather than only the mean.

The screen split growth into realized growth, growth impulse, and expected-growth/revision variables. The result was not strong enough to justify a fifth primary factor. Realized GDP moved too slowly. Growth deceleration variables were directionally interesting but noisy. Expected-growth and earnings-revision variables often overlapped with ERP because ERP already embeds growth expectations.

The conclusion is:

```text
Growth matters conceptually.
The available public growth/revision proxies did not add enough independent signal.
```

### Profit Share

Corporate profit share was interesting because it can represent margin-cycle saturation: high profits may be cyclical, politically contested, or vulnerable to mean reversion. It helped some monotonicity tests, but the later four-factor calibration did not clearly need it. It also overlaps with the growth and valuation story that ERP already partially captures.

Profit share remains a research candidate, not a primary factor.

### Yield Curve, NFCI, Volatility, And Market Stress

These variables can be useful diagnostics, but they are dangerous as primary predictors for this specific dashboard. If a factor mostly activates after the market is already in freefall, then it may improve a backtest while weakening the model's epistemic claim. The current dashboard wants pre-stress risk-compensation and leverage conditions, not just confirmation that stress is already happening.

This is why the dashboard can show recession bands, drawdowns, and funding context without giving them equal status inside the primary score.

## Historical Calibration

The model is evaluated with current-vintage public data and strict ex-ante filters:

```text
Nasdaq not already in >10% drawdown
prior 3-month Nasdaq return > -10%
credit stress <= +0.5z
market fragility <= +0.5z
```
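
These filters amount to a single predicate applied to each candidate month. The sketch below uses hypothetical field names and expresses drawdowns and returns as decimal fractions; how the credit-stress and market-fragility z-scores are built is outside its scope.

```python
from dataclasses import dataclass


@dataclass
class MonthState:
    nasdaq_drawdown: float      # e.g. -0.04 means 4% below the prior peak
    nasdaq_return_3m: float     # trailing 3-month return, e.g. 0.02
    credit_stress_z: float
    market_fragility_z: float


def passes_ex_ante_filters(state: MonthState) -> bool:
    """True when the month is not already visibly stressed."""
    return (
        state.nasdaq_drawdown >= -0.10       # not already in a >10% drawdown
        and state.nasdaq_return_3m > -0.10   # prior 3-month return above -10%
        and state.credit_stress_z <= 0.5
        and state.market_fragility_z <= 0.5
    )


calm = MonthState(-0.03, 0.04, 0.1, -0.2)
stressed = MonthState(-0.15, -0.12, 1.3, 0.9)
print(passes_ex_ante_filters(calm), passes_ex_ante_filters(stressed))  # True False
```

Screening this way keeps the validation sample focused on months where the risk was not yet obvious from price action or stress gauges.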

Current D9 outcomes versus base rates:

| Asset | D9 Months | Avg 12M Return | Base Avg 12M Return | 12M DD <= -10% | Base DD <= -10% | 12M Return <= -10% | Base Return <= -10% |
| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| NASDAQ | 17 | +17.0% | +10.9% | 29.4% | 20.6% | 0.0% | 5.4% |
| SP500 | 17 | +12.1% | +9.1% | 17.6% | 15.2% | 0.0% | 3.9% |
| DOW | 17 | +8.6% | +7.9% | 11.8% | 15.2% | 0.0% | 3.0% |

The current D9 read is mixed. Average forward returns in this bucket were not bad, and no D9 month was followed by a 12-month return of -10% or worse, but drawdown odds were elevated for NASDAQ (29.4% versus a 20.6% base rate) and slightly for SP500 (17.6% versus 15.2%); DOW was below its base rate (11.8% versus 15.2%). This is exactly why the dashboard should be read as a tail-risk model rather than a point-return model. D10 remains the bucket where both drawdown risk and average returns become clearly ugly.

The SP500 row now uses a merged long-history and current monthly price series rather than the truncated history available from the public FRED series. This makes the SP500 validation sample comparable to NASDAQ and DOW instead of being dominated by recent history.

## Historical Drawdown Event Map

The dashboard now includes a historical event map for interpretation. It uses the S&P 500 as the broad market drawdown tape, overlays NBER recessions, and plots the frozen 75/25 score-plus-breadth risk rank against actual market drawdowns.

The event table is not a separate model. It is a presentation layer asking:

```text
Before the S&P 500 first crossed -10% from a local peak,
had the 75/25 risk rank already moved into an elevated zone?
```

The table uses a historical calibrated rank so older episodes can be reviewed under the frozen current formula. The rule is:

```text
Yes     = max prior 12-month 75/25 rank >= 80%
Partial = max prior 12-month 75/25 rank between 70% and 80%
No      = max prior 12-month 75/25 rank below 70%
N/A     = model inputs unavailable
```
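
The rule maps directly onto a small function. The name is hypothetical, and `None` is used here as a stand-in for unavailable model inputs; the ranks in the usage line are taken from the event table.

```python
def pre_signal(max_prior_rank):
    """Classify the max 75/25 rank over the prior 12 months (in percent)."""
    if max_prior_rank is None:
        return "N/A"      # model inputs unavailable
    if max_prior_rank >= 80.0:
        return "Yes"
    if max_prior_rank >= 70.0:
        return "Partial"
    return "No"


# Gulf War, LTCM, and COVID rows from the event table.
print([pre_signal(r) for r in (57.6, 96.3, 78.2)])  # ['No', 'Yes', 'Partial']
```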

| Event | Crossed -10% | Trough | Max Drawdown | Pre-Signal | Max Prior Rank | Ex-Post Catalyst |
| --- | --- | --- | ---: | --- | ---: | --- |
| 1990 recession / Gulf War | 1990-09-30 | 1990-10-31 | -14.8% | No | 57.6% / D6 | Oil shock from Iraq's invasion of Kuwait, late-cycle tightening, S&L stress, and the 1990-91 recession. |
| Russia / LTCM crisis | 1998-09-30 | 1998-09-30 | -11.8% | Yes | 96.3% / D10 | Asian crisis aftershocks, Russia default, LTCM deleveraging, and a brief global funding shock. |
| Dot-com bust | 2000-12-31 | 2001-09-30 | -29.7% | Yes | 96.0% / D10 | Technology valuation unwind, capex reversal, recession, and accounting scandals. |
| Global Financial Crisis | 2008-01-31 | 2009-03-31 | -50.8% | Yes | 83.0% / D9 | Housing bust, structured-credit losses, bank/dealer leverage unwind, and funding-market seizure. |
| Eurozone crisis / US downgrade | 2011-08-31 | 2011-09-30 | -15.5% | No | 22.4% / D3 | Eurozone sovereign crisis, US debt-ceiling shock, S&P downgrade, and global growth fear. |
| Q4 2018 Fed / trade shock | 2018-12-31 | 2018-12-31 | -14.0% | Yes | 84.9% / D9 | Fed tightening, quantitative tightening, US-China trade war, and growth scare. |
| COVID shock | 2020-03-31 | 2020-03-31 | -20.0% | Partial | 78.2% / D8 | Exogenous pandemic shock, lockdowns, forced deleveraging, and emergency policy response. |
| Inflation / Fed-hike bear market | 2022-04-30 | 2022-09-30 | -24.8% | Yes | 99.8% / D10 | Inflation shock, aggressive Fed hiking, rate-duration repricing, and valuation compression. |

The important epistemic point is the COVID row. The model did not and could not forecast the specific pandemic catalyst. But the system was already close to an elevated historical risk zone before the shock. That is consistent with the model's intended use: it is not a shock forecaster; it is a fragility and risk-compensation monitor.

## NASDAQ Primary Deciles

| Decile | Months | Avg Rank | Avg Score | Avg Breadth | Avg 12M Return | 12M DD <= -10% | 12M Return <= -10% |
| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| D1 | 22 | 9.3% | -0.96 | 0.00 | +13.7% | 4.5% | 0.0% |
| D2 | 21 | 17.3% | -0.66 | 0.00 | +14.2% | 23.8% | 0.0% |
| D3 | 22 | 24.8% | -0.51 | 0.09 | +12.4% | 27.3% | 0.0% |
| D4 | 21 | 32.5% | -0.23 | 0.00 | +18.0% | 0.0% | 0.0% |
| D5 | 21 | 46.0% | -0.07 | 0.76 | +8.6% | 9.5% | 0.0% |
| D6 | 21 | 56.1% | +0.15 | 1.00 | +14.7% | 4.8% | 0.0% |
| D7 | 19 | 64.7% | +0.33 | 1.05 | +9.7% | 15.8% | 0.0% |
| D8 | 18 | 74.0% | +0.48 | 1.39 | +8.5% | 22.2% | 5.6% |
| D9 | 17 | 83.3% | +0.63 | 1.88 | +17.0% | 29.4% | 0.0% |
| D10 | 22 | 94.5% | +1.33 | 2.91 | -6.4% | 68.2% | 45.5% |

This is why the decile model is the cleanest dashboard language. The middle of the distribution is noisy. D8 and D9 are elevated risk zones, but not deterministic bearish forecasts. D10 is the clearest risk-off bucket.

## Tail Bands As Cross-Check

The named bands remain useful as a diagnostic, but they are no longer the primary model.

| Band | Rule |
| --- | --- |
| Normal | 0 flags and score below high threshold |
| Watch | 1 flag or high score |
| Elevated | 2 flags, or 1 flag plus high score |
| High risk | 3 flags, or 2 flags plus very-high score |
| Extreme | 4 flags, or 3+ flags plus extreme score |

The decile model is preferred because it is easier to read and uses equal-sized buckets.
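
For readers who want the band rules in executable form, a rough sketch follows. The score cut-offs for "high", "very high", and "extreme" are not stated in this paper, so the function takes an ordinal `score_level` instead of a raw score. It also assumes the bands are checked from most to least severe and that a higher score level satisfies any lower one.

```python
def tail_band(flags, score_level):
    """Return the named band.

    flags:       number of active factor flags (0 to 4)
    score_level: 0 = below high, 1 = high, 2 = very high, 3 = extreme
                 (assumption: cut-offs are defined elsewhere)
    """
    if flags == 4 or (flags >= 3 and score_level >= 3):
        return "Extreme"
    if flags == 3 or (flags == 2 and score_level >= 2):
        return "High risk"
    if flags == 2 or (flags == 1 and score_level >= 1):
        return "Elevated"
    if flags == 1 or score_level >= 1:
        return "Watch"
    return "Normal"


print(tail_band(flags=2, score_level=0))  # Elevated
```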

## Caveats

This is a current-vintage public-data model, not a fully real-time trading simulation. Some input series revise, and quarterly data are carried forward until the next observation arrives.

Composite B is model-derived before direct FINRA data starts. It is useful for z-score timing, not for precise historical level claims.

The model does not forecast exogenous shocks. It estimates whether the market is in a historically stretched risk-compensation and leverage-saturation state that would make a disappointment more damaging if one arrives.

The historical output should be read as base-rate evidence, not as a guarantee. D9 and D10 have been meaningfully worse than normal historically, but the middle deciles are noisy and even high-risk states can persist while markets keep rising.

## References And Source Notes

External research and data anchors:

| Topic | Source | How It Informs The Model |
| --- | --- | --- |
| Shocks vs vulnerabilities | [Federal Reserve Financial Stability Report framework](https://www.federalreserve.gov/publications/2022-may-financial-stability-report-framework.htm) | Supports the distinction between forecasting catalysts and measuring vulnerability. |
| Credit / vulnerability early warnings | [BIS early-warning indicators of banking crises](https://www.bis.org/publ/qtrpdf/r_qt1803e.htm) | Supports the idea that credit, debt-service, and balance-sheet pressure are vulnerability indicators, not exact timing tools. |
| Credit impulse | [Biggs, Mayer, and Pick, Credit and Economic Recovery](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1595980) | Supports separating credit flow/impulse from credit stock. |
| Procyclical leverage | [Adrian and Shin, Procyclical Leverage and Value-at-Risk](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2245451) | Supports the balance-sheet amplification mechanism behind the leverage-cycle intuition. |
| Credit expansion and crash risk | [Baron and Xiong, Credit Expansion and Neglected Crash Risk](https://academic.oup.com/qje/article/132/2/713/30637569) | Supports the idea that credit expansion can lower perceived risk while increasing future crash risk. |
| Growth-at-risk | [Adrian, Boyarchenko, and Giannone, Vulnerable Growth](https://www.aeaweb.org/articles?id=10.1257%2Faer.20161923) | Supports targeting downside distributions rather than only mean returns or mean GDP. |
| Financial-conditions impulse | [Federal Reserve FCI-G](https://www.federalreserve.gov/econres/notes/feds-notes/financial-conditions-and-risks-to-the-economic-outlook-accessible20240920.htm) | Public analogue for measuring financial conditions as impulse to future growth. |
| Implied ERP data | [Aswath Damodaran ERP data page](https://pages.stern.nyu.edu/~adamodar//New_Home_Page/) | Anchor source for the official and annual ERP data used in the synthetic ERP backcast. |
| Baa spread | [FRED BAA10Y](https://fred.stlouisfed.org/series/BAA10Y) | Public source for Moody's Baa yield relative to the 10-year Treasury yield. |
| Margin statistics | [FINRA margin statistics](https://www.finra.org/rules-guidance/key-topics/margin-accounts/margin-statistics) | Public source for customer debit and credit balances in securities margin accounts. |
| Current Iran / market narrative | [AP market report, May 26 2026](https://apnews.com/article/71cc7b49f2ca3462a118878c93c75940) and [Axios Iran/oil note, May 26 2026](https://www.axios.com/2026/05/26/new-oil-order-iran-deal-prices) | Used only in the interpretive overlay, not the model. |
| AI mega-cap IPO pipeline | [Axios on SpaceX, OpenAI, and Anthropic IPOs, May 20 2026](https://www.axios.com/2026/05/20/spacex-openai-anthropic-ipos) | Used only in the interpretive overlay, not the model. |
| Ukraine ceasefire dynamics | [Guardian Ukraine war briefing, May 5 2026](https://www.theguardian.com/world/2026/may/05/ukraine-war-briefing-duelling-ceasefires-as-zelenskyy-floats-open-ended-truce) | Used only in the interpretive overlay, not the model. |

Internal project notes used to build this paper:

| Project Note | Role |
| --- | --- |
| `tools/macro_model/docs/DASHBOARD_REASONING_CHAIN_CONTEXT_v0.1.md` | Captures the research arc from liquidity impulse to four-factor tail-risk model. |
| `tools/macro_model/docs/ORTHOGONAL_FLAG_RESEARCH_PLAN_v0.1.md` | Defines candidate-admission policy and anti-overfit gates. |
| `tools/macro_model/docs/LIQUIDITY_IMPULSE_LITERATURE_REVIEW_v0.2.md` | Summarizes liquidity, credit impulse, global liquidity, and growth-at-risk literature. |
| `tools/macro_model/docs/SYNTHETIC_ERP_BACKCAST_v0.1.md` | Documents the synthetic ERP construction and validation against official monthly ERP. |
| `tools/macro_model/docs/VALUATION_SUBSTITUTION_COMPOSITE_TEST_v0.1.md` | Documents the CAPE vs ERP substitution decision. |
| `tools/macro_model/docs/BROKER_CUSTOMER_LEVERAGE_COMPOSITE_B_HANDOFF_v0.1.md` | Documents the fourth-factor candidate and its independence tests. |
| `tools/macro_model/docs/GROWTH_INTERACTION_SCREEN_v0.1.md` | Documents why growth variables stayed research-only. |
| `tools/macro_model/docs/EARNINGS_EXPECTATIONS_REVISION_PROXY_SCREEN_v0.1.md` | Documents the public earnings-expectations proxy work and why it was not promoted. |
| `tools/macro_model/docs/FORECAST_REVISION_SCREEN_v0.1.md` | Documents SPF forecast-revision work and its limitations. |
