Tetlock-style habit: every substantive thesis / prediction gets attached confidence + a verifiable deadline. Score them six months later → see one’s own forecasting distribution, identify systematic over/under-confidence.
Trigger mechanism: when a substantive prediction / thesis statement surfaces in conversation, auto-prompt:
- Confidence (10 / 25 / 50 / 75 / 90 — five tiers)?
- Deadline (YYYY-MM-DD, specifically verifiable)?
- Verification source?
Don’t log every micro-claim, only deployable claims with stakes (theses that drive position / action / a major judgment).
How to fill in confidence: see Calibration Methodology — 5-step process (decompose → base rate → inside view → premortem → bet test) + five-tier mental anchors + common biases. Don’t fill numbers from intuition — intuition is random noise.
Format
Every prediction:
- Statement: claim content (specific, falsifiable)
- Confidence: 10 / 25 / 50 / 75 / 90 % five tiers (avoid calibration noise)
- Deadline: YYYY-MM-DD (specific)
- Verification source: how to verify (e.g., earnings release, industry data, price level)
- Date filed: filing date
- Outcome: tbd → after the deadline, fill True / False / Partial + a short retrospective
Open predictions
Position-backed theses (active)
| ID | Date filed | Statement | Conf | Deadline | Verify | Status |
|---|---|---|---|---|---|---|
| ACN-1 | 2026-05-18 | AI fails to capture 30–40% of ACN’s consulting/IT services share | TBD | 2027-Q1 | Gartner / IDC consulting-AI reports + ACN revenue trend | Open |
| DUOL-1 | 2026-05-18 | DUOL FY2026 reported revenue within 2% of mgmt guidance midpoint ($1.205B → $1.18-1.23B range); tests whether the thesis wound shows 12-month financial-statement impact (vs longer-horizon legs) | 75% | 2027-02 FY26 10-K release | DUOL 10-K | Open (filed even though position cancelled — thesis-test independent of position) |
| DUOL-2 | 2026-05-18 | DUOL 12mo total return ≤ SPY 12mo total return (from 2026-05-18 close $112.06) — post-cancel framework prediction: revised expected return ≈ index midpoint, wounded legs add asymmetric downside | 50% | 2027-05-18 | DUOL vs SPY total return | Open (cancel decision is endogenous to the prediction) |
| CHINA-1 | 2026-05-18 | By 2027-05-18, BOTH Qwen AND Yuanbao remain fully free, no personal paid subscription tier — Doubao’s 2026-05-04 paid experiment outcome insufficient to force cross-firm follow-on within 12mo | 75% | 2027-05-18 | Alibaba IR / Tencent IR / third-party internet & tech media verify | Open |
| TLN-1 | 2026-05-13 | TLN underperforms SPY over 12 months (compound thesis: AI-bubble late stage + rising rates, high leverage amplifies) | TBD | 2027-05-13 | TLN total return vs SPY total return | Open |
| MSFT-1 | 2026-05-18 | “Within Mag 7, MSFT carries the lowest AI risk” thesis holds (relative outperform AMZN + GOOG) | TBD | 2026-12-31 | Total return MSFT vs (AMZN, GOOG) average | Open |
| SPGI-1 | 2026-05 | SPGI’s AI-threat narrative is mispriced; outperforms broad market over 12 months | TBD | 2027-05-18 | SPGI vs SPY return | Open |
| LDOS-1 | 2026-05-17 | LDOS defer decision was correct: within 12 months LDOS does not materially outperform the ITA basket | TBD | 2027-05-17 | LDOS vs ITA return | Open |
Sector / macro theses (active)
| ID | Date filed | Statement | Conf | Deadline | Verify | Status |
|---|---|---|---|---|---|---|
| GOLD-1 | 2026-05 | GLDM fiat-debasement thesis: outperforms USD cash by ≥ 3% over 12 months | TBD | 2027-05-18 | GLDM return − cash yield | Open |
| AI-1 | 2026-05-18 | The “AI capturing 30%+ of SP500 company workflow” narrative will not be validated by aggregate SP500 productivity data within 12 months | TBD | 2027-05-18 | BLS productivity data + SP500 SG&A trends | Open |
| AI-2 | 2026-05-18 | AI-narrative basket (NVDA + AMD + AVGO + ASML + VST + CEG) underperforms SPY over 12 months (short-term bearish; verified forcing functions: SpaceX June IPO liquidity drain $240B+ combined June-yr-end + Committee 9Q Red + Iran supply-shock CPI/PPI still transmitting + CME FedWatch 35% Dec hike priced; RSP-SPX trajectory April reversal restarts divergence; hyperscaler $175B debt issuance at stagflation rates) | 65% | 2027-05-18 | Custom basket return vs SPY | Open |
| AI-3 | 2026-05-18 | SP500 GDP-per-capita / aggregate productivity growth will not accelerate materially within 12 months (AI still converting existing demand, not creating new incremental demand) | TBD | 2027-05-18 | BLS productivity + GDP per capita data | Open |
Cable MVNO / telecom
| ID | Date filed | Statement | Conf | Deadline | Verify | Status |
|---|---|---|---|---|---|---|
| TEL-1 | 2026-05-18 | Cable MVNO 2026 net-add share remains ≥ 40% of the industry | TBD | 2027-Q1 industry summary | Lightreading / Fierce Network industry net adds | Open |
Closed predictions (scored)
(empty — once an outcome is in, move from Open to here with Final Outcome + retrospective)
Quarterly review
Next review: 2026-08-17
Process:
- For every prediction past its deadline → resolve True / False / Partial
- Move to Closed
- Compute hit rates by confidence bucket:
- 90% predictions should hit ~ 90%
- 75% predictions should hit ~ 75%
- 50% predictions should hit ~ 50%
- and so on
- Bias identification: systematically over-confident (e.g., 90% only actually right 70% of the time)? Under-confident? Break down by domain (single-stock vs sector vs macro)?
- Adjust calibration habits (e.g., if 75% on single-stock is actually 50% → proactively derate confidence)
Anti-patterns to avoid
- Vague predictions (“XX will go up”) — must be specific, falsifiable, with deadline
- Retroactively explaining after the deadline why the outcome doesn’t count — outcome is outcome
- Hindsight modification of the originally filed confidence — once filed, confidence is an immutable record
- Selective logging (only log the ones I’m confident about) → biases the calibration sample
- Logging every comment → noise; only log substantive deployable theses
- Process / behavioral patterns ≠ predictions — framework drift (construct drift), Section 5 mis-application, and similar process learnings do not get filed in calibration; track them via feedback notes. Calibration is restricted to falsifiable forecast claims.
- 2026-05-18 noted methodology meta-pattern: framework drift can invalidate prior confidence. When a fabricated framework is removed, any confidence derived under that framework must be re-derived under the correct framework — NOT by retroactively modifying filed confidence, but by filing a new prediction under the new framework if the decision changes.