Tetlock-style habit: every substantive thesis / prediction gets an attached confidence level and a verifiable deadline. Score them six months later → see one’s own forecasting distribution and identify systematic over/under-confidence.

Trigger mechanism: when a substantive prediction / thesis statement surfaces in conversation, auto-prompt:

  • Confidence (10 / 25 / 50 / 75 / 90 — five tiers)?
  • Deadline (YYYY-MM-DD, specifically verifiable)?
  • Verification source?

Don’t log every micro-claim, only deployable claims with stakes (theses that drive position / action / a major judgment).

How to fill in confidence: see Calibration Methodology — 5-step process (decompose → base rate → inside view → premortem → bet test) + five-tier mental anchors + common biases. Don’t fill numbers from intuition — intuition is random noise.


Format

Every prediction:

  • Statement: claim content (specific, falsifiable)
  • Confidence: 10 / 25 / 50 / 75 / 90 %, five tiers only (finer gradations add calibration noise)
  • Deadline: YYYY-MM-DD (specific)
  • Verification source: how to verify (e.g., earnings release, industry data, price level)
  • Date filed: date the prediction was logged (YYYY-MM-DD)
  • Outcome: tbd → after the deadline, fill True / False / Partial + a short retrospective
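
A minimal sketch of one log entry as a Python record, assuming the fields listed above map one-to-one onto attributes. The class name, attribute names, and validation rules are illustrative, not part of the log itself; the example values are taken from the DUOL-2 entry.

```python
from dataclasses import dataclass
from datetime import date

ALLOWED_CONFIDENCE = (10, 25, 50, 75, 90)  # the five tiers, in percent
ALLOWED_OUTCOMES = ("tbd", "True", "False", "Partial")


@dataclass
class Prediction:
    """One log entry; attribute names are hypothetical and mirror the Format list."""
    statement: str
    confidence: int            # percent; must be one of ALLOWED_CONFIDENCE
    deadline: date
    verification_source: str
    date_filed: date
    outcome: str = "tbd"
    retrospective: str = ""

    def __post_init__(self) -> None:
        # Reject off-tier numbers at filing time.
        if self.confidence not in ALLOWED_CONFIDENCE:
            raise ValueError(f"confidence must be one of {ALLOWED_CONFIDENCE}")
        # Assumption: a deadline must fall strictly after the filing date.
        if self.deadline <= self.date_filed:
            raise ValueError("deadline must fall after the filing date")
        if self.outcome not in ALLOWED_OUTCOMES:
            raise ValueError(f"outcome must be one of {ALLOWED_OUTCOMES}")


duol_2 = Prediction(
    statement="DUOL 12mo total return <= SPY 12mo total return",
    confidence=50,
    deadline=date(2027, 5, 18),
    verification_source="DUOL vs SPY total return",
    date_filed=date(2026, 5, 18),
)
```

Validating at construction time means an off-tier number such as 65 fails immediately instead of surfacing at review time.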

Open predictions

Position-backed theses (active)

ID | Date filed | Statement | Conf | Deadline | Verify | Status
ACN-1 | 2026-05-18 | AI fails to capture 30–40% of ACN’s consulting/IT services share | TBD | 2027-Q1 | Gartner / IDC consulting-AI reports + ACN revenue trend | Open
DUOL-1 | 2026-05-18 | DUOL FY2026 reported revenue within 2% of mgmt guidance midpoint ($1.205B → $1.18-1.23B range); tests whether the thesis wound shows 12-month financial-statement impact (vs longer-horizon legs) | 75% | 2027-02 FY26 10-K release | DUOL 10-K | Open (filed even though position cancelled; thesis-test independent of position)
DUOL-2 | 2026-05-18 | DUOL 12mo total return ≤ SPY 12mo total return (from 2026-05-18 close $112.06); post-cancel framework prediction: revised expected return ≈ index midpoint, wounded legs add asymmetric downside | 50% | 2027-05-18 | DUOL vs SPY total return | Open (cancel decision is endogenous to the prediction)
CHINA-1 | 2026-05-18 | By 2027-05-18, BOTH Qwen AND Yuanbao remain fully free, no personal paid subscription tier; Doubao’s 2026-05-04 paid experiment outcome insufficient to force cross-firm follow-on within 12mo | 75% | 2027-05-18 | Alibaba IR / Tencent IR / third-party internet & tech media verify | Open
TLN-1 | 2026-05-13 | TLN underperforms SPY over 12 months (compound thesis: AI-bubble late stage + rising rates, high leverage amplifies) | TBD | 2027-05-13 | TLN total return vs SPY total return | Open
MSFT-1 | 2026-05-18 | “Within Mag 7, MSFT carries the lowest AI risk” thesis holds (relative outperform AMZN + GOOG) | TBD | 2026-12-31 | Total return MSFT vs (AMZN, GOOG) average | Open
SPGI-1 | 2026-05 | SPGI’s AI-threat narrative is mispriced; outperforms broad market over 12 months | TBD | 2027-05-18 | SPGI vs SPY return | Open
LDOS-1 | 2026-05-17 | LDOS defer decision was correct: within 12 months LDOS does not materially outperform the ITA basket | TBD | 2027-05-17 | LDOS vs ITA return | Open

Sector / macro theses (active)

ID | Date filed | Statement | Conf | Deadline | Verify | Status
GOLD-1 | 2026-05 | GLDM fiat-debasement thesis: outperforms USD cash by ≥ 3% over 12 months | TBD | 2027-05-18 | GLDM return − cash yield | Open
AI-1 | 2026-05-18 | The “AI capturing 30%+ of SP500 company workflow” narrative will not be validated by aggregate SP500 productivity data within 12 months | TBD | 2027-05-18 | BLS productivity data + SP500 SG&A trends | Open
AI-2 | 2026-05-18 | AI-narrative basket (NVDA + AMD + AVGO + ASML + VST + CEG) underperforms SPY over 12 months (short-term bearish; verified forcing functions: SpaceX June IPO liquidity drain $240B+ combined June-yr-end + Committee 9Q Red + Iran supply-shock CPI/PPI still transmitting + CME FedWatch 35% Dec hike priced; RSP-SPX trajectory April reversal restarts divergence; hyperscaler $175B debt issuance at stagflation rates) | 65% | 2027-05-18 | Custom basket return vs SPY | Open
AI-3 | 2026-05-18 | SP500 GDP-per-capita / aggregate productivity growth will not accelerate materially within 12 months (AI still converting existing demand, not creating new incremental demand) | TBD | 2027-05-18 | BLS productivity + GDP per capita data | Open

Cable MVNO / telecom

ID | Date filed | Statement | Conf | Deadline | Verify | Status
TEL-1 | 2026-05-18 | Cable MVNO 2026 net-add share remains ≥ 40% of the industry | TBD | 2027-Q1 industry summary | Lightreading / Fierce Network industry net adds | Open

Closed predictions (scored)

(empty — once an outcome is in, move from Open to here with Final Outcome + retrospective)


Quarterly review

Next review: 2026-08-17

Process:

  1. For every prediction past its deadline → resolve True / False / Partial
  2. Move to Closed
  3. Compute hit rates by confidence bucket:
    • 90% predictions should hit ~ 90%
    • 75% predictions should hit ~ 75%
    • 50% predictions should hit ~ 50%
    • and so on
  4. Bias identification: systematically over-confident (e.g., 90% predictions that come true only 70% of the time)? Under-confident? Does the gap differ by domain (single-stock vs sector vs macro)?
  5. Adjust calibration habits (e.g., if 75% on single-stock is actually 50% → proactively derate confidence)
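
Steps 3 and 4 above can be sketched as a small scoring function. This is an illustration under stated assumptions, not a fixed method: the function name and report keys are hypothetical, and counting a Partial outcome as half a hit is an assumption, since the log does not say how to score it. The sample data is made up for illustration.

```python
from collections import defaultdict

# Assumption: "Partial" counts as half a hit; the log does not define its weight.
OUTCOME_SCORE = {"True": 1.0, "False": 0.0, "Partial": 0.5}


def hit_rates_by_bucket(closed):
    """Compute the hit rate per confidence bucket.

    closed: iterable of (confidence_percent, outcome) pairs for resolved
    predictions. Returns {confidence: {"n", "hit_rate", "gap"}} where
    gap = hit_rate - confidence, in percentage points.
    """
    scores = defaultdict(list)
    for confidence, outcome in closed:
        scores[confidence].append(OUTCOME_SCORE[outcome])

    report = {}
    for confidence, values in sorted(scores.items()):
        hit_rate = 100.0 * sum(values) / len(values)
        report[confidence] = {
            "n": len(values),
            "hit_rate": hit_rate,
            "gap": hit_rate - confidence,  # negative gap = over-confident
        }
    return report


# Made-up resolved predictions, for illustration only.
sample = [(75, "True"), (75, "False"), (75, "True"), (75, "False"), (90, "True")]
report = hit_rates_by_bucket(sample)
```

With this sample, the 75% bucket scores a 50% hit rate (gap of -25 points), the over-confidence pattern step 5 is meant to correct. To break results down by domain, run the function separately on each domain's closed predictions. Buckets with very small n (the 90% bucket here has one entry) say little about calibration.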

Anti-patterns to avoid

  • Vague predictions (“XX will go up”) — must be specific, falsifiable, with deadline
  • Retroactively explaining after the deadline why the outcome doesn’t count — outcome is outcome
  • Hindsight modification of the originally filed confidence — once filed, confidence is an immutable record
  • Selective logging (only log the ones I’m confident about) → biases the calibration sample
  • Logging every comment → noise; only log substantive deployable theses
  • Process / behavioral patterns ≠ predictions — framework drift (construct drift), Section 5 mis-application, and similar process learnings do not get filed in calibration; track them via feedback notes. Calibration is restricted to falsifiable forecast claims.
  • Methodology meta-pattern (noted 2026-05-18): framework drift can invalidate prior confidence. When a fabricated framework is removed, any confidence derived under it must be re-derived under the correct framework. Do this NOT by retroactively modifying the filed confidence, but by filing a new prediction under the new framework if the decision changes.
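
The "filed confidence is immutable, re-file instead of edit" rule can be sketched with a frozen record. This is only one possible way to enforce it: the class, the `refile` helper, the `supersedes` field, and the IDs `X-1` / `X-2` are all hypothetical and do not appear in the log.

```python
from dataclasses import dataclass, replace, FrozenInstanceError
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class FiledPrediction:
    """Immutable once created; attribute assignment raises FrozenInstanceError."""
    pred_id: str
    statement: str
    confidence: int                   # percent, as originally filed
    date_filed: date
    supersedes: Optional[str] = None  # hypothetical link to the earlier entry


def refile(old: FiledPrediction, new_id: str, confidence: int,
           today: date) -> FiledPrediction:
    """File a NEW entry under the corrected framework; `old` is left untouched."""
    return replace(old, pred_id=new_id, confidence=confidence,
                   date_filed=today, supersedes=old.pred_id)


original = FiledPrediction("X-1", "example claim", 75, date(2026, 5, 18))
revised = refile(original, "X-2", 50, date(2026, 6, 1))

try:
    original.confidence = 50          # hindsight edit: rejected
    edit_blocked = False
except FrozenInstanceError:
    edit_blocked = True
```

Both entries stay in the log and are scored separately, so the original 75% still counts toward its bucket when it resolves.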