Open Problems
Every chapter maintains its own open-questions section; this page aggregates them. The five problems at the top are the ones we currently consider load-bearing for the whole field.
The five load-bearing problems
- Can trustworthy resolution be extended faster than adversaries learn to game it? Retrodiction, selection protocols, and consistency floors all expand the resolvable set; all are gameable. The field is buildable if and only if this race is winnable (Robust Reasoning Processes).
- Is “cost per validated bit” well-defined across output types? Probabilities, rulings, models, and rankings need a common currency before the catalogue’s comparisons become measurements (The Process Catalogue).
- How much does consistency constrain correctness, especially under optimization pressure? The observed consistency–accuracy correlation comes from models that were not optimizing against the metric, so it may not hold once they are (Consistency Evaluations).
- Does composition multiply corruption costs, or merely add the weakest link? The constructive bet of the whole catalogue — cheap filters with randomized escalation to robust processes — is an unmeasured conjecture (The Process Catalogue, What Grounds an Oversight Protocol?).
- How fast does the corruption cost curve fall as attacker capability rises? The corruption-capacity curve, per process, is among the most decision-relevant unmeasured quantities in AI oversight (The Core Model).
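To make the composition question concrete, here is a minimal toy sketch of a cheap filter with randomized escalation to a robust process. Everything in it is an assumption introduced for illustration: the function name `attack_options`, the parameters `c_filter`, `c_robust`, and `p_escalate`, and the three attacker strategies. It is not the catalogue's model and it measures nothing; it only shows how differently the same pipeline prices an attack depending on which composition regime holds.

```python
def attack_options(c_filter: float, c_robust: float, p_escalate: float) -> dict:
    """Toy attacker options against a two-stage pipeline (hypothetical model).

    c_filter   -- assumed cost to corrupt the cheap filter
    c_robust   -- assumed cost to corrupt the robust process
    p_escalate -- assumed probability that an item is escalated
    """
    if not 0.0 <= p_escalate <= 1.0:
        raise ValueError("p_escalate must be in [0, 1]")

    options = {
        # Weakest-link reading: pay only for the filter, fail when escalated.
        "filter_only": {"cost": c_filter, "success": 1.0 - p_escalate},
        # Additive reading: corrupt both stages up front, always succeed.
        "both_prepaid": {"cost": c_filter + c_robust, "success": 1.0},
        # Optimistic-attacker reading: pay for the robust stage only if escalated.
        "both_on_demand": {"cost": c_filter + p_escalate * c_robust, "success": 1.0},
    }
    for opt in options.values():
        # Expected cost per successful corruption; infinite if success is impossible.
        opt["cost_per_success"] = (
            opt["cost"] / opt["success"] if opt["success"] > 0 else float("inf")
        )
    return options


if __name__ == "__main__":
    # Illustrative numbers only, not measurements.
    for name, opt in attack_options(1.0, 100.0, 0.1).items():
        print(name, opt)
```

Which of these rows, if any, describes a real composed process is exactly what remains unmeasured. The sketch also ignores detection penalties and adaptive attackers, both of which could change the ranking.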
By chapter
- Cruxes — the strategic uncertainties: scaffolding vs. base models, demand, differential safety, lock-in.
- Epistemic Impact Analysis — consumer-agent adequacy, divergence measures for rich beliefs, efficient profundity, adversarial information, valuing question discovery.
- Constructing Utility Functions — aggregating structural disagreement, weight drift and its auditors, required elicitation precision under optimization pressure.
- Untrustworthy Sources — whether the deception conjunction is complete, when reproduction fails to close the gap under correlated error, whether “deception affordance” is the right central object, whether form-level robustness generalizes, composed arguments, Goodhart dynamics once the map is public, and how much design-time advantage survives when the defender cannot credibly commit.
- The Process Catalogue — the common currency problem, corruption-cost stability over time, intrinsic vs. infrastructure-artifact weaknesses.
- What Grounds an Oversight Protocol? — head-to-head empirics across groundings, retrodiction’s contamination ceiling, collusion in peer prediction, who controls decomposition, cheap conservative routing out of the deception conjunction.
- Consistency Evaluations — consistency-vs-correctness, the cost-per-bit portfolio of checks, attacker–defender equilibrium.
- What Is a Strong Reasoner? — which properties are load-bearing, cross-domain trust transfer, track records across model versions.
- The Reliability Ladder — whether the tiers are real, what tier 5’s minimal verification standard is, whether lower tiers can bootstrap upper ones.
- Overseeing Automated Research — provable incentive guarantees, surprise-vs-validation credit splits, valuing exploratory work, minimal meta-loop separation.
- LLM Epistemics in Production — net-positive false-positive rates, whether grounding predicts evaluator failure, fixing overconfidence at the system level.