reliability

4 items

Bloomberg 2026-03-10-1

Oracle and OpenAI End Plans to Expand Flagship Stargate Data Center

Nvidia paid $150M to a data center developer to ensure that its GPUs, not AMD's, fill the expansion, which makes Nvidia an infrastructure intermediary rather than just a chip vendor. The deeper signal: OpenAI's "often-changing demand forecasting" suggests that even the largest buyer of training compute is uncertain about its forward requirements, which cracks the infinite-linear-scaling thesis. Cooling failures taking buildings offline in winter are the first concrete evidence of operational fragility at hyperscale AI density.

Wall Street Journal 2026-03-09-3

Anthropic's AI Hacked the Firefox Browser. It Found a Lot of Bugs.

The independent credibility piece for Anthropic's security capabilities. Claude found 100+ Firefox bugs (14 high-severity) in two weeks, more high-severity bugs than the world reports to Mozilla in two months. The Curl counter-narrative is the buried lede: AI bug reports are 95% garbage (Stenberg data), so Claude's hit rate, not its volume, is the real differentiator. Most important detail: Claude is better at finding bugs than at exploiting them. The defender/attacker asymmetry currently favors defenders, but that gap is temporary.

The Intrinsic Perspective 2026-03-08-1

Bits In, Bits Out

Hoel argues that writing is the canary domain for AI capability: six years in, LLMs have produced efficiency gains and slop, not a quality revolution. The Amazon book data is compelling (average worse, top 100 unchanged), but the extrapolation from writing to all domains is structurally weak: verifiable domains like code and math behave differently from taste-dependent ones. Best articulation of the "tools not intelligence" thesis, but it cherry-picks the hardest domain for AI to show measurable ceiling gains.

Wall Street Journal 2026-03-08-3

Can AI Replace Humans for Market Research?

A $100M Series A announcement dressed as a trend piece. CVS's "95% accuracy" claim is backtested against known answers; the real test is predicting unknown findings, which nobody has shown. Digital twins for market research are a cost/speed optimization, not a new form of intelligence. Simulating hard-to-reach populations (chronic disease patients from sparse data) is where overconfidence becomes actively dangerous.