Claude.ai unavailable and elevated errors on the API
TL;DR Highlight
Anthropic’s entire service suite—Claude.ai, the API, Claude Code—became inaccessible for 1 hour and 18 minutes (17:34–18:52 UTC), sparking outrage among enterprise users over reliability concerns.
Who Should Read
Developers integrating the Claude API or Claude Code into production services, and team leaders grappling with LLM service availability and multi-model strategies.
Core Mechanics
- The outage began at 17:34 UTC on April 28, 2026, and was resolved at 18:52 UTC, lasting a total of 1 hour and 18 minutes. Affected services included claude.ai, Claude Console (platform.claude.com), Claude API (api.anthropic.com), Claude Code, Claude Cowork, and Claude for Government—essentially the entire service portfolio.
- The root cause was an authentication issue: authentication errors surged on API requests and Claude Code login paths, and claude.ai itself became unreachable.
- Anthropic announced the investigation at 17:41 UTC, identified the problem at 17:51 UTC, reported work in progress at 18:33 UTC, transitioned to a monitoring phase at 18:59 UTC, and declared final resolution at 19:15 UTC, updating the status page throughout.
- Figures shared from status.claude.com indicated that Claude's uptime over the last 90 days had fallen to the 'one nine' level (just over 90%), a level widely considered unacceptable for production environments.
- A user from an organization spending over $200,000 monthly on the enterprise tier reported frequent outages in recent months and poor support, leading to anger from leadership. They stated that a ‘one nine’ level of reliability is unacceptable given the cost.
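To put the 'one nine' figure above in concrete terms, here is a quick calculation of how much downtime each availability level permits over a 90-day window:

```python
# Downtime budget over a 90-day window for common availability levels.
HOURS_IN_WINDOW = 90 * 24  # 2160 hours

def downtime_hours(availability: float, window_hours: float = HOURS_IN_WINDOW) -> float:
    """Hours of allowed downtime for a given availability fraction."""
    return window_hours * (1 - availability)

for label, avail in [("one nine (90%)", 0.90),
                     ("two nines (99%)", 0.99),
                     ("three nines (99.9%)", 0.999)]:
    print(f"{label}: {downtime_hours(avail):.1f} hours of downtime allowed")
```

At 'one nine', roughly 216 hours (nine full days) of downtime fit inside a 90-day window, which is why the figure alarmed paying customers even though this particular incident lasted only 78 minutes.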
Evidence
- "A user spending over $200,000 monthly on Anthropic’s enterprise tier lamented frequent outages and poor support in recent months, indicating escalating frustration at the executive level and potentially leading to contract re-evaluation."
How to Apply
- If you rely on the Claude API as a single point of failure in production, add automatic fallback logic to alternative providers such as OpenAI (Codex) or Google (Gemini) so that service continues through outages like this one.
- Organizations spending tens of thousands of dollars monthly on the Claude API should regularly monitor Anthropic’s status.claude.com and subscribe to email/SMS alerts. Integrating with PagerDuty or Slack webhooks can reduce response times.
- Teams heavily using Claude Code in their workflow should set up alternative coding agents like OpenAI Codex CLI in parallel. This allows work to continue even when Claude Code is unavailable due to authentication issues.
- For teams of around 10 people where AI coding tool costs are a concern or stability is paramount, consider renting GPUs to self-host open models like Qwen or DeepSeek. While initial setup is required, it offers direct control over downtime risk and potential long-term cost savings.
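The fallback logic suggested above can be sketched as a provider-agnostic wrapper. The stub callables below stand in for real SDK calls (the function names and the broad exception handling are illustrative assumptions, not any vendor's API):

```python
from typing import Callable, Sequence

def call_with_fallback(providers: Sequence[tuple[str, Callable[[str], str]]],
                       prompt: str) -> tuple[str, str]:
    """Try each (name, call) pair in order; return (provider_name, response).

    Raises RuntimeError only if every provider fails."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # in production, catch provider-specific errors
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))

# Hypothetical stubs; replace with real Anthropic/OpenAI/Gemini SDK calls.
def claude_call(prompt: str) -> str:
    raise ConnectionError("simulated outage")  # e.g. the API returns 5xx

def gemini_call(prompt: str) -> str:
    return "fallback answer"

provider, answer = call_with_fallback([("claude", claude_call),
                                       ("gemini", gemini_call)], "hello")
print(provider, answer)  # → gemini fallback answer
```

Keeping the ordering explicit lets you prefer Claude in normal operation and only pay the quality or cost difference of the fallback provider during an incident.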
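For the status-page monitoring suggested above: hosted status pages of this kind (e.g. Atlassian Statuspage) commonly expose a JSON summary at /api/v2/status.json. A polling check might look like the following sketch; the endpoint path and payload shape are assumptions about status.claude.com, and the alert action is a placeholder for a Slack or PagerDuty webhook:

```python
import json
import urllib.request

# Assumed Statuspage-style summary endpoint; verify against the real page.
STATUS_URL = "https://status.claude.com/api/v2/status.json"

def fetch_status(url: str = STATUS_URL) -> dict:
    """Fetch and parse the status summary (network call; not run at import)."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)

def should_alert(payload: dict) -> bool:
    """Statuspage-style payloads carry status.indicator: none/minor/major/critical."""
    indicator = payload.get("status", {}).get("indicator", "none")
    return indicator in ("major", "critical")

# Example payload in the assumed format:
sample = {"status": {"indicator": "major", "description": "Partial System Outage"}}
if should_alert(sample):
    # Here you would POST to a Slack/PagerDuty webhook instead of printing.
    print("ALERT:", sample["status"]["description"])
```

Run `fetch_status()` on a schedule (cron, or a small loop in a sidecar service) and route `should_alert` hits to your incident channel.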
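One practical upside of the self-hosting route above: servers such as vLLM expose an OpenAI-compatible HTTP API, so client code can stay largely unchanged between hosted and local models. A minimal sketch under that assumption follows; the localhost address, port, and model name are placeholders, and `complete` requires a running server:

```python
import json
import urllib.request

# Placeholder address for a local vLLM-style OpenAI-compatible server.
LOCAL_ENDPOINT = "http://localhost:8000/v1/chat/completions"

def build_request(model: str, prompt: str) -> dict:
    """OpenAI-compatible chat completion payload, as served by vLLM and similar."""
    return {"model": model,
            "messages": [{"role": "user", "content": prompt}],
            "max_tokens": 256}

def complete(prompt: str, model: str = "Qwen/Qwen2.5-Coder-32B-Instruct") -> str:
    """POST to the local server (network call; needs the server to be up)."""
    req = urllib.request.Request(
        LOCAL_ENDPOINT,
        data=json.dumps(build_request(model, prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

payload = build_request("Qwen/Qwen2.5-Coder-32B-Instruct", "Explain retries.")
print(payload["model"])
```

Because the request shape matches the hosted providers' chat API, this local endpoint can also be slotted in as the last entry in a fallback chain.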
Related Papers
Can LLMs model real-world systems in TLA+?
A benchmark study systematically verifying that while LLM-written TLA+ specifications pass syntax checks well, their behavioral conformance to the actual system is only around 46%, illustrating the practical limits of AI-driven formal verification.
Natural Language Autoencoders: Turning Claude's Thoughts into Text
Anthropic has published NLA (Natural Language Autoencoders), a technique that converts the numeric vectors (activations) inside an LLM into directly readable natural language. It marks a new advance in interpretability research into what the AI is actually thinking.
ProgramBench: Can language models rebuild programs from scratch?
A new benchmark measuring whether LLMs can reimplement real software such as FFmpeg, SQLite, and a PHP interpreter from scratch using only documentation; even the best model passed at least 95% of the tests on only 3% of all tasks.
MOSAIC-Bench: Measuring Compositional Vulnerability Induction in Coding Agents
Split the request into three tickets and even Claude/GPT will simply write the security-vulnerable code 53–86% of the time.
Refusal in Language Models Is Mediated by a Single Direction
Open-source chat models encode safety as a single vector direction, and removing it disables safety fine-tuning.
Show HN: A new benchmark for testing LLMs for deterministic outputs
Structured Output Benchmark assesses LLM JSON handling across seven metrics, revealing performance beyond schema compliance.