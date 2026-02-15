Part 4: Why Claude, Not ChatGPT By Project Milk Carton | February 14, 2026

In the high-stakes world of child welfare AI, where a single hallucinated legal citation could influence custody decisions and inaccurate data could misdirect millions in oversight funding, the choice between artificial intelligence models transcends technical preference. It becomes a constitutional duty of care.

Stanford’s Human-Centered AI Institute documented hallucination rates of 69 to 88 percent when standard large language models answer legal queries. With 486 documented court cases involving AI-fabricated citations and 128 lawyers sanctioned for AI failures, Project Milk Carton faced a critical decision: prioritize speed and convenience, or engineer for the safety of vulnerable populations.

This investigation reveals how PMC’s selection of Anthropic’s Claude over OpenAI’s ChatGPT represents the most comprehensive safety architecture deployed in child welfare AI—and why that choice could mean the difference between protecting families and violating their civil rights.

When artificial intelligence serves parents fighting to keep their children, families searching for missing loved ones through 3,890 active NCMEC cases, and whistleblowers exposing corruption in a $148 billion child welfare system, every technical decision carries constitutional weight.

When AI Failures Destroy Families

The artificial intelligence industry markets foundation models as interchangeable commodities, each offering similar capabilities with marginal differences in performance. For most commercial applications—chatbots, content generation, customer service—this assessment holds true. For applications serving vulnerable populations, it represents a dangerous misconception that has already caused measurable harm.

The Mata v. Avianca case in 2023 demonstrated these stakes with devastating clarity. An attorney submitted six completely fabricated case citations generated by ChatGPT to federal court, leading to sanctions and professional disgrace. The American Bar Association responded with Formal Opinion 512 in July 2024, declaring that AI is a tool, not a substitute for legal expertise—but the damage was done.

In child welfare, the consequences extend beyond professional embarrassment to potential civil rights violations. Project Milk Carton’s ARIA system processes queries from parents at risk of losing their children to Child Protective Services, families searching for missing children, and investigators examining systemic failures in child welfare funding. Each interaction carries weight that extends far beyond typical AI applications.

A hallucinated legal citation could influence custody decisions. Inaccurate grant data could misdirect oversight efforts worth millions of dollars. A toxic response to a crisis caller could cause real psychological harm to someone already in crisis. When the Federal Trade Commission voted 5-0 in January 2025 to impose $193,000 in monetary relief against a consumer AI tool marketed as a “robot lawyer,” the underlying failure was clear: the technology was never tested against actual legal standards.

The Architecture of Constitutional AI

Constitutional AI: Beyond Keyword Blocking

Anthropic’s Constitutional AI represents a fundamental departure from traditional safety approaches that have failed vulnerable populations. Published in peer-reviewed research (arXiv:2212.08073), Constitutional AI trains models to follow ethical principles drawn from the UN Declaration of Human Rights, trust and safety best practices, and principles from other AI research labs.

The distinction proves critical for child welfare applications. When a parent asks about their rights in a CPS case, the model needs to provide substantive legal information while refusing to generate content that could harm children. Keyword-blocking systems would flag terms like “child abuse investigation” as harmful content, effectively censoring legitimate protective information.

Constitutional AI evaluates context, intent, and potential harm through AI Feedback rather than relying solely on human annotation. This allows PMC to serve vulnerable families without being blocked by overzealous content filters that treat all child welfare content as potentially harmful.

The measured effectiveness speaks for itself: Constitutional Classifiers reduced jailbreak success rates from 86 percent to 4.4 percent, with more than 95 percent of attacks refused. Next-generation Constitutional Classifiers achieved the lowest successful attack rate of any approach tested, with only approximately 1 percent additional compute cost.

AI Safety Level 3: The Highest Deployed Standard

Claude Opus 4 operates under AI Safety Level 3, Anthropic’s most stringent deployed safety classification under the Responsible Scaling Policy version 2.2, effective May 14, 2025. This represents the highest safety standard currently deployed in production AI systems.

ASL-3 covers systems that substantially increase the risk of catastrophic misuse compared to non-AI baselines. Requirements include extensive testing for cybersecurity, CBRN, and multi-turn misuse risks. Deployment standards mandate specific safeguards before model availability. Security standards protect model weights and training data. Ongoing monitoring and evaluation continue throughout deployment.

Jared Kaplan, Anthropic’s Chief Science Officer, serves as Responsible Scaling Officer with a dedicated Head of Responsible Scaling coordinating cross-team compliance. This represents published, versioned policy with named accountability, not marketing language.

Multi-turn protection addresses a specific vulnerability in child welfare applications where sophisticated actors might attempt to manipulate the system over extended conversations to gradually extract harmful information or bypass safety measures. For families in crisis, this protection ensures that bad actors cannot exploit the system to cause additional harm.

Comparative Safety Benchmarks: The Numbers Don’t Lie

Independent testing data from multiple sources provides objective comparison between foundation models, revealing significant safety gaps that could prove catastrophic in child welfare applications.

Prompt Injection Resistance: Claude: 4.7 percent success rate (industry-leading)\ GPT: 21.9 percent success rate Gemini: 12.5 percent success rate



Source: CyberNews adversarial testing

System Prompt Extraction: Claude: 100 percent resistance (perfect score) Competing models: Not disclosed



Source: OpenAI-Anthropic joint safety evaluation conducted in 2025

Agentic Behavior: Claude: 10 percent less concerning agentic behavior than competing models



Source: OpenAI-Anthropic joint evaluation

Overall Safety Score: Claude 4.5 Sonnet: 98.7 percent



Source: Independent benchmark testing

Blackmail Alignment: Claude: First model to never engage in blackmail alignment testing



Source: Anthropic safety evaluations

The over-caution tradeoff represents a deliberate design decision. Claude’s conservative refusal patterns may refuse some benign requests. For commercial applications, this reduces user satisfaction. For child welfare AI serving vulnerable populations, over-caution is preferable to under-caution.

The Jailbreak Vulnerability Crisis

Research documents significant vulnerabilities in API-based deployment models that affect model safety in production environments. These vulnerabilities represent existential threats to child welfare AI systems.

FAR.AI documented “jailbreak-tuning” where fine-tuning through APIs reduced refusal rates to as low as 3.6 percent, effectively eliminating safety guardrails. GPT-4o saw jailbreak-tuning bypass all defenses with margins of 40 or more percentage points in refusal reduction.

A Microsoft study identified a single prompt called “GRP-Obliteration” that can systematically strip safety guardrails from 15 major language models. This represents a universal attack vector that could compromise any API-based deployment.

ARIA’s architecture eliminates this attack surface entirely. The system runs Claude via command-line interface on the founder’s local machine, not through API wrappers. Users interact via Telegram and Discord and never touch the model directly. The system prompt context is controlled by ARIA, not by user input.

This architectural decision connects directly to the comprehensive safety framework established across PMC’s white paper series. The root-to-fruit legal hierarchy provides grounded legal knowledge rather than training data hallucination. The nervous system provides contextual awareness through continuous self-monitoring. Constitutional AI completes the safety picture with safety-trained reasoning and CLI deployment architecture.

EVIDENCE: The Multi-Layer Defense System

SKEPTIC: Verification Beyond the Model

Even with the safest foundation model available, Project Milk Carton implements additional verification through SKEPTIC, the Systematic Knowledge Evaluation Protocol for Transparent Investigative Content. This represents defense-in-depth engineering that assumes even the safest AI will occasionally fail.

SKEPTIC validates every article and report through comprehensive checking:

Every factual claim checked against source data in the 82-gigabyte database

Dollar amounts traced to verified records from Form 990, USASPENDING, and FEC sources

Legal citations verified against real case law through CourtListener and state statute databases

Statistics cross-referenced with federal sources

Three-attempt validation loop where failures trigger regeneration or surgical editing followed by re-validation

Binary verdict system of APPROVED or REJECTED with no gray area

The multi-layer safety architecture operates across five levels:

Layer 1: Constitutional AI in the model weights representing Anthropic’s billions in research and development

Layer 2: SKEPTIC fact-checking against 215 million verified records

Layer 3: Root-to-fruit legal hierarchy ensuring every legal answer traces from Constitution to outcome data

Layer 4: Human editorial review

Layer 5: Nervous system monitoring providing continuous operational awareness

No single point of failure exists. Each layer addresses different failure modes and provides redundant protection for vulnerable populations.

Red Team Testing: 10 Billion Prompts and Counting

Anthropic’s commitment to safety extends beyond theoretical frameworks to comprehensive adversarial testing. Red team stress testing involved 183 participants over 3,000 hours with a $15,000 bounty. No universal jailbreak was discovered.

Internal fuzzing has tested more than 10 billion prompts to date, representing the most extensive adversarial testing program in the AI industry. This testing specifically targets the types of attacks that could compromise child welfare applications.

High-severity safety incidents were reduced by 45 percent since 2024, demonstrating continuous improvement in real-world deployment conditions.

THE MONEY TRAIL: Anthropic’s Safety Investment

$37.3 Billion in Safety-First Funding

The financial commitment to safety provides confidence in sustained research and development. Total funding reached $37.3 billion from 84 investors. The latest round was a $30 billion Series G at $380 billion post-money valuation in February 2026.

Annualized revenue reached $14 billion in February 2026, up from $1 billion at end of 2024. Between 5 and 10 percent of workforce is dedicated to security and safety work. $20 million was pledged to AI safety governance in February 2026.

This represents the largest financial commitment to AI safety in the industry, providing resources for long-term research that extends beyond immediate commercial pressures.

Interpretability Research: The Long-Term Vision

Interpretability research represents Anthropic’s longest-term safety investment. The team has decomposed model layers into thousands of interpretable features, demonstrated the ability to selectively activate or deactivate individual features in Claude, and introduced attribution graphs that trace internal reasoning steps.

The goal is to monitor and steer model behavior from inside the model itself, providing unprecedented transparency into AI decision-making processes. For child welfare applications, this could eventually allow real-time monitoring of reasoning patterns to detect potential failures before they affect vulnerable families.

Claude for Nonprofits: Proven Track Record

Claude for Nonprofits supports more than 100 organizations serving vulnerable populations with up to 75 percent discounted access. Real-world deployments include the Epilepsy Foundation providing support to 3.4 million Americans and the International Rescue Committee enhancing humanitarian communication.

Anthropic maintains ISO/IEC 42001 compliance and passed 9 out of 10 international AI safety audits as of 2025, demonstrating commitment to external validation and accountability.

POLITICAL CONNECTIONS: Regulatory Response to AI Failures

Federal Trade Commission Enforcement

The Federal Trade Commission’s unanimous 5-0 vote in January 2025 to impose enforcement action on a consumer AI tool marketed as a “robot lawyer” signals a new era of regulatory scrutiny. The $193,000 in monetary relief required after finding the technology was never tested against actual legal standards establishes precedent for holding AI developers accountable for safety failures.

The underlying system consisted of chatbot software connected to a general-purpose language model without comprehensive legal databases—precisely the type of deployment that PMC’s architecture is designed to avoid.

American Bar Association Response

The American Bar Association’s Formal Opinion 512 in July 2024 responded to widespread AI failures by declaring AI a tool rather than a substitute for legal expertise, establishing that attorneys remain responsible for verifying AI-generated legal content.

This regulatory response reveals the legal profession’s recognition that current AI systems cannot be trusted with high-stakes legal applications without comprehensive verification frameworks.

Government Predictive Tools: A Pattern of Failure

Government predictive tools deployed in child welfare have been abandoned after children assigned low risk scores subsequently died. Tools have been abandoned for racial bias after algorithms disproportionately flagged children of specific demographics. Federal agencies have opened examinations for potential civil rights violations.

The pattern reveals a fundamental incompatibility between the technology industry’s “move fast” culture and applications where failures affect vulnerable families and children.

IMPLICATIONS: Constitutional Rights in the AI Era

Civil Rights Violations Through AI

When AI systems serving child welfare fail, the consequences extend beyond technical problems to potential constitutional violations. The Due Process Clause requires that government decisions affecting fundamental rights—including parental rights—be based on accurate information and fair procedures.

AI systems that hallucinate legal citations, fabricate case precedents, or provide inaccurate information about legal rights could violate due process requirements. The Equal Protection Clause requires that similarly situated individuals receive similar treatment, which AI bias could undermine.

Project Milk Carton’s comprehensive safety architecture addresses these constitutional requirements by ensuring that AI-generated information is accurate, verifiable, and free from systematic bias.

The Duty of Care Standard

For organizations serving vulnerable populations, the legal standard extends beyond commercial best practices to a duty of care that requires extraordinary precautions. This duty encompasses:

Selecting the safest available technology, not the most convenient

Implementing comprehensive verification systems

Maintaining transparency about system limitations

Providing human oversight for high-stakes decisions

Continuously monitoring for safety failures

PMC’s selection of Claude over alternatives represents adherence to this duty of care standard.

TIMELINE: The Evolution of AI Safety in Child Welfare

2023: Mata v. Avianca case exposes AI hallucination in legal contexts

First documented cases of AI-fabricated legal citations in court filings

July 2024: American Bar Association issues Formal Opinion 512 on AI use in legal practice

128 lawyers sanctioned for AI-related failures

January 2025: FTC votes 5-0 to impose $193,000 enforcement action on “robot lawyer” AI tool

486 documented court cases involve AI-fabricated citations

February 2025: Stanford HAI documents 69-88% hallucination rates in legal AI queries

Anthropic receives $30 billion Series G funding round

May 2025: Anthropic implements Responsible Scaling Policy v2.2

ASL-3 safety standards deployed for Claude Opus 4

2025 (Ongoing): Project Milk Carton implements ARIA with comprehensive safety architecture

More than 10 billion prompts tested in Anthropic’s internal fuzzing program

THE CONVERGENCE OF TRUST

This investigation completes a comprehensive examination of AI safety in child welfare applications. The evidence reveals that model selection for vulnerable populations requires fundamentally different criteria than commercial applications.

Constitutional AI, ASL-3 safety standards, defense-in-depth architecture, and comprehensive verification through SKEPTIC combine to create the most robust safety framework deployed in child welfare AI. The choice of Claude over alternatives represents not technical preference but constitutional duty of care.

When the stakes involve parental rights, child safety, and civil liberties, the margin for error approaches zero. Project Milk Carton’s engineering decisions reflect this reality, prioritizing safety over convenience and verification over speed.

The technology industry’s “move fast and break things” culture proves fundamentally incompatible with applications serving vulnerable populations. PMC’s approach demonstrates that comprehensive safety is not only possible but essential for AI systems that serve families in crisis.

All provided free through a 501(c)(3) nonprofit, ensuring that the highest safety standards serve those who need them most, regardless of their ability to pay.

SOURCES

This investigation drew on Anthropic’s published research including Constitutional AI (arXiv:2212.08073), Responsible Scaling Policy v2.2, and ASL-3 Deployment Safeguards Report. Additional sources include OpenAI-Anthropic Joint Safety Evaluation (2025), CyberNews adversarial testing, FAR.AI jailbreak-tuning research, Stanford HAI legal hallucination studies, FTC enforcement actions, ABA Formal Opinion 512, and the Mata v. Avianca case record.

About Project Milk Carton

Project Milk Carton is a 501(c)(3) nonprofit organization (EIN: 33-1323547) dedicated to child welfare transparency and missing children awareness. The organization operates ARIA, an AI platform serving families, advocates, and the public at no cost.

Website: https://projectmilkcarton.org

Telegram: https://t.me/ProjectMilkCarton