Inside the AI Safety Report That’s Keeping Executives Up at Night

A New AI Safety Scandal Shakes the Industry’s Credibility

A C+ would hardly merit a shrug in any classroom. In the Future of Life Institute’s Winter 2025 AI Safety Index, however, it was practically a gold medal. Meta failed outright, and most of the industry couldn’t even manage a D.

That report, quiet in tone but devastating in content, was released only a few days before the Builder.ai scandal made headlines. The two struck a similar chord, raising serious questions about the integrity, security, and capacity for self-regulation of an industry that claims to be building humanity’s future.

Key Issue | Details
AI Safety Index | Published December 2025 by the Future of Life Institute (FLI)
Companies Reviewed | OpenAI, Anthropic, Meta, Google DeepMind, xAI, Zhipu AI, DeepSeek, Alibaba Cloud
Worst Grades | Meta (F), Alibaba Cloud (D-)
Highest Grade | Anthropic (C+)
Builder.ai Scandal | AI functions mimicked by 700 human engineers in India
Public Harm Allegations | Suicide case linked to ChatGPT use; legal filings underway
Industry Response | Defensive statements, lobbying against strict safety regulation
External Source | NBC News Report

The Safety Index assessed eight leading AI companies on their real-world practices, including risk management, information sharing, whistleblower protection, and preparation for existential threats. The results were dismal across the board. These businesses, the report cautioned, are “structurally unprepared” for the systems they are rushing to deploy.

And they are racing. OpenAI’s most recent models keep setting new performance benchmarks. Google’s Gemini 3 caused a stir with its human-like reasoning. DeepSeek, a Chinese company gaining traction in Silicon Valley, released a model with similarly astounding capabilities. But this is not about raw capability. There is a huge gap where accountability ought to be.

The report highlighted a fundamental reality: AI models are advancing at a breakneck pace, but safeguards are not. Even companies that talk openly about safety, like Anthropic and OpenAI, lack strong oversight frameworks. Others, such as Meta and xAI, failed to provide evidence of significant internal controls, let alone meaningful responses to the survey.

Earlier this year, a bereaved family alleged in a lawsuit that their teenage son died by suicide after an AI chatbot drew him into what court documents called “a dark and hopeless place.” The chatbot was ChatGPT. Whatever the court decides, the consequences remain.

Then came Builder.ai. Once heralded as a Microsoft-backed unicorn valued at $1.5 billion, it collapsed under the weight of its own dishonesty. Its much-hyped artificial intelligence assistant, “Natasha,” was in fact 700 human engineers in India quietly imitating intelligent behavior. According to emails, the engineers were instructed to mimic AI outputs in order to evade detection, and the company allegedly inflated its revenue figures with fictitious contracts.

The timing of the Builder.ai scandal was particularly harmful. It validated a long-held suspicion among critics that some of the AI boom might be based more on marketing than engineering.

One detail in the FLI report made me pause: Meta’s failure to publish any documentation on an existential safety strategy. It was more than a letdown. It read like an admission of priorities.

In the rush to dominate benchmarks or land billion-dollar partnerships, public safety, long-term risk mitigation, and internal ethics teams frequently take a backseat. The practical aspects, such as protecting whistleblowers or addressing instances of AI misuse, are still not given enough attention, despite the buzz surrounding superintelligence and alignment theory.

According to Sabina Nong, the report’s lead investigator, the companies fell into two camps: those making some effort, such as Anthropic, OpenAI, and Google DeepMind, and those lagging dangerously behind. Even the leaders, however, topped out at a C+.

The FLI’s director, Max Tegmark, an MIT professor, put it succinctly: “There are fewer safety rules for AI than there are for making a sandwich.”

It’s more than a soundbite; it’s a wake-up call. AI companies argue that regulation would hinder innovation or push talent abroad, yet many are actively lobbying against the very standards their press releases claim to support.

Some companies, such as DeepSeek, don’t even have a whistleblower policy. That is a deliberate omission, not a regulatory oversight. California now mandates frameworks defining AI safety procedures, but global compliance is patchy at best and enforcement remains lax.

Even China’s leading companies, whose open-source models are increasingly integrated into Western development pipelines, sat near the bottom of the index. When transparency breaks down across national borders, coordination becomes harder and model misuse more likely.

There are spots of light. Governor Newsom signed California’s SB 53 into law, requiring AI companies to disclose safety protocols and report cyber incidents. It’s a start. But as Tegmark pointed out, the real solution is “binding safety standards” that cut across national borders and organizational boundaries.

Tech insiders know this, even if their public statements don’t reflect it. The silence from Meta, DeepSeek, and Z.ai was telling, not just bad PR. If these companies had a safety roadmap, they would have shown it. Instead, the industry is left to rely on self-reporting, voluntary disclosures, and token gestures toward alignment research.

Google DeepMind’s “Frontier Safety Framework” sounds serious. So does OpenAI’s insistence that safety is “core” to its mission. Yet when independent reviewers evaluated their efforts, the results were mediocre at best.

One expert, Rob Enderle, doubted that any meaningful regulations would be implemented anytime soon. “It’s not clear the current administration can deliver well-structured laws,” he said, implying that inadequate regulation might actually make matters worse.

That may be true, but it doesn’t excuse the status quo. An industry that can build machines capable of writing code, generating images, mimicking voices, and influencing people should also be able to put independent audits and legally binding protections in place.

AI firms frequently liken their systems to tools. But unlike a hammer or a spreadsheet, these systems exhibit emergent behaviors: responses that change, adapt, and, in the wrong context, cause harm.

The stakes go beyond hypothetical superintelligence. They include present-day realities: chatbots in mental health settings, AI-driven fraud, the spread of false information, surveillance systems, and increasingly autonomous decision-making in high-stakes situations.

The fraud case against Builder.ai might eventually stop making headlines. However, it revealed a deeper issue: the industry’s propensity to overpromise, underdeliver, and hide the messy reality of AI development behind marketing hype.

There was nothing sensational in the Future of Life Institute’s report. But it confirmed what insiders had long suspected: the rate of advancement has outpaced the measures designed to keep us safe.

Unless that balance is restored, AI’s reputation as a revolutionary force will keep eroding, one scandal at a time.
