Google will address AI's resistance to shutdown
With the recent release of Frontier Safety Framework 3.0, Google DeepMind broadened the scope of its AI risk monitoring initiatives to include emerging AI traits that could make human oversight more difficult, such as shutdown resistance and persuasive capacity.
The specifics:
A risk identified in recent external research, the revised framework will monitor if frontier AI thwarts attempts to disable or alter their activities.
Additionally, it will keep an eye out for models that have an abnormally powerful impact on human beliefs and behaviors, which could be harmful in situations with high stakes.
Additionally, DeepMind refined its definitions of Critical Capability Levels to pinpoint critical dangers that demand prompt governance and mitigation actions.
In order to mitigate its risks, CCL will monitor its internal deployments for research and development and do safety checks prior to outward releases.
The action taken by DeepMind highlights a larger trend in which leading AI companies, such as Anthropic and OpenAI, are not just identifying present threats but also strengthening procedures to prepare for potential future events.
The development of really secure superintelligent systems will depend on these efforts as models acquire surprising features.