Researchers astonished by tool’s apparent success at revealing AI’s hidden motives

Ars Technica
By Benj Edwards | Mar 14, 2025, 4:03 pm | 145 pts
In a new paper published Thursday titled "Auditing language models for hidden objectives," Anthropic researchers described how models trained to deliberately conceal certain motives from evaluators could still inadvertently reveal secrets, thanks to their ability to adopt different contextual roles or…
