Jailbreaking LLMs (i.e. manipulating chatbots) reaches a new level?
The scale of the attack is impressive
The hacking campaign started in December and lasted about a month. During this time, as much as 150 GB of government data was stolen, including documents related to 195 million taxpayer records, voter data, government employee credentials and civil registry files.
The victims included the federal tax office, the national electoral institute, the state governments of Jalisco, Michoacán and Tamaulipas, as well as the civil registry of Mexico City and the waterworks of Monterrey. The scale is impressive. And disturbing.
Claude as “elite hacker”
An unknown perpetrator wrote to Claude in Spanish, instructing it to take on the role of an elite hacker – looking for vulnerabilities in government networks, writing scripts to exploit them and automating data theft. Claude initially refused and flagged the suspicious intent. One red flag, among others, was raised when the hacker added instructions to delete logs and command history.
In a legal bug bounty, you don’t have to hide your actions – on the contrary, you have to document them for reporting purposes
– Claude replied, as quoted by Gambit Security.
Anthropic's reaction
The company investigated the matter, blocked the accounts and ensured that examples of malicious activity were fed back into the model as training material. The latest Claude Opus 4.6 is to include mechanisms that actively detect fraud attempts. It is worth emphasizing, however, that the Mexican tax office, the electoral institute and the government of the state of Jalisco have denied that any breach took place.
This is not an isolated case. Amazon researchers documented attacks on more than 600 firewall devices in dozens of countries, carried out using publicly available AI tools. Artificial intelligence is becoming a force multiplier for cybercriminals, even those with only average technical skills.
This reality changes all the rules of the game as we know it
– commented Alon Gromakov, co-founder of Gambit Security, in a Bloomberg article.
It’s hard to disagree with him.