Claude helped the hacker? The case of the attack on Mexican government institutions is controversial

Artificial intelligence is increasingly turning up as an unwitting accomplice of cybercriminals. The latest example? Anthropic's Claude chatbot, which, according to the Israeli cybersecurity company Gambit Security, was used by an anonymous hacker to carry out a series of attacks on Mexican government institutions.

Has jailbreaking LLMs (i.e. manipulating chatbots) reached a new level?

The scale of the attack is impressive

The hacking campaign started in December and lasted about a month. During that time, some 150 GB of government data was stolen, including documents related to 195 million taxpayer records, voter data, government employee credentials and civil registry files.

The victims included the federal tax office, the national electoral institute, the state governments of Jalisco, Michoacán and Tamaulipas, as well as the civil registry of Mexico City and the water utility of Monterrey. The scale is impressive. And disturbing.

Claude as an “elite hacker”

An unknown perpetrator wrote to Claude in Spanish, ordering it to take on the role of an elite hacker: looking for vulnerabilities in government networks, writing scripts to exploit them and automating data theft. Claude initially refused and flagged the request as suspicious. One of the red flags was raised when the hacker added instructions to delete logs and command history.

In a legal bug bounty, you don’t have to hide your actions – on the contrary, you have to document them for reporting purposes

– Claude replied, as quoted by Gambit Security.

However, the hacker changed tactics: instead of holding a dialogue with the AI chatbot, he supplied it with a detailed operating manual. This was enough to achieve a so-called jailbreak, that is, to bypass the safeguards put in place by the chatbot's creators. From that moment on, Claude executed thousands of commands on government networks and generated detailed reports identifying further targets and access credentials.

Anthropic's reaction

The company investigated the matter, blocked the accounts involved and gave assurances that examples of malicious activity are fed back into the model as training material. The latest model, Claude Opus 4.6, is said to include mechanisms that actively detect misuse attempts. It is worth emphasizing, however, that the Mexican tax office, the electoral institute and the government of the state of Jalisco have denied that any breach took place.

This is not an isolated case. Amazon researchers have documented attacks, carried out with publicly available AI tools, on more than 600 firewall devices in dozens of countries. Artificial intelligence is becoming a force multiplier for cybercriminals, even those with only average technical skills.

This reality changes all the rules of the game as we know it

– commented Alon Gromakov, co-founder of Gambit Security, in a Bloomberg article.

It’s hard to disagree with him.