At the same time, the risk is immediate and present with agents. When models are not just contained boxes but can take actions in the world, when they have end-effectors that let them manipulate the world, I think it really becomes much more of a problem.
We are making progress here, developing much better [defensive] techniques, but if you break the underlying model, you basically have the equivalent to a buffer overflow [a common way to hack software]. Your agent can be exploited by third…
Article Source
https://www.wired.com/story/zico-kolter-ai-agents-game-theory/