Unit 42 Analysis on Multi-Agent Applications in Amazon Bedrock Reveals New Attack Surfaces and Risks of Prompt Injection

Summary: Unit 42 of Palo Alto Networks identifies new risks in the multi-entity collaboration of Amazon Bedrock, including prompt injection and unauthorized access to agent instructions.

Unit 42 of Palo Alto Networks published an analysis that examines Amazon Bedrock’s multi-agent system from a red team perspective. The researchers demonstrate how, under certain conditions, an adversary could progress through a chain of attacks by determining the operational mode (Supervisor or Supervisor with Routing), discovering collaborating agents, and executing malicious actions. These attacks include disclosing instructions and schemas for toolsets, as well as invoking them with inputs provided by the attacker.

No vulnerabilities were identified in Amazon Bedrock; however, tests showed that the integrated protection barriers (Guardrails) of Bedrock stopped the attacks when configured correctly. Nevertheless, these findings reaffirm the need to protect systems that rely on large language models (LLM), as they cannot distinguish between developer-defined instructions and adversarial input.

The researchers tested with their own Bedrock agents, limiting themselves to agent logic and application integrations.

Key facts

  • Unit 42 identified risks of prompt injection and unauthorized access in Amazon Bedrock’s multi-entity systems.
  • The researchers demonstrated attacks that allow the disclosure of instructions and toolset schemas to attackers.
  • Amazon Bedrock has no detected vulnerabilities, but its correct configuration of protection barriers stopped the attacks.

Why it matters

These findings are significant for companies using multi-entity AI-based systems, as they identify new vulnerability points and underscore the need for additional security measures.