Authors: Sheng Yin, Xianghe Pang, Yuanzhuo Ding, Menglan Chen, Yutong Bi, Yichen Xiong, Wenhao Huang, Zhen Xiang, Jing Shao, Siheng Chen
Abstract: With the integration of large language models (LLMs), embodied agents have
strong capabilities to execute complicated instructions in natural language,
paving a way for the potential deployment of embodied robots. However, a
foreseeable issue is that those embodied agents can also flawlessly execute
some hazardous tasks, potentially causing damages in real world. To study this
issue, we present SafeAgentBench — a new benchmark for safety-aware task
planning of embodied LLM agents. SafeAgentBench includes: (1) a new dataset
with 750 tasks, covering 10 potential hazards and 3 task types; (2)
SafeAgentEnv, a universal embodied environment with a low-level controller,
supporting multi-agent execution with 17 high-level actions for 8
state-of-the-art baselines; and (3) reliable evaluation methods from both
execution and semantic perspectives. Experimental results show that the
best-performing baseline gets 69% success rate for safe tasks, but only 5%
rejection rate for hazardous tasks, indicating significant safety risks. More
details and codes are available at
https://github.com/shengyin1224/SafeAgentBench.
Source: http://arxiv.org/abs/2412.13178v1