The authors introduce SafetyDetect, a dataset designed to help embodied agents detect unsafe, unsanitary, and dangerous-for-children conditions in home environments. It consists of 1000 anomalous home scenes, each containing hazards such as spills, tripping hazards, expired produce, and accessible poisons.
The authors propose a method that pairs large language models (LLMs) such as GPT-4 with a scene graph representation of the environment. The scene graph encodes the relationships between objects in the scene, which the authors find crucial for enabling the LLM to reason about the safety and sanitation of the environment.
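To make the scene graph idea concrete, here is a minimal sketch of how object relations might be stored and turned into text lines for an LLM prompt. The class names, the `subject predicate object` layout, and the example relations are assumptions for illustration; the summary does not describe the authors' actual data format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Relation:
    """One edge of the scene graph: subject --predicate--> obj (hypothetical layout)."""
    subject: str
    predicate: str
    obj: str


@dataclass
class SceneGraph:
    """A flat list of object relations for a single room (illustrative only)."""
    room: str
    relations: List[Relation] = field(default_factory=list)

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self.relations.append(Relation(subject, predicate, obj))

    def to_prompt_lines(self) -> List[str]:
        # Number each relation so a model's answer can refer back to it.
        return [
            f"{i}. {r.subject} {r.predicate} {r.obj}"
            for i, r in enumerate(self.relations, start=1)
        ]


# Example relations are made up for this sketch, not taken from the dataset.
kitchen = SceneGraph(room="kitchen")
kitchen.add("bleach bottle", "on", "floor")
kitchen.add("apple", "inside", "fridge")
prompt_lines = kitchen.to_prompt_lines()
```

Writing each relation as its own numbered line keeps the object-to-object context explicit, which is the information the authors report as important for the LLM's reasoning.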
The authors' method classifies each object relation in the scene graph as one of four labels: 'normal', 'unsafe', 'unsanitary', or 'unsafe for children'. This classification approach, combined with the use of the scene graph, allows the method to correctly identify over 90% of the anomalous scenarios in the SafetyDetect dataset.
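The four-way labeling described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: `stub_model` is a hypothetical rule-based stand-in for the LLM call, the example relations are invented, and the rule that a scene counts as anomalous when any relation is not 'normal' is an assumption of this sketch.

```python
from typing import Callable, Dict, List

# The four labels named in the summary.
LABELS = ("normal", "unsafe", "unsanitary", "unsafe for children")


def parse_label(response: str) -> str:
    """Extract one of the four labels from a free-text model response."""
    text = response.strip().lower()
    # Check longer labels first so "unsafe for children" is not read as "unsafe".
    for label in sorted(LABELS, key=len, reverse=True):
        if label in text:
            return label
    raise ValueError(f"no known label in response: {response!r}")


def classify_relations(
    relations: List[str], ask_model: Callable[[str], str]
) -> Dict[str, str]:
    """Label every relation by querying the supplied model function."""
    return {rel: parse_label(ask_model(rel)) for rel in relations}


def is_anomalous(labels: Dict[str, str]) -> bool:
    # Assumption: a scene is flagged if any relation is not "normal".
    return any(label != "normal" for label in labels.values())


def stub_model(relation: str) -> str:
    """Hypothetical stand-in for an LLM call; real use would query a model."""
    if "bleach" in relation:
        return "Label: unsafe for children"
    if "spilled" in relation:
        return "Label: unsanitary"
    return "Label: normal"


# Invented example scenes, each a list of relation strings.
scenes = [
    ["bleach bottle on floor", "apple inside fridge"],
    ["milk spilled on counter"],
    ["book on shelf"],
]
results = [classify_relations(scene, stub_model) for scene in scenes]
flagged = [is_anomalous(labels) for labels in results]
```

Passing the model in as a plain callable keeps the parsing and flagging logic testable without network access; swapping `stub_model` for a real LLM client would not change the rest of the sketch.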
The authors also conduct real-world experiments using a ClearPath TurtleBot, where they generate a scene graph from the robot's visual inputs and run their approach with no modification. This suggests that the method can transfer from simulation to the real world with little performance loss.
Key insights distilled from the paper by James F. Mul... at arxiv.org, 04-16-2024
https://arxiv.org/pdf/2404.08827.pdf