NaviQAte is a three-phase, multi-model methodology for automated web application navigation. It focuses on functionality-guided exploration, integrating multi-modal inputs such as text and images to enhance contextual understanding.
In the Action Planning phase, NaviQAte concretizes abstract functionality descriptions using retrieval-augmented generation, extracts webpage context, and predicts the next step. In the Choice Extraction phase, it preprocesses and ranks actionable elements based on semantic similarity to the predicted next step, and generates contextual descriptions for these elements. In the Decision Making phase, NaviQAte selects the optimal action by combining task history, annotated screenshots, and the ranked actionable elements.
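The ranking step of the Choice Extraction phase can be illustrated with a minimal sketch. It is not NaviQAte's implementation: the source does not say which similarity model is used, so a plain bag-of-words cosine similarity stands in for it here. The names `rank_elements`, `bag_of_words`, `cosine_similarity`, and the element dictionaries with `id` and `text` keys are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_elements(predicted_step, elements, top_k=3):
    """Return the top_k elements most similar to the predicted next step.

    Assumption: each element is a dict with a "text" field holding its
    visible label; a real system would likely use learned embeddings
    instead of token counts.
    """
    step_vec = bag_of_words(predicted_step)
    scored = [
        (cosine_similarity(step_vec, bag_of_words(e["text"])), e)
        for e in elements
    ]
    # Sort on the score only, highest first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [e for _, e in scored[:top_k]]


# Hypothetical actionable elements extracted from a page.
elements = [
    {"id": "btn-search", "text": "Search flights"},
    {"id": "link-help", "text": "Help center"},
    {"id": "btn-login", "text": "Log in to your account"},
]
predicted_step = "Click the search flights button"
ranked = rank_elements(predicted_step, elements, top_k=2)
```

In this toy setup, the element labeled "Search flights" ranks first because it shares two tokens with the predicted step, while the other labels share none. In the pipeline described above, the ranked shortlist would then be passed to the Decision Making phase together with the task history and annotated screenshots.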
Evaluations on the Mind2Web-Live and Mind2Web-Live-Abstracted datasets show that NaviQAte achieves a 44.23% success rate in user task navigation and a 38.46% success rate in functionality navigation, improvements of 15% and 33%, respectively, over WebCanvas, the next-best baseline. These results demonstrate the effectiveness of NaviQAte's functionality-guided approach in advancing automated web application testing.