Abstract
Recent advancements in decision-making large language model (LLM) agents have
demonstrated impressive performance across various benchmarks. However, these
state-of-the-art approaches typically necessitate internal model fine-tuning,
external model fine-tuning, or policy optimization over a defined state space.
Implementing these methods can prove challenging due to the scarcity of
high-quality training data or the lack of a well-defined state space. Moreover,
these agents do not possess certain qualities inherent to human decision-making
processes, specifically the ability to learn from mistakes. Self-reflection
allows humans to efficiently solve novel problems through a process of trial
and error. Building on recent research, we propose Reflexion, an approach that
endows an agent with dynamic memory and self-reflection capabilities to enhance
its existing reasoning trace and task-specific action choice abilities. To
achieve full automation, we introduce a straightforward yet effective heuristic
that enables the agent to pinpoint hallucination instances, avoid repetition in
action sequences, and, in some environments, construct an internal memory map
of the given environment. To assess our approach, we evaluate the agent's
ability to complete decision-making tasks in AlfWorld environments and
knowledge-intensive, search-based question-and-answer tasks in HotPotQA
environments. We observe success rates of 97% and 51%, respectively, and
provide a discussion on the emergent property of self-reflection.
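The trial-and-error loop described above can be sketched in a few lines. The following is an illustrative Python sketch, not the paper's implementation: the function names, the toy repetition heuristic, and the stand-in `reflect` step are all assumptions, with the self-reflections stored as plain strings in a persistent memory list that the policy conditions on across trials.

```python
# Minimal sketch of a Reflexion-style loop. All names are illustrative
# assumptions, not taken from the paper's codebase.

def detect_failure(trajectory, max_repeats=3):
    """Simple heuristic failure check: flag an episode whose last few
    actions are identical, a proxy for hallucinated/looping behavior."""
    recent = trajectory[-max_repeats:]
    return len(trajectory) >= max_repeats and len(set(recent)) == 1

def run_episode(policy, memory, max_steps=10):
    """Run one trial; the policy conditions on reflections in memory."""
    trajectory = []
    for _ in range(max_steps):
        action = policy(trajectory, memory)
        trajectory.append(action)
        if detect_failure(trajectory):
            return trajectory, False  # failed trial
    return trajectory, True

def reflect(trajectory):
    """Stand-in for an LLM call that summarizes what went wrong."""
    return f"Avoid repeating action {trajectory[-1]!r}."

def reflexion(policy, trials=3):
    memory = []  # persistent self-reflections, carried across trials
    trajectory = []
    for _ in range(trials):
        trajectory, success = run_episode(policy, memory)
        if success:
            break
        memory.append(reflect(trajectory))  # learn from the mistake
    return trajectory, memory

# Toy policy: loops on "look" until a reflection is available.
def toy_policy(trajectory, memory):
    return f"explore-{len(trajectory)}" if memory else "look"

trajectory, memory = reflexion(toy_policy)
```

In this toy run the first trial fails the repetition check, a reflection is written to memory, and the second trial succeeds because the policy now conditions on that reflection.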