AutoManual: Constructing Instruction Manuals by LLM Agents via Interactive Environmental Learning

Published 25 May 2024 in cs.AI and cs.CL | arXiv:2405.16247v4

Abstract: LLM-based agents have shown promise in autonomously completing tasks across various domains, e.g., robotics, games, and web navigation. However, these agents typically require elaborate design and expert prompts to solve tasks in specific domains, which limits their adaptability. We introduce AutoManual, a framework that enables LLM agents to autonomously build their understanding of an environment through interaction and adapt to new environments. AutoManual categorizes environmental knowledge into diverse rules and optimizes them online via two agents: 1) the Planner, which writes actionable code-based plans for interacting with the environment based on the current rules; and 2) the Builder, which updates the rules through a well-structured rule system that supports online rule management and retention of essential details. To mitigate hallucination when managing rules, we introduce a case-conditioned prompting strategy for the Builder. Finally, a Formulator agent compiles the rules into a comprehensive, human-readable manual. The self-generated manual not only improves adaptability but can also guide the planning of smaller LLMs. Given only one simple demonstration, AutoManual significantly improves task success rates, achieving 97.4% with GPT-4-turbo and 86.2% with GPT-3.5-turbo on ALFWorld benchmark tasks. The code is available at https://github.com/minghchen/automanual.
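The abstract describes an online loop: the Planner writes a code-based plan from the current rules, the environment returns feedback, the Builder updates the rule system, and the Formulator finally compiles the rules into a manual. The following is a minimal sketch of that loop; all class names, method signatures, and the placeholder planner/builder logic are illustrative assumptions, not the authors' actual implementation (which calls LLMs at each step).

```python
# Hypothetical sketch of the AutoManual Planner-Builder loop.
# The planner, environment, and builder here are stubs standing in
# for LLM calls and real environment interaction.

from dataclasses import dataclass, field

@dataclass
class RuleSystem:
    """Holds environment rules; the Builder updates these online."""
    rules: list = field(default_factory=list)

    def update(self, trajectory, success):
        # The real Builder is an LLM using case-conditioned prompting;
        # here we simply record a summary of each episode as a 'rule'.
        label = "success" if success else "failure"
        self.rules.append(f"{label}: {trajectory}")

def planner(rules, task):
    # Placeholder for an LLM call that writes a code-as-plan
    # conditioned on the current rules.
    return f"plan for '{task}' using {len(rules)} rules"

def environment_step(plan):
    # Placeholder: execute the plan, return (trajectory, success flag).
    return plan, True

def formulator(rules):
    # Compile the accumulated rules into a human-readable manual.
    return "\n".join(f"- {r}" for r in rules)

rule_system = RuleSystem()
for task in ["put apple in fridge", "heat mug"]:
    plan = planner(rule_system.rules, task)
    trajectory, ok = environment_step(plan)
    rule_system.update(trajectory, ok)   # online rule optimization

manual = formulator(rule_system.rules)
```

In this sketch the rule system is just an append-only list; the paper's contribution is precisely that the Builder manages a structured rule set (merging, revising, and pruning rules) rather than accumulating raw episodes.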
