Recent developments in large language models and multi-agent systems have shown impressive capabilities on tasks requiring complex reasoning and collaboration. However, existing multi-agent systems rely heavily on manual prompt engineering and intricate orchestration frameworks, which limits computational efficiency and the ability to learn from data. To address these challenges, the authors propose Chain-of-Agents (CoA), a novel approach that simulates multi-agent collaboration within a single LLM by dynamically activating different tool agents and role-playing agents in an end-to-end manner. This enables native multi-turn problem solving with multiple agents and tools, without relying on external frameworks.
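To make the idea concrete, below is a minimal sketch of what such a single-model, multi-turn rollout might look like. The tag format, `call_model`, and `run_tool` are illustrative assumptions, not the paper's actual interface: the point is that one model emits every role's turns and tool calls inside one growing context, with tool results fed back in as observations.

```python
# Minimal sketch of a single-model chain-of-agents rollout loop.
# Tag names, call_model, and run_tool are hypothetical placeholders.
import re

def call_model(context: str) -> str:
    """Placeholder for one LLM generation step; returns the next segment."""
    return "<answer>42</answer>"

def run_tool(name: str, arg: str) -> str:
    """Placeholder tool executor (e.g., web search or code runner)."""
    return f"[result of {name}({arg!r})]"

def chain_of_agents_rollout(task: str, max_turns: int = 8) -> str:
    """One LLM plays every role; tool calls are executed and their results
    appended to the same context, so the whole trace is a single sequence."""
    context = f"Task: {task}\n"
    for _ in range(max_turns):
        segment = call_model(context)
        context += segment + "\n"
        tool_call = re.search(r'<tool name="(\w+)">(.*?)</tool>', segment)
        if tool_call:                      # the model activated a tool agent
            name, arg = tool_call.groups()
            context += f"<observation>{run_tool(name, arg)}</observation>\n"
            continue
        answer = re.search(r"<answer>(.*?)</answer>", segment)
        if answer:                         # the model produced a final answer
            return answer.group(1)
    return ""

print(chain_of_agents_rollout("What is 6 * 7?"))
```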
To train these models, the authors introduce a multi-agent distillation framework that converts trajectories from state-of-the-art multi-agent systems into chain-of-agents traces for supervised fine-tuning. They further enhance model capabilities with agentic reinforcement learning on verifiable tasks, resulting in Agent Foundation Models (AFMs). These AFMs achieve new state-of-the-art results across diverse benchmarks in both the web agent and code agent domains. The open-sourcing of model weights, training code, evaluation tools, and datasets provides a comprehensive foundation for ongoing research on agent models and agentic reinforcement learning.
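The two training stages can be pictured with a simple sketch. The data shapes, field names, and exact-match check below are assumptions for illustration only: a distilled multi-agent run becomes a prompt/completion pair for supervised fine-tuning, and tasks with checkable answers supply a binary reward for the reinforcement-learning stage.

```python
# Sketch of the two assumed training stages: distilled trajectories for SFT
# and a simple verifiable reward for agentic RL. All field names are hypothetical.
from dataclasses import dataclass

@dataclass
class Trajectory:
    task: str     # original user query
    trace: str    # full chain-of-agents rollout (roles, tool calls, observations)
    answer: str   # final answer extracted from the trace

def to_sft_example(traj: Trajectory) -> dict:
    """Convert a distilled multi-agent trajectory into a supervised pair."""
    return {"prompt": traj.task, "completion": traj.trace}

def verifiable_reward(traj: Trajectory, gold: str) -> float:
    """Binary reward on tasks with checkable answers (e.g., QA or unit tests)."""
    return 1.0 if traj.answer.strip() == gold.strip() else 0.0

traj = Trajectory(task="What is 6 * 7?",
                  trace="<think>use calculator</think><answer>42</answer>",
                  answer="42")
print(to_sft_example(traj))
print(verifiable_reward(traj, "42"))
```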
This work has significant implications for the future of LLM-based multi-agent systems, offering a more efficient, scalable, and data-driven alternative to manual prompt engineering. The end-to-end nature of CoA models could streamline complex problem-solving workflows and improve adaptability across various domains. Researchers and practitioners are encouraged to build upon this foundation to explore further enhancements in agent collaboration, learning efficiency, and application breadth.
👉 Read the original: arXiv AI Papers