MetaGPT

Motivation

  • Existing LLM-based multi-agent work focuses primarily on simple dialogue tasks; complex tasks are rarely studied
  • Such work oversimplifies the complexities inherent in real-world applications
  • Existing systems rely mainly on conversational and tool-based interactions to collaborate and lack clear guidelines
  • LLMs sometimes hallucinate; when multiple agents are naively chained, these errors cascade and the complex task fails

Innovation

  • Knowledge Sharing Mechanism & Customized Knowledge Management

  • Encodes SOPs into prompts to enhance structured coordination

    • SOPs act as a meta-function, taking the team and requirements as inputs to synthesize the target code
    • These SOPs are encoded into the agent architecture using role-based action specifications
  • Role Definitions

    • A shared environment connects agents, enabling them to collaborate, access tools, and share resources
    • Set standards for component outputs
    • Divide a task into smaller tasks, and assign each task to an agent with corresponding capabilities
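
The idea of an SOP acting as a meta-function over a team and a requirement can be sketched roughly as follows. This is only an illustration: `Agent`, `run_sop`, the role names, and the lambda stubs standing in for LLM calls are all assumptions made for the example, not MetaGPT's actual classes.

```python
from dataclasses import dataclass
from typing import Callable, List


# Hypothetical type for illustration; not MetaGPT's actual class.
@dataclass
class Agent:
    role: str
    act: Callable[[str], str]  # turns the upstream artifact into this role's artifact


def run_sop(team: List[Agent], requirement: str) -> str:
    """Pass the requirement through each role in SOP order."""
    artifact = requirement
    for agent in team:
        artifact = agent.act(artifact)
    return artifact


# Stub actions stand in for LLM calls (assumption for the sketch).
team = [
    Agent("ProductManager", lambda req: f"PRD({req})"),
    Agent("Architect", lambda prd: f"Design({prd})"),
    Agent("Engineer", lambda design: f"Code({design})"),
]

result = run_sop(team, "snake game")
print(result)  # Code(Design(PRD(snake game)))
```

Each role only sees the standardized artifact produced by the previous role, which is what lets the task be divided and assigned by capability.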

Framework

  • Foundational Components Layer
    • necessary skills for individual agent operations
    • system-wide information exchange
  • Collaboration Layer
    • Knowledge Sharing
    • Encapsulating Workflows
      • break down complex tasks into smaller, manageable components
      • assign these sub-tasks to suitable agents
      • standardize outputs

Detail

Role Definitions
  • Role: You are a [profile], named [name], your goal is [goal], and the constraint is [constraints]
  • Agents take on specialized roles and follow certain key behaviors and workflows:
    • think & reflect: retrieve role descriptions to frame thinking, reflect on what needs to be done, and decide the next actions
    • observe: watch for important information and incorporate it into memory to enrich contextual understanding and inform future decisions
    • broadcast messages: publish messages into the shared environment
    • knowledge precipitation & act: agents are also recipients of information from their environment and act on what they receive
    • state management: agents track their actions by updating their working status and monitoring a to-do list
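
The role template and the observe / state-management behaviors above could be sketched like this. The `Role` class, its field names, and the example values are hypothetical and chosen only to mirror the bullet points; they are not taken from the MetaGPT code base.

```python
from dataclasses import dataclass, field
from typing import List, Optional

ROLE_TEMPLATE = (
    "You are a {profile}, named {name}, your goal is {goal}, "
    "and the constraint is {constraints}"
)


# Hypothetical class for illustration only.
@dataclass
class Role:
    profile: str
    name: str
    goal: str
    constraints: str
    memory: List[str] = field(default_factory=list)  # observed messages
    todo: List[str] = field(default_factory=list)    # pending actions

    def prefix(self) -> str:
        """Fill the role template to establish the role context."""
        return ROLE_TEMPLATE.format(
            profile=self.profile,
            name=self.name,
            goal=self.goal,
            constraints=self.constraints,
        )

    def observe(self, message: str) -> None:
        """Store an observed message in memory."""
        self.memory.append(message)

    def next_action(self) -> Optional[str]:
        """State management: take the next item off the to-do list."""
        return self.todo.pop(0) if self.todo else None


architect = Role(
    profile="software architect",
    name="Bob",
    goal="design a concise system",
    constraints="keep the design simple",
    todo=["write_design"],
)
architect.observe("PRD: build a snake game")
print(architect.prefix())
```
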
Prompts Instantiating SOPs

  • SOP prompts provide step-by-step guidance based on established practices and ensure consistent, structured execution of complex sequential tasks
  • key attributes:
    • prefix: a role-specific prefix is injected into the prompt to establish the role context
    • LLM proxy: the LLM is embedded in the Action class, which parses the context to extract relevant information and build a high-quality input prompt, so that the LLM generates well-targeted output
    • standardized output schema
    • instruct content: extracted from the action output using the standardized output schema
    • retry mechanism: defined by the number of attempts and the waiting time, so that failed Actions can be retried for robustness

In summary, each action in MetaGPT requires the definition of standardized output by encoding high-quality expert-level structural key points. The LLMs then refine the action based on the standardized output schema for the specific task.
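
A minimal sketch of how these attributes might fit together in one class is shown below. It assumes a JSON-based output schema expressed as a list of required keys and a plain callable standing in for the LLM; the class layout, parameter names, and `fake_llm` are illustrative assumptions, not MetaGPT's real `Action` API.

```python
import json
import time
from typing import Callable, Dict, List


# Hypothetical Action sketch; names and signature are assumptions.
class Action:
    def __init__(
        self,
        prefix: str,
        required_keys: List[str],
        llm: Callable[[str], str],
        max_retries: int = 3,
        wait_seconds: float = 0.0,
    ) -> None:
        self.prefix = prefix                # role-specific prefix
        self.required_keys = required_keys  # standardized output schema
        self.llm = llm                      # LLM proxy
        self.max_retries = max_retries
        self.wait_seconds = wait_seconds

    def _parse(self, raw: str) -> Dict[str, str]:
        """Extract the instruct content; raise ValueError if the schema is violated."""
        data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
        if not isinstance(data, dict):
            raise ValueError("output is not a JSON object")
        missing = [k for k in self.required_keys if k not in data]
        if missing:
            raise ValueError(f"missing keys: {missing}")
        return {k: data[k] for k in self.required_keys}

    def run(self, context: str) -> Dict[str, str]:
        prompt = (
            f"{self.prefix}\n{context}\n"
            f"Reply as JSON with keys: {', '.join(self.required_keys)}"
        )
        last_error = None
        for _ in range(self.max_retries):
            try:
                return self._parse(self.llm(prompt))
            except ValueError as exc:
                last_error = exc
                time.sleep(self.wait_seconds)  # wait before the next attempt
        raise RuntimeError(
            f"no valid output after {self.max_retries} attempts"
        ) from last_error


# Fake LLM: the first reply breaks the schema, the second one satisfies it.
replies = iter(["not json", '{"file_list": "main.py", "api": "run()"}'])


def fake_llm(prompt: str) -> str:
    return next(replies)


action = Action("You are an Architect.", ["file_list", "api"], fake_llm)
instruct_content = action.run("Design a snake game.")
print(instruct_content)  # {'file_list': 'main.py', 'api': 'run()'}
```

The retry loop only accepts output that passes the schema check, so downstream roles never receive a malformed artifact.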

Actions for Standardized Outputs

#Claude:

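In MetaGPT, Actions for Standardized Outputs refers to using Actions to produce standardized outputs.

It is implemented mainly as follows:

Define a set of standardized output formats, such as Q&A, summaries, and recommendations. Each format has explicit structural requirements.

Design a corresponding Action class for each standardized output format. The Action class contains the code that produces that kind of output.

Embed an LLM (large language model) in the Action class for text generation. The required text output is produced by calling the LLM.

Implement context parsing in the Action class: extract information from the input, convert it into a suitable prompt, and pass it to the embedded LLM to guide it toward the standardized output.

Implement post-processing in the Action class as well, such as checking whether the output conforms to the standard format and correcting it if it does not.

At use time, call the Action that matches the required output to obtain a normalized result.

In short, by designing dedicated Action classes that encapsulate the entire process of generating standardized outputs, MetaGPT supports multiple output formats. This makes outputs more uniform and controllable and also strengthens MetaGPT's modularity.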
  • The effectiveness of MetaGPT’s instantiated workflows relies heavily on the standardized outputs of each action because:
    • They promote consistent LLM results that are predictable, repeatable, and in line with agent responsibilities
    • They constrain LLM behavior within appropriate boundaries for the role (why?)
      • #Claude
        
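        LLMs drift off topic mainly for the following reasons:
        
        The prompt is insufficient: without enough context, the LLM is not clear about the current topic and requirements.
        
        The prompt is overly redundant: important information is buried under irrelevant content, and the LLM cannot pick out the key points.
        
        The prompt is semantically unclear: the wording is ambiguous, and the LLM cannot determine the real intent.
        
        Limitations of the LLM architecture itself: weak semantic understanding and logical reasoning over long passages.
        
        Bias in the training objective: what the LLM is trained to generate does not fully match the requirements of real applications.
        
        Using a standardized schema can effectively address these problems:
        
        A standardized schema explicitly defines the prompt's structure, length, and required key information, which reduces ambiguity.
        
        A standardized schema filters out irrelevant information so the LLM stays focused.
        
        Through its structured design, a standardized schema makes the prompt's semantics clearer.
        
        A standardized schema simplifies the semantic understanding required and lowers the demands on the LLM's capabilities.
        
        A standardized schema maps directly to the real application goal, which helps steer the LLM toward output that meets the requirements.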
        

Knowledge Sharing Mechanism & Customized Knowledge Management

  • agents utilize role-based interests to extract relevant information
  • the centralized replication of messages creates a unified data source. Agents can register subscriptions to automatically receive role-relevant messages from this source.
  • To be specific, the details of knowledge sharing can be summarized as follows:
    • Message sharing
    • Role-based subscriptions
    • Message dispatch
    • Memory caching and indexing
    • Contextual retrieval
    • Updates synchronization
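
The publish/subscribe pattern described above can be sketched as a small shared environment. This is a simplified illustration under assumptions: `Message`, `Environment`, topic-based subscriptions, and keyword lookup as a stand-in for contextual retrieval are all hypothetical names and choices, not MetaGPT's implementation.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Message:
    topic: str
    sender: str
    content: str


# Hypothetical shared environment for illustration only.
class Environment:
    def __init__(self) -> None:
        self.history: List[Message] = []                          # unified data source
        self.subscribers: Dict[str, List[str]] = defaultdict(list)
        self.inboxes: Dict[str, List[Message]] = defaultdict(list)  # per-role memory cache

    def subscribe(self, role: str, topic: str) -> None:
        """Register a role-based subscription."""
        self.subscribers[topic].append(role)

    def publish(self, message: Message) -> None:
        """Record the message centrally, then dispatch it to subscribed roles."""
        self.history.append(message)
        for role in self.subscribers.get(message.topic, []):
            self.inboxes[role].append(message)

    def retrieve(self, role: str, keyword: str) -> List[Message]:
        """Naive contextual retrieval: keyword match over the role's inbox."""
        return [m for m in self.inboxes.get(role, []) if keyword in m.content]


env = Environment()
env.subscribe("Engineer", "design")
env.publish(Message("prd", "ProductManager", "User stories for a snake game"))
env.publish(Message("design", "Architect", "API: run() starts the game loop"))

hits = env.retrieve("Engineer", "API")
print(len(env.history), len(env.inboxes["Engineer"]), hits[0].sender)  # 2 1 Architect
```

The Engineer receives only the design message it subscribed to, while the full history remains available in one place.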

Contribution

  • New state-of-the-art performance on the HumanEval and MBPP benchmarks, demonstrated through rigorous experiments
  • Extensive quantitative and qualitative analyses validate the effectiveness of MetaGPT for multi-agent programming and complex task resolution
  • The results indicate that MetaGPT has the potential to address hallucination issues in LLMs, thereby guiding collaborative LLM systems toward more effective designs
