Agent Foundry

#86. Agent Consensus Protocol

Hard · Multi-Agent

The Problem

You have a single fact-checking agent that verifies claims. Since LLMs can be unreliable on factual questions, relying on one agent's answer is risky. Your task is to implement a consensus protocol where multiple agents independently verify the same claim, and a voting mechanism determines the final answer. This reduces the chance of a single agent's hallucination becoming the accepted answer.
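
To get a rough feel for why voting helps, the sketch below computes how often a majority of agents would be wrong if each agent erred independently with the same probability. Both the independence assumption and the 0.2 error rate are illustrative assumptions, not figures from this problem; agents backed by the same model tend to make correlated mistakes, so the real benefit is likely smaller.

```python
from math import comb


def majority_error(p: float, n: int = 3) -> float:
    """Probability that more than half of n agents are wrong.

    Assumes every agent errs independently with probability p and n is odd.
    """
    need = n // 2 + 1  # smallest number of wrong votes that forms a majority
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(need, n + 1))


single = 0.2  # hypothetical per-agent error rate, chosen only for illustration
print(f"one agent: {single:.3f}, majority of three: {majority_error(single):.3f}")
```

Under these assumptions, three agents with a 20% individual error rate produce a wrong majority about 10% of the time.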

Examples

Example 1

User input: Claim: "The Great Wall of China is visible from space with the naked eye."

Current (bad) output: A single agent answers TRUE (incorrectly) and that answer is accepted without question.

Expected (good) output: Three agents independently evaluate the claim. Agent 1 says FALSE, Agent 2 says FALSE, Agent 3 says TRUE. The consensus mechanism tallies the votes (2 FALSE vs 1 TRUE) and returns FALSE with the majority reasoning: "This is a common myth. Astronauts have confirmed it is not visible to the naked eye from low Earth orbit."

Example 2

User input: Claim: "Water boils at 100°C at sea level."

Current (bad) output: One agent answers TRUE, which happens to be correct, but nothing validates the answer.

Expected (good) output: All three agents agree on TRUE. The consensus mechanism returns TRUE as a unanimous result, with supporting reasoning.

Example 3

User input: Claim: "Lightning never strikes the same place twice."

Current (bad) output: A single agent may answer TRUE (incorrectly), and nothing challenges that answer.

Expected (good) output: Most agents answer FALSE. The consensus mechanism returns FALSE with the explanation that tall structures like the Empire State Building are struck many times per year.

Your Task

Refactor the starter code so that:

  • At least three agents independently evaluate the claim.
  • A voting or consensus mechanism aggregates the answers and selects the majority answer.
  • The system handles disagreements gracefully (unanimity is not required).
  • The final answer includes the consensus result and reasoning.
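
One possible shape for the aggregation step is sketched below. It is a minimal illustration in plain Python, not the reference solution: the `Vote` dataclass, the `majority_vote` function, and the `UNDECIDED` label for ties are all hypothetical names introduced here, and the sample votes mirror Example 1.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Vote:
    agent: str      # hypothetical label for the agent that cast the vote
    verdict: str    # "TRUE" or "FALSE"
    reasoning: str  # the agent's short explanation


def majority_vote(votes: list[Vote]) -> dict:
    """Return the most common verdict and the reasoning of agents who chose it."""
    if len(votes) < 3:
        raise ValueError("need at least three votes")
    tally = Counter(v.verdict for v in votes)
    (top, top_count), *rest = tally.most_common()
    if rest and rest[0][1] == top_count:
        # Tie: report it instead of falling back to any particular agent
        return {"verdict": "UNDECIDED", "tally": dict(tally),
                "unanimous": False, "reasoning": []}
    return {
        "verdict": top,
        "tally": dict(tally),
        "unanimous": top_count == len(votes),
        "reasoning": [v.reasoning for v in votes if v.verdict == top],
    }


votes = [
    Vote("agent_1", "FALSE", "This is a common myth."),
    Vote("agent_2", "FALSE", "Astronauts have confirmed it is not visible to the naked eye."),
    Vote("agent_3", "TRUE", "The wall is very long."),
]
result = majority_vote(votes)
print(result["verdict"], result["tally"], result["reasoning"])
```

Returning the tally alongside the verdict lets the final answer state how strong the agreement was, which covers both the unanimous and the split case.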

Evaluation

Submissions are checked for the following:

  • Multiple agents produce answers: At least three agents independently answer the same question.
  • Voting or consensus mechanism: A formal mechanism aggregates the agents' answers to determine the final result.
  • Handles disagreements: The system correctly resolves cases where agents disagree.
  • No first-agent bias: The first agent's answer is not automatically preferred over others.
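
The "no first-agent bias" criterion can be checked mechanically: the result should stay the same however the agents' answers are ordered. The sketch below assumes agents reply in free text that begins with TRUE or FALSE, as the starter task's expected output suggests. `parse_verdict` and `consensus` are hypothetical helpers, and the parsing is deliberately naive.

```python
import re
from collections import Counter
from itertools import permutations


def parse_verdict(text: str) -> str:
    """Return the first standalone TRUE/FALSE in the text, else 'UNKNOWN'.

    Naive by design: it assumes the agent states its verdict before any
    explanation that might itself contain the words "true" or "false".
    """
    match = re.search(r"\b(TRUE|FALSE)\b", text.upper())
    return match.group(1) if match else "UNKNOWN"


def consensus(answers: list[str]) -> str:
    """Most common parsed verdict; ties and all-unparseable input give UNDECIDED."""
    tally = Counter(parse_verdict(a) for a in answers)
    tally.pop("UNKNOWN", None)  # unparseable answers do not count as votes
    ranked = tally.most_common(2)
    if not ranked or (len(ranked) == 2 and ranked[0][1] == ranked[1][1]):
        # most_common breaks ties by insertion order, so detect ties explicitly
        return "UNDECIDED"
    return ranked[0][0]


answers = [
    "TRUE - it is long enough to see.",
    "FALSE. This is a common myth.",
    "False: astronauts report otherwise.",
]
# Every ordering of the same answers should produce the same verdict
results = {consensus(list(order)) for order in permutations(answers)}
print(results)
```

The explicit tie check matters because `Counter.most_common` lists equal counts in first-seen order, which would otherwise quietly favor whichever agent answered first.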

Constraints

  • At least three agents must independently produce an answer.
  • A voting or consensus mechanism must determine the final answer.
  • The system must handle disagreements (unanimity is not required).
  • The first agent's answer must not be automatically preferred.

Starter Code

from crewai import Agent, Crew, LLM, Process, Task

llm = LLM(model="gpt-4o-mini")

# BUG: Only one agent answers — its response is always used with no validation
# TODO: Add multiple agents and a consensus/voting mechanism
fact_checker = Agent(
    role="Fact Checker",
    goal="Verify factual claims and provide accurate answers",
    backstory="You are a careful fact-checker who validates information.",
    llm=llm,
)

verify_task = Task(
    description="Is the following claim true or false? Provide your answer and reasoning. Claim: '{claim}'",
    expected_output="TRUE or FALSE with a brief explanation",
    agent=fact_checker,
)

crew = Crew(
    agents=[fact_checker],
    tasks=[verify_task],
    process=Process.sequential,
)

result = crew.kickoff(inputs={"claim": "The Great Wall of China is visible from space with the naked eye"})
print(result)
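
One way to keep the agents independent is to give each of them only the claim and never another agent's answer. The sketch below shows that collection step in plain Python. The `Asker` type and `collect_answers` function are hypothetical, and the stub lambdas stand in for real agents. In a CrewAI solution, each asker could wrap its own single-agent crew and call `kickoff(inputs={"claim": claim})`, but that wiring is an assumption and is not shown here.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# Hypothetical type: a callable that takes a claim and returns an agent's raw answer
Asker = Callable[[str], str]


def collect_answers(claim: str, askers: list[Asker]) -> list[str]:
    """Ask every agent the same claim; each call receives only the claim."""
    if len(askers) < 3:
        raise ValueError("need at least three agents")
    with ThreadPoolExecutor(max_workers=len(askers)) as pool:
        # map() returns results in the order of `askers`, not completion order
        return list(pool.map(lambda ask: ask(claim), askers))


# Stub askers standing in for real agents (illustrative answers only)
stubs: list[Asker] = [
    lambda claim: "FALSE. This is a common myth.",
    lambda claim: "FALSE. Astronauts report it is not visible to the naked eye.",
    lambda claim: "TRUE. The wall is very long.",
]

claim = "The Great Wall of China is visible from space with the naked eye"
for answer in collect_answers(claim, stubs):
    print(answer)
```

The collected answers can then be passed to whatever voting mechanism you implement. Whether tasks placed in one sequential crew see each other's output is worth checking in the CrewAI documentation before relying on a single crew for independence.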