Hand-Rolling Claude Code (Part 2): Giving the Agent Task Planning

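In the last post we hand-rolled a minimal agent that can run commands and read and write files. It gets work done, but in a rather dumb way: the user says one thing, it does one step, and once that step is done it forgets what it was working on.

That's fine for simple tasks. Once a task gets a bit more complex, say "create a Flask project with user registration and login", the trouble starts: the model may finish registration and forget login, or overwrite its own earlier work partway through.

The reason is simple: it has no plan.

When Claude Code handles a complex task, it first writes a todo list, then works through it item by item, ticking each one off as it finishes. This isn't for show. It is the key mechanism that keeps an agent oriented during a multi-step task.

This post implements that capability.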

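Where the problem lies

Recall the agent loop from the last post:

User input → model thinks → calls a tool → gets the result → keeps thinking → …

The loop itself is fine, but the model's "thinking" depends entirely on the message history. As the conversation grows, the early intent gets diluted. By the 8th tool call, the model may no longer remember clearly what the user originally asked for.

How does a human programmer deal with this? Write a checklist and cross items off one by one. An agent can do the same.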

Designing a TodoPlanManager

The task manager doesn't need to be complicated. Three fields per task are enough:

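from typing import Literal, TypedDict

# total=False makes every key optional, so validation has to check that they are present.
class TodoItem(TypedDict, total=False):
    id: str
    text: str
    status: Literal["pending", "running", "completed"]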

Three statuses: pending (waiting), running (in progress), completed (done).

The manager itself is just a list plus a few constraints:

class TodoPlanManager:
    def __init__(self):
        # Current plan; replaced wholesale on every update() call.
        self.items: list[TodoItem] = []

    def update(self, items: list[TodoItem]) -> str:
        # _validate applies the two rules described in the next section.
        validated = self._validate(items)
        self.items = validated
        return self.render()

    def render(self) -> str:
        if not self.items:
            return "No todos."
        lines = []
        for item in self.items:
            # One checkbox-style marker per status.
            marker = {"pending": "[ ]", "running": "[>]", "completed": "[√]"}[item["status"]]
            lines.append(f"{marker} #{item['id']}: {item['text']}")
        # Progress summary, separated from the items by a blank line.
        done = sum(1 for t in self.items if t["status"] == "completed")
        lines.append(f"\n({done}/{len(self.items)} completed)")
        return "\n".join(lines)

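update takes the complete task list (not an incremental change), validates it, replaces the current state, and returns the rendered result. render formats the task list as human-readable text.

Why full replacement rather than incremental operations (say, separate add/remove/update interfaces)? Because for the model, emitting the whole list in one go is less error-prone than remembering "change item 3 to completed". Models are good at generating structured data and bad at remembering earlier state.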
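
To make the full-replacement idea concrete, here is a minimal sketch of two consecutive calls. MiniTodoStore is a hypothetical stand-in for TodoPlanManager with validation and rendering left out; its update returns the completed count purely so the effect is easy to see, which is not what the real update returns.

```python
from typing import Literal, TypedDict


class TodoItem(TypedDict):
    id: str
    text: str
    status: Literal["pending", "running", "completed"]


class MiniTodoStore:
    """Hypothetical stand-in for TodoPlanManager: keeps only the latest list."""

    def __init__(self) -> None:
        self.items: list[TodoItem] = []

    def update(self, items: list[TodoItem]) -> int:
        # Full replacement: whatever was stored before is discarded.
        self.items = list(items)
        return sum(1 for item in self.items if item["status"] == "completed")


store = MiniTodoStore()

# First call: the model lays out the plan and starts on task 1.
first_done = store.update([
    {"id": "1", "text": "Create hello.py", "status": "running"},
    {"id": "2", "text": "Run hello.py", "status": "pending"},
])

# Second call: the model resends the whole list with new statuses,
# rather than issuing a "mark #1 completed" patch.
second_done = store.update([
    {"id": "1", "text": "Create hello.py", "status": "completed"},
    {"id": "2", "text": "Run hello.py", "status": "running"},
])

print(first_done, second_done)
```

The store never has to reconcile a patch against old state, so a call that gets dropped or garbled is simply superseded by the next full list.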

Two constraints

The manager has two internal validation rules.

First, the total number of tasks cannot exceed 20:

MAX_ITEMS = 20

# Inside _validate:
if len(items) > MAX_ITEMS:
    raise ValueError(f"At most {MAX_ITEMS} tasks are supported, got {len(items)}")

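The number isn't arbitrary. The task list shows up in the message history as both tool-call arguments and return values; if it gets too long it eats a lot of tokens and pulls the model's attention away from everything else.

Second, only one task may be in the running state at any time: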

def _check_in_progress(self, items: list[dict]) -> None:
    # Collect every task currently marked as running.
    running_list = [item for item in items if item["status"] == "running"]
    if len(running_list) > 1:
        ids = [item["id"] for item in running_list]
        raise ValueError(f"Only one task can be running at a time, got: {ids}")

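This constraint forces the model to execute tasks serially. It looks like a limitation, but it actually helps the model: juggling several tasks in parallel is far too error-prone for today's LLMs.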
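
The manager calls self._validate(items), but the post only shows its two rules as fragments. Here is one way the pieces might fit together, written as a standalone function. The name validate_todos, the empty-text and unknown-status checks, and the fallback id are all my assumptions; only the item limit and the single-running rule come from the text above.

```python
MAX_ITEMS = 20
VALID_STATUSES = {"pending", "running", "completed"}


def validate_todos(items: list[dict]) -> list[dict]:
    """Return a cleaned copy of items or raise ValueError (hypothetical helper)."""
    # Rule 1 from the post: cap the list length.
    if len(items) > MAX_ITEMS:
        raise ValueError(f"At most {MAX_ITEMS} tasks are supported, got {len(items)}")

    cleaned = []
    for index, item in enumerate(items, start=1):
        text = str(item.get("text", "")).strip()
        status = item.get("status", "pending")
        # Assumption: reject empty text and unknown statuses early.
        if not text:
            raise ValueError(f"Item {index} has no text")
        if status not in VALID_STATUSES:
            raise ValueError(f"Item {index} has unknown status: {status!r}")
        # Assumption: fall back to the position when the model omits an id.
        cleaned.append({"id": str(item.get("id", index)), "text": text, "status": status})

    # Rule 2 from the post: at most one running task.
    running = [item["id"] for item in cleaned if item["status"] == "running"]
    if len(running) > 1:
        raise ValueError(f"Only one task can be running at a time, got: {running}")
    return cleaned
```

Returning a cleaned copy rather than the raw input means render can index id, text, and status without worrying about missing keys.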

Turning TodoPlanManager into a tool

With the manager in place, it still needs to be wrapped as a tool the model can call:

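from langchain_core.tools import tool  # assuming LangChain, as bind_tools and ToolMessage below suggest

todo_plan_manager = TodoPlanManager()  # one shared instance for the whole session

@tool
def update_todo(items: list[TodoItem]) -> str:
    """Update task list. Track progress on multi-step tasks."""
    return todo_plan_manager.update(items)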

Then tell the model how to use it in the system prompt:

import os
import platform

SYSTEM_PROMPT = f"""
You are a coding agent, {os.getcwd()} is your working directory, you can use bash tools.
Use a to-do tool to plan multi-step tasks. Mark as in progress before starting,
and mark as completed when finished.
Current runtime system is {platform.system()}
"""

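The key is this sentence: "Use a to-do tool to plan multi-step tasks. Mark as in progress before starting, and mark as completed when finished." This is an instruction, not a suggestion. Without it, the model most likely won't reach for the todo tool on its own.

A small improvement to tool registration

In the last post we maintained tools_dict by hand, so adding a tool meant changing two places (the list and the dict). This time we write it differently: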

ALL_TOOLS = [run_bash, run_read_file, run_write_file, run_edit_file, update_todo]
tools_dict = {t.name: t for t in ALL_TOOLS}

llm = llm.bind_tools(ALL_TOOLS)

One list does it all, and the dict is generated automatically. Adding a new tool only requires appending to ALL_TOOLS.
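
One thing to watch with a dict comprehension: if two tools ever share a name, the later one silently replaces the earlier one. The sketch below shows a stricter way to build the registry. FakeTool and build_registry are hypothetical names of mine, and FakeTool only mimics the .name attribute the comprehension relies on; it is not a real tool class.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class FakeTool:
    """Stand-in for a tool object: only the .name attribute matters here."""
    name: str
    func: Callable[..., str]


def build_registry(tools: list[FakeTool]) -> dict[str, FakeTool]:
    """Map tool names to tools, refusing duplicates instead of overwriting."""
    registry: dict[str, FakeTool] = {}
    for t in tools:
        if t.name in registry:
            raise ValueError(f"Duplicate tool name: {t.name}")
        registry[t.name] = t
    return registry


ALL_TOOLS = [
    FakeTool("run_bash", lambda command: f"ran {command}"),
    FakeTool("update_todo", lambda items: f"{len(items)} items"),
]
tools_dict = build_registry(ALL_TOOLS)
print(sorted(tools_dict))
```

For a handful of tools the plain comprehension is fine; the check only starts to pay off once the list grows or tools come from several modules.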

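Changes to the agent loop

The core loop is almost the same as in the last post, with one small change: when the model calls update_todo, the task list is printed in real time: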

# Keep going for as long as the model asks for tools.
while response.tool_calls:
    for tool_call in response.tool_calls:
        # Dispatch by name and hand the result back to the model.
        result = tools_dict[tool_call["name"]].invoke(tool_call["args"])
        messages.append(ToolMessage(
            content=result,
            tool_call_id=tool_call["id"],
        ))
        # New in this post: show the plan whenever it changes.
        if tool_call["name"] == "update_todo":
            print(f"\n📋 Todo Plan:\n{todo_plan_manager.render()}")
    response = llm.invoke(messages)
    messages.append(response)

That way the user can see how the agent is progressing instead of waiting with no idea what it is doing.
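
One caveat: the manager's rules raise ValueError, and as far as I can tell the loop above would let that exception escape and end the run. A common alternative is to turn the error into the tool result so the model can read it and resend a corrected list. Below is a minimal sketch with plain functions instead of real tool objects; run_tool_call is a hypothetical helper, and the update_todo here is a cut-down stand-in that only checks the single-running rule.

```python
from typing import Callable


def run_tool_call(tools: dict[str, Callable[..., str]], tool_call: dict) -> str:
    """Run one tool call and always return a string for the model (hypothetical helper)."""
    name = tool_call["name"]
    if name not in tools:
        return f"Error: unknown tool {name!r}"
    try:
        return str(tools[name](**tool_call["args"]))
    except Exception as exc:  # broad on purpose: any failure becomes feedback
        return f"Error: {exc}"


def update_todo(items: list[dict]) -> str:
    # Stand-in for the real tool: enforce only the single-running rule.
    running = [item["id"] for item in items if item["status"] == "running"]
    if len(running) > 1:
        raise ValueError(f"Only one task can be running at a time, got: {running}")
    return f"{len(items)} items saved"


tools = {"update_todo": update_todo}
bad_call = {"name": "update_todo", "args": {"items": [
    {"id": "1", "text": "a", "status": "running"},
    {"id": "2", "text": "b", "status": "running"},
]}}
print(run_tool_call(tools, bad_call))
```

In the real loop, the returned string would go into the ToolMessage content just like a successful result.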

What it looks like in practice

Suppose you type: "Create a hello.py with a hello world function in it, then run it".

The agent behaves roughly like this:

📋 Todo Plan:
[>] #1: Create hello.py with a hello world function
[ ] #2: Run hello.py to verify the result

(0/2 completed)

📋 Todo Plan:
[√] #1: Create hello.py with a hello world function
[>] #2: Run hello.py to verify the result

(1/2 completed)

📋 Todo Plan:
[√] #1: Create hello.py with a hello world function
[√] #2: Run hello.py to verify the result

(2/2 completed)

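The model first plans two steps, then executes them one at a time, updating the status after each. The user can see the progress the whole way through.

Looking back

Compared with the agent from the last post, the change this time is small: just a TodoPlanManager class and an update_todo tool. But the difference in behavior is significant:

The last post's agent was like an executor that only follows orders: you say a step, it does a step. This post's agent starts to show a sense of "planning": it first thinks through what needs to be done, then works through it in order.

Of course, this planning ability relies entirely on the model's own reasoning. If the model plans badly, a todo list won't save it. But with this mechanism in place, the model is at least less likely to lose its way during a multi-step task.

Next time we'll keep building on top of this, for example how Claude Code loads skills.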

https://blog.likegakki.com/archives/todoplan.html

Author: LOVEGAKKI

Published: 2026-03-01

Updated: 2026-03-01

License: CC BY 4.0