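LangChain Course – Unit: Agents (Autonomous Agents)

Topic: Basic Agent Concepts

Introduction

Now that LLMs (large language models) are widely used, a single question-and-answer exchange is often not enough for complex business requirements. An agent lets the model make its own decisions, choose tools, and iterate through multi-step tasks, in effect giving the LLM the ability to "think" and "act".

LangChain's Agent abstraction encapsulates the logic of "which tool to use when", so developers can focus on the task itself instead of orchestrating tool calls by hand every time. This speeds up development and makes the system more flexible and easier to maintain.

This article covers concepts, implementation, common pitfalls, and practical applications to explain how agents work at their core. It aims to get beginners started quickly and to give intermediate developers examples they can adapt directly.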

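Core Concepts

1. Components of an Agent

Component	Description
LLM (Large Language Model)	Understands and generates natural language; responsible for "thinking" and "deciding".
Tool	A concrete executable capability, such as a search API, a database query, or code execution.
Planner / Prompt	The instruction template given to the LLM; tells it when and how to call tools.
Executor	Parses the instruction the LLM returns, calls the corresponding Tool, and feeds the result back to the LLM.

Flow: LLM → (generates an instruction based on the Prompt) → Executor → Tool → (result returned) → LLM → final answer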

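2. What "Automated Decision-Making" Means

An agent is called "autonomous" because it makes its own decisions inside a "think → act → observe" loop, repeating the loop until it has enough information to answer. For example:

The user asks, "What is the weather trend in Taipei this year?"
The LLM decides it needs the weather API to answer, so it emits a call instruction.
The Executor calls the API and returns the result to the LLM.
The LLM then produces the final reply from that result.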
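
The loop above can be sketched without any framework. This is a minimal illustration, not LangChain's implementation: `scripted_llm` is a hypothetical stand-in for a real model, `fake_weather_api` returns canned text instead of calling a real service, and the `Action: tool[input]` format is an assumption made for this example.

```python
from typing import Callable, Dict, List

# Hypothetical tool: returns canned text instead of calling a real weather API.
def fake_weather_api(city: str) -> str:
    return f"{city}: warmer and wetter than last year"

TOOLS: Dict[str, Callable[[str], str]] = {"weather": fake_weather_api}

def scripted_llm(transcript: List[str]) -> str:
    """Stand-in for an LLM: request the tool first, then answer from the observation."""
    observations = [line for line in transcript if line.startswith("Observation: ")]
    if not observations:
        return "Action: weather[Taipei]"
    return "Final Answer: " + observations[-1][len("Observation: "):]

def run_agent(question: str, max_iterations: int = 5) -> str:
    transcript = [f"Question: {question}"]
    for _step in range(max_iterations):
        decision = scripted_llm(transcript)                    # think
        if decision.startswith("Final Answer: "):
            return decision[len("Final Answer: "):]
        # act: parse the assumed "Action: tool[input]" format and call the tool
        tool_name, _sep, rest = decision[len("Action: "):].partition("[")
        observation = TOOLS[tool_name](rest.rstrip("]"))
        transcript.append(decision)
        transcript.append(f"Observation: {observation}")       # observe
    return "Stopped: iteration limit reached"

print(run_agent("What is the weather trend in Taipei this year?"))
```

The `max_iterations` guard is what keeps a confused model from looping forever; a real executor applies the same idea.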

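3. Agent Types in LangChain

Type	Characteristics
Zero-Shot ReAct Agent	Interacts with the LLM through a plain-text "thought + action" format; suited to simple tasks.
Conversational ReAct Agent	Supports multi-turn dialogue and remembers context; suited to chatbots.
Structured Chat Agent	Uses structured (JSON) output so tool calls are safer and easier to parse.
Tool-Using Agent	Accepts any custom set of tools; the most flexible option.

The examples below focus on the Zero-Shot ReAct Agent and show how to quickly build an agent in LangChain that can call a search tool and a calculator tool.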

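Code Examples

1. Basic Environment Setup

# Install the LangChain and OpenAI packages
# (the examples use the classic initialize_agent API, which newer LangChain
#  releases deprecate, so you may need to pin an older version)
pip install langchain openai

# If you use Node.js, install the corresponding packages
npm install langchain openai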

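2. Defining Tools

from langchain.tools import BaseTool
import requests

class SearchTool(BaseTool):
    """A simple wrapper around a search API (e.g. Google Search)."""

    # Annotate the fields: some pydantic versions reject un-annotated overrides
    name: str = "search"
    description: str = "Searches the web. Input is a keyword query; output is a summary of the search results."

    def _run(self, query: str) -> str:
        # Demonstration only: api.example.com is a placeholder, use a real, licensed search API
        response = requests.get(
            "https://api.example.com/search",
            params={"q": query},  # let requests URL-encode the query
            timeout=10,
        )
        response.raise_for_status()
        data = response.json()
        # Return only the titles of the first 3 results
        return "\n".join(item["title"] for item in data["results"][:3])

    async def _arun(self, query: str) -> str:
        raise NotImplementedError("Async not implemented")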

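class CalculatorTool(BaseTool):
    """A simple arithmetic calculator."""

    name: str = "calculator"
    description: str = "Performs basic arithmetic: addition, subtraction, multiplication, and division."

    def _run(self, expression: str) -> str:
        # Whitelist the allowed characters before calling eval
        allowed = "0123456789+-*/(). "
        if any(c not in allowed for c in expression):
            return "Invalid characters in expression."
        # "**" passes the whitelist but lets eval compute enormous powers
        if "**" in expression:
            return "Exponentiation is not supported."
        try:
            result = eval(expression, {"__builtins__": {}})
            return str(result)
        except Exception as e:
            return f"Error: {e}"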
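
A character whitelist in front of `eval` is easy to get subtly wrong. As an alternative, the sketch below walks the parsed syntax tree with Python's standard `ast` module and evaluates only the four arithmetic operators on numeric literals. The `safe_eval` name is hypothetical, and this is one possible approach rather than what LangChain ships.

```python
import ast
import operator

# Only these operators are evaluated; anything else is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}

def safe_eval(expression: str) -> float:
    """Evaluate +, -, *, / on numeric literals; raise ValueError for anything else."""
    def walk(node: ast.AST) -> float:
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](walk(node.operand))
        raise ValueError("Unsupported expression")

    return walk(ast.parse(expression, mode="eval"))

print(safe_eval("2 * (3 + 4)"))
```

Function calls, names, attribute access, and exponentiation all raise `ValueError` because they never match an allowed node type. Malformed input raises `SyntaxError` from `ast.parse`, so a tool wrapping this function would still catch exceptions and return an error string.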

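3. Building a Zero-Shot ReAct Agent

from langchain.chat_models import ChatOpenAI
from langchain.agents import initialize_agent, AgentType

# 1️⃣ Create the LLM. gpt-3.5-turbo is a chat model, so use ChatOpenAI rather than OpenAI
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)

# 2️⃣ Put the tools defined above into a list
tools = [SearchTool(), CalculatorTool()]

# 3️⃣ Initialize the agent
agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,          # print the thought/action trace for debugging
)

# 4️⃣ Try a question that needs a search followed by a calculation
question = "Find the total population of Taipei City in 2023, then divide it by the city's 12 districts to get the average population per district."
answer = agent.run(question)
print("\nFinal answer:", answer)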

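Notes:

initialize_agent builds the prompt automatically from each Tool's name and description, so the LLM knows which tool to use when.

verbose=True prints the LLM's reasoning trace (lines such as Thought: and Action:) to the console, which helps you follow the agent's decision path.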
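
To make the role of tool descriptions concrete, here is a rough sketch of how a ReAct-style prompt could be assembled from tool names and descriptions. The `build_react_prompt` function and the exact wording of the template are assumptions for illustration; LangChain's real template differs in detail.

```python
from typing import List, Tuple

def build_react_prompt(tools: List[Tuple[str, str]], question: str) -> str:
    """Render (name, description) pairs into a ReAct-style prompt (illustrative template)."""
    tool_lines = "\n".join(f"{name}: {description}" for name, description in tools)
    tool_names = ", ".join(name for name, _description in tools)
    return (
        "Answer the question using the tools below.\n\n"
        f"{tool_lines}\n\n"
        "Use this format:\n"
        "Thought: reason about what to do next\n"
        f"Action: one of [{tool_names}]\n"
        "Action Input: the input for the tool\n"
        "Observation: the tool result\n"
        "Final Answer: the answer to the question\n\n"
        f"Question: {question}\n"
        "Thought:"
    )

prompt = build_react_prompt(
    [
        ("search", "Searches the web for a keyword query."),
        ("calculator", "Performs basic arithmetic."),
    ],
    "What is 12 times the number of districts in Taipei?",
)
print(prompt)
```

Because the descriptions are pasted into the prompt verbatim, a vague description gives the model nothing to base its tool choice on.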

4. Using Structured Output (JSON) to Avoid Text-Parsing Errors

from langchain.chat_models import ChatOpenAI
from langchain.agents import initialize_agent, AgentType

# OPENAI_FUNCTIONS needs a chat model that supports function calling
chat_llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)

# The tool schemas are sent through the function-calling API, so no
# hand-written JSON prompt is needed. For models without function calling,
# AgentType.STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION is an alternative.
agent = initialize_agent(
    tools,
    chat_llm,
    agent=AgentType.OPENAI_FUNCTIONS,
    verbose=True,
)

question = "Describe today's weather in Taipei in one sentence, then calculate the difference between tomorrow's high temperature and today's."
print(agent.run(question))

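Key point: with OPENAI_FUNCTIONS (or a structured chat agent), the LLM returns the tool call as structured JSON rather than free text, which reduces the risk of mis-split or mis-parsed output.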
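
The benefit of structured output shows up on the parsing side: the executor can validate the model's reply before anything runs. The sketch below assumes a reply shaped as a JSON object with `action` and `action_input` keys; that schema and the `parse_action` helper are illustrative, not LangChain's internal parser.

```python
import json
from typing import Dict, Set

def parse_action(raw: str, allowed_tools: Set[str]) -> Dict[str, str]:
    """Parse and validate a JSON tool call; raise ValueError on any problem."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Not valid JSON: {exc}") from exc
    if not isinstance(payload, dict):
        raise ValueError("Expected a JSON object")
    action = payload.get("action")
    action_input = payload.get("action_input")
    if not isinstance(action, str) or action not in allowed_tools:
        raise ValueError(f"Unknown tool: {action!r}")
    if not isinstance(action_input, str):
        raise ValueError("action_input must be a string")
    return {"action": action, "action_input": action_input}

call = parse_action(
    '{"action": "calculator", "action_input": "31 - 28"}',
    allowed_tools={"search", "calculator"},
)
print(call)
```

A reply that names an unknown tool or is not valid JSON is rejected up front instead of being half-parsed from free text.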

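5. Multi-Turn Conversations with the Conversational ReAct Agent

from langchain.memory import ConversationBufferMemory

# memory_key must match the prompt variable this agent type expects ("chat_history")
memory = ConversationBufferMemory(memory_key="chat_history")
agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.CONVERSATIONAL_REACT_DESCRIPTION,
    memory=memory,
    verbose=True,
)

# First question
print(agent.run("List the three most recent technical blog posts about natural language processing."))

# Follow-up question that relies on the previous turn
print(agent.run("Who wrote the first of those posts?"))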

Note: ConversationBufferMemory stores the earlier turns of the conversation so the agent can refer to that context in later turns.
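
Conceptually, a buffer memory just accumulates the transcript and prepends it to each new prompt. The toy class below illustrates that idea only; `BufferMemory` and `build_prompt` are hypothetical names and do not mirror the ConversationBufferMemory API.

```python
from typing import List

class BufferMemory:
    """Toy conversation buffer: keeps every turn as plain text."""

    def __init__(self) -> None:
        self.turns: List[str] = []

    def save(self, user_input: str, ai_output: str) -> None:
        self.turns.append(f"Human: {user_input}")
        self.turns.append(f"AI: {ai_output}")

    def render(self) -> str:
        return "\n".join(self.turns)

def build_prompt(memory: BufferMemory, new_input: str) -> str:
    """Prepend the stored history so the model can resolve references to earlier turns."""
    return f"Previous conversation:\n{memory.render()}\n\nNew input: {new_input}"

memory = BufferMemory()
memory.save("List recent NLP blog posts.", "1. Post A by Lin  2. Post B by Chen")
print(build_prompt(memory, "Who wrote the first of those posts?"))
```

Since the whole history is resent on every turn, the prompt grows with the conversation, which is why long sessions usually need a windowed or summarizing memory instead.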

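Common Pitfalls and Best Practices

Pitfall	Description	Best practice
Vague tool descriptions	The LLM cannot tell which tool to use when.	Write a clear, specific `description` for every Tool, and include an example of how to use it.
LLM emits invalid instructions	For example, it passes source code as the input to the search tool.	Use structured output (JSON) or OpenAI Functions to constrain the response format.
Infinite loops	The agent can get stuck cycling through thought/action steps.	Set `max_iterations` when creating the agent (e.g. `initialize_agent(..., max_iterations=5)`), or check the stop condition.
Security issues	Calling `eval` directly on user input allows code injection.	Allow only safe characters, or use a sandbox or a third-party calculation service.
Too many API calls	Every reasoning step may trigger an external API call, driving up cost.	Add caching (e.g. `functools.lru_cache`) or rate limiting inside the Tool.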
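
As a small illustration of the caching advice, the sketch below wraps a stand-in search function with `functools.lru_cache` and counts how often the underlying call actually runs. `cached_search` is a hypothetical function; a real tool would call an external API where the counter is incremented.

```python
import functools

call_count = 0  # counts how many times the "expensive" call really runs

@functools.lru_cache(maxsize=128)
def cached_search(query: str) -> str:
    """Stand-in for an expensive external API call."""
    global call_count
    call_count += 1
    return f"results for {query!r}"

cached_search("taipei population")
cached_search("taipei population")   # served from the cache, no new call
cached_search("taipei weather")
print(call_count)  # prints 2
```

`lru_cache` keys on the exact arguments, so normalizing the query (trimming whitespace, lowercasing) before the call raises the hit rate. Cached results never expire on their own, which matters for time-sensitive data such as weather.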

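Other Recommendations

Use verbose=True during development to watch the agent's reasoning and locate errors quickly.
Wrap each tool in its own class and keep it to a single responsibility (SRP) so it is easy to test and reuse.
Combine the agent with a vector store (such as Pinecone or FAISS) so it can search a local knowledge base and depend less on external APIs.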
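
The idea behind a local knowledge-base lookup can be shown with a toy example. Real vector stores such as Pinecone or FAISS index learned embeddings; the sketch below substitutes simple word counts and cosine similarity so that it runs with the standard library alone. `DOCS`, `embed`, and `search_local` are hypothetical names.

```python
import math
from collections import Counter
from typing import List

# Tiny in-memory "knowledge base" (made-up sample documents).
DOCS = [
    "refund policy for damaged products",
    "shipping times for international orders",
    "how to reset your account password",
]

def embed(text: str) -> Counter:
    """Bag-of-words stand-in for a learned embedding."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def search_local(query: str, k: int = 1) -> List[str]:
    """Return the k documents most similar to the query."""
    query_vector = embed(query)
    ranked = sorted(DOCS, key=lambda doc: cosine(query_vector, embed(doc)), reverse=True)
    return ranked[:k]

print(search_local("reset password"))
```

Wrapped in a Tool, a function like `search_local` lets the agent answer from local documents before falling back to an external search.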

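Practical Application Scenarios

Scenario	Requirement	How an agent helps
Customer-service bot	Look up orders, product descriptions, and FAQs in real time.	With OrderLookupTool and KnowledgeBaseSearchTool, the agent decides on its own whether to query the database or call an external API.
Business intelligence gathering	Compile competitors' news and financial reports automatically every day.	The agent combines WebScraperTool and SummarizerTool to run the full "search → extract → summarize" pipeline.
Data-analysis assistant	Users give natural-language instructions asking for charts or statistical reports.	Through PandasTool and MatplotlibTool, the agent runs the code automatically and returns a link to the image.
Debugging assistant	Developers paste an error message and expect a suggested fix.	The agent combines StackOverflowSearchTool and CodeRunnerTool: it searches for related answers first, then runs sample code in a safe environment.
Education platform	Students ask math or science questions and the system must give step-by-step solutions.	The agent uses CalculatorTool and EquationSolverTool to show the derivation step by step.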

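Summary

Agents give LangChain a "think + act" loop that lets an LLM choose suitable tools and carry out multi-step tasks within a single conversation, solving complex problems that plain text generation cannot handle. Keep the following points in mind to put agents into a project quickly:

Define tools clearly (name + description) so the LLM can judge when to use each one.
Use structured output (JSON / OpenAI Functions) to avoid text-parsing errors.
Add memory and an iteration limit to prevent lost context and infinite loops.
Follow security and cost best practices (caching, rate limiting, input validation).

With the code examples and practical advice in this article, readers can build agents with autonomous decision-making for chatbots, information gathering, data analysis, and similar areas, bringing more automation and a better user experience to their products. Happy building, and we look forward to seeing the smarter applications you create with LangChain Agents!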