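LangChain Advanced Topics: LLM Orchestration Patterns

Introduction

Large language models (LLMs) keep getting better at understanding and generating natural language, yet a single model call is rarely enough to meet complex business requirements. LLM orchestration combines multiple LLMs, tools, and data sources in different patterns so that one workflow can complete multi-step tasks such as "decide → call a tool → generate a reply".

In LangChain, abstractions such as Chain, Agent, and Router let us build these patterns quickly, with far less hand-written plumbing for prompts, call ordering, and error handling. Knowing the common patterns lets developers extend LLMs to practical scenarios such as customer service, document search, and decision support, while keeping the system maintainable and flexible.

This article starts from the core concepts, demonstrates 5 common orchestration patterns, and then covers the pitfalls and best practices you are likely to meet in practice, so that you can apply them quickly in your own projects.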

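Core Concepts

1. Chain (sequential composition)

A chain is the most basic orchestration unit: it runs several LLM calls or tool calls in order. Each step's output becomes an input to the next step, forming a pipeline.

1.1 Typical example: SequentialChain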

// npm install langchain openai
// Note: these import paths follow the legacy langchain 0.0.x package layout.
import { OpenAI } from "langchain/llms/openai";
import { PromptTemplate } from "langchain/prompts";
import { LLMChain, SequentialChain } from "langchain/chains";

// 1. Create two LLMChains
const summaryPrompt = PromptTemplate.fromTemplate(
  "Summarize the following text in Chinese:\n{input}"
);
const translatePrompt = PromptTemplate.fromTemplate(
  "Translate the following Chinese summary into English:\n{summary}"
);

const llm = new OpenAI({ temperature: 0 });

const summaryChain = new LLMChain({
  llm,
  prompt: summaryPrompt,
  outputKey: "summary",
});

const translateChain = new LLMChain({
  llm,
  prompt: translatePrompt,
  outputKey: "translation",
});

// 2. Use SequentialChain to connect the two chains
const pipeline = new SequentialChain({
  chains: [summaryChain, translateChain],
  inputVariables: ["input"],   // variables accepted by the outermost chain
  outputVariables: ["summary", "translation"],
});

// 3. Run. Use call() rather than run(): run() only supports a single
//    output key, and this pipeline returns two.
const result = await pipeline.call({
  input: "LangChain is a framework for building LLM applications..."
});
console.log(result);
/*
Shape of the result:
{
  summary: "...",
  translation: "..."
}
*/

Key point: SequentialChain only handles ordering. For parallel execution or conditional branching, see the parallel-execution and routing sections below.
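
To make the key passing in a sequential pipeline concrete, here is a small dependency-free sketch of the idea. It is not LangChain's implementation: `Step`, `runSequential`, and the two stub steps are hypothetical names, and the stubs only manipulate strings instead of calling an LLM.

```typescript
// Conceptual sketch only: mimics how each step's outputKey is merged into a
// shared values object so that later steps can read it. No LLM is called.
type Values = Record<string, string>;

interface Step {
  inputKey: string;               // key this step reads
  outputKey: string;              // key this step writes
  run: (text: string) => string;  // stand-in for an LLM call
}

function runSequential(steps: Step[], initial: Values): Values {
  let values: Values = { ...initial };
  for (const step of steps) {
    const input = values[step.inputKey];
    if (input === undefined) {
      throw new Error(`Missing input key "${step.inputKey}"`);
    }
    // Merge the new output so every later step can see it.
    values = { ...values, [step.outputKey]: step.run(input) };
  }
  return values;
}

// Stub steps (hypothetical): a "summary" that keeps the first sentence,
// and a "translation" that simply upper-cases the summary.
const firstSentence: Step = {
  inputKey: "input",
  outputKey: "summary",
  run: (text) => text.split(".")[0] + ".",
};
const upperCase: Step = {
  inputKey: "summary",
  outputKey: "translation",
  run: (text) => text.toUpperCase(),
};

const pipelineDemo = runSequential([firstSentence, upperCase], {
  input: "LangChain is a framework. It has many parts.",
});
console.log(pipelineDemo.summary);      // "LangChain is a framework."
console.log(pipelineDemo.translation);  // "LANGCHAIN IS A FRAMEWORK."
```

The second step never sees the original `input` directly; it reads `summary`, which exists only because the first step wrote it. Reordering the two steps would raise the missing-key error.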

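2. Parallel execution

Sometimes we want to request different pieces of information from the LLM at the same time and merge the results afterwards. Running independent branches concurrently brings the total latency down to roughly that of the slowest branch. The `langchain/chains` module does not export a `ParallelChain` class, so the example below fans out with `Promise.all` over ordinary `LLMChain` calls.

// Reuses llm, LLMChain, and PromptTemplate from the SequentialChain example.

const sentimentPrompt = PromptTemplate.fromTemplate(
  "Classify the sentiment of the following text (positive / negative):\n{input}"
);
const keywordPrompt = PromptTemplate.fromTemplate(
  "List 5 keywords from the following text:\n{input}"
);

const sentimentChain = new LLMChain({
  llm,
  prompt: sentimentPrompt,
  outputKey: "sentiment",
});
const keywordChain = new LLMChain({
  llm,
  prompt: keywordPrompt,
  outputKey: "keywords",
});

// Start both calls before awaiting either one, so they run concurrently.
const input = "This book is wonderfully written. I absolutely love it!";
const [sentimentRes, keywordRes] = await Promise.all([
  sentimentChain.call({ input }),
  keywordChain.call({ input }),
]);

// Merge the two branch results into one object.
const res = {
  sentiment: sentimentRes.sentiment,
  keywords: keywordRes.keywords,
};
console.log(res);
/*
Example output (actual model output will vary):
{
  sentiment: "positive",
  keywords: "book, wonderful, love, reading, author"
}
*/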

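Tip: when branch latencies differ widely, the total time is set by the slowest branch. Lowering maxTokens on that branch's model shortens its generation time (temperature does not affect latency), and showing the faster branches' results as soon as they arrive improves the user experience.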

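3. RouterChain (conditional routing)

A router chain uses an LLM's judgment to send each request to one of several sub-chains. It is common in customer-service triage and multi-model selection. LangChainJS ships router helpers such as `MultiPromptChain`; the example below dispatches by hand with plain `LLMChain`s so that every step is visible.

// Reuses llm, LLMChain, and PromptTemplate from the SequentialChain example.

// 1. Define the chain that decides the route
const routePrompt = PromptTemplate.fromTemplate(
  `Classify the user's question below. Reply with exactly one keyword (tech / finance / general):
  {input}`
);
const routeChain = new LLMChain({
  llm,
  prompt: routePrompt,
  outputKey: "route",
});

// 2. Build a dedicated chain for each route
const techChain = new LLMChain({
  llm,
  prompt: PromptTemplate.fromTemplate("Answer this tech question:\n{input}"),
  outputKey: "tech_answer",
});
const financeChain = new LLMChain({
  llm,
  prompt: PromptTemplate.fromTemplate("Answer this finance question:\n{input}"),
  outputKey: "finance_answer",
});
const generalChain = new LLMChain({
  llm,
  prompt: PromptTemplate.fromTemplate("Answer this general question:\n{input}"),
  outputKey: "general_answer",
});

const routes = {
  tech: techChain,
  finance: financeChain,
  general: generalChain,
};

// 3. Ask the router, then dispatch. Unknown labels fall back to generalChain.
const input = "What is the time value of an option?";
const { route: rawRoute } = await routeChain.call({ input });
const route = rawRoute.trim().toLowerCase();
const target = routes[route] ?? generalChain;
const answer = { route, ...(await target.call({ input })) };
console.log(answer);
/*
Example output (actual model output will vary):
{
  route: "finance",
  finance_answer: "The time value of an option..."
}
*/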

Note: the routing decision is itself an LLM call. Keep its temperature low (for example 0) so that the decisions stay consistent.
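
Because the router's reply is free text, it can arrive with stray capitals, punctuation, or whitespace even at temperature 0. The sketch below shows one way to map a raw reply onto a known route before dispatching. `ROUTES`, `Route`, and `normalizeRoute` are hypothetical names introduced for illustration, not LangChain API, and falling back to "general" is an assumption about the desired behavior.

```typescript
// Hypothetical helper: maps a raw router reply onto a known route label.
const ROUTES = ["tech", "finance", "general"] as const;
type Route = (typeof ROUTES)[number];

function normalizeRoute(raw: string, fallback: Route = "general"): Route {
  // Lower-case and keep only letters, so "Finance.", " FINANCE\n",
  // and "finance" all compare equal.
  const cleaned = raw.toLowerCase().replace(/[^a-z]/g, "");
  const match = ROUTES.find((r) => r === cleaned);
  return match ?? fallback;
}

console.log(normalizeRoute(" Finance.\n"));                     // "finance"
console.log(normalizeRoute("I think this is about cooking"));   // "general"
```

The match is deliberately strict: a reply that contains extra words is treated as unknown and falls back, rather than being guessed at with a substring search.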

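4. Agent (tool-driven automation)

An agent can call external tools (APIs, databases, search engines) while it runs, then build its next prompt from what the tool returned. This forms a "think → act → observe" loop.

The example below uses LangChainJS's built-in OpenAI Functions agent (agentType "openai-functions") together with a weather API and a simple calculator.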

// npm install zod mathjs
import { initializeAgentExecutorWithOptions } from "langchain/agents";
import { ChatOpenAI } from "langchain/chat_models/openai";
import { DynamicStructuredTool } from "langchain/tools";
import { z } from "zod";

// 1. Define the tools the agent may call.
//    Tools must be Tool instances; DynamicStructuredTool pairs a zod schema
//    with the function to run.
const weatherTool = new DynamicStructuredTool({
  name: "get_weather",
  description: "Get the current weather for a given city",
  schema: z.object({
    city: z.string().describe("City name"),
  }),
  func: async ({ city }) => {
    // Assumes an external weather API exists at this placeholder URL
    const resp = await fetch(
      `https://api.example.com/weather?city=${encodeURIComponent(city)}`
    );
    const data = await resp.json();
    return `The current weather in ${city} is ${data.temp}°C, ${data.description}`;
  },
});

const calculatorTool = new DynamicStructuredTool({
  name: "calculate",
  description: "Evaluate a simple arithmetic expression",
  schema: z.object({
    expression: z
      .string()
      .describe("For example 2+3*4; only + - * / ( ) are allowed"),
  }),
  func: async ({ expression }) => {
    // mathjs parses the expression itself instead of using JavaScript's eval
    const { evaluate } = await import("mathjs");
    return String(evaluate(expression));
  },
});

// 2. Create the agent
const model = new ChatOpenAI({ temperature: 0 });
const executor = await initializeAgentExecutorWithOptions(
  [weatherTool, calculatorTool],
  model,
  {
    agentType: "openai-functions",
    verbose: true,
  }
);

// 3. Test
const userQuery =
  "What is the weather in Taipei right now? Also, tell me the result of 15 divided by 3.";
const agentResult = await executor.invoke({ input: userQuery });
console.log(agentResult.output);
/*
Example output (actual wording will vary):
It is currently 22°C and raining in Taipei.
15 / 3 = 5
*/

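Key points:

Each tool must declare a name, a description, and a parameter schema; the model receives these in the OpenAI function-calling format.
The agent decides from the LLM's reasoning whether to call a tool, so no extra conditional code is needed.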
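
Since the agent chooses tool arguments on its own, a tool may want to check them before acting on them. The following is a minimal sketch, assuming the calculator should accept only digits, whitespace, decimal points, parentheses, and the four basic operators. `isSafeExpression`, `ALLOWED_EXPRESSION`, and `MAX_EXPRESSION_LENGTH` are hypothetical names, and the length cap of 100 is an arbitrary choice for illustration.

```typescript
// Hypothetical guard for a calculator tool: accept only digits, whitespace,
// decimal points, parentheses, and + - * / before evaluating anything.
const ALLOWED_EXPRESSION = /^[0-9+\-*\/().\s]+$/;
const MAX_EXPRESSION_LENGTH = 100; // arbitrary cap chosen for this sketch

function isSafeExpression(expression: string): boolean {
  return (
    expression.length > 0 &&
    expression.length <= MAX_EXPRESSION_LENGTH &&
    ALLOWED_EXPRESSION.test(expression)
  );
}

console.log(isSafeExpression("2+3*4"));           // true
console.log(isSafeExpression("(15 / 3) - 1.5"));  // true
console.log(isSafeExpression("process.exit(1)")); // false
```

A tool function could call such a guard first and return an error message to the agent when the check fails, which gives the model a chance to retry with a valid expression.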

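5. Conversational Retrieval Chain (retrieval-augmented conversation)

When answers must combine a vector database (such as Pinecone or Chroma) with an LLM, we use RetrievalQAChain or ConversationalRetrievalQAChain. The flow is roughly: user question → vector retrieval → fetch the relevant documents → the LLM generates the answer.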

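// Assumes the @pinecone-database/pinecone v1+ client is installed.
import { Pinecone } from "@pinecone-database/pinecone";
import { OpenAIEmbeddings } from "langchain/embeddings/openai";
import { PineconeStore } from "langchain/vectorstores/pinecone";
import { ConversationalRetrievalQAChain } from "langchain/chains";
import { ChatOpenAI } from "langchain/chat_models/openai";
import { BufferMemory } from "langchain/memory";

// 1. Connect to the vector store (assumes the documents are already indexed)
const embeddings = new OpenAIEmbeddings();
const pineconeClient = new Pinecone(); // reads credentials from environment variables
const vectorStore = await PineconeStore.fromExistingIndex(embeddings, {
  pineconeIndex: pineconeClient.Index("knowledge-base"),
  namespace: "product-manual",
});

// 2. Create memory so the conversation keeps its context
const memory = new BufferMemory({
  memoryKey: "chat_history",
  inputKey: "question", // which chain input to store
  outputKey: "text",    // which chain output to store; required because the
                        // chain also returns sourceDocuments
  returnMessages: true,
});

// 3. Build the Conversational Retrieval Chain
const qaChain = ConversationalRetrievalQAChain.fromLLM(
  new ChatOpenAI({ temperature: 0 }),
  vectorStore.asRetriever(4), // fetch the 4 most relevant documents each time
  {
    memory,
    returnSourceDocuments: true,
  }
);

// 4. Interaction demo. The answer is returned under the "text" key.
let response = await qaChain.invoke({ question: "What is this phone's water-resistance rating?" });
console.log(response.text);
console.log("Source documents:", response.sourceDocuments.map((d) => d.metadata.source));

response = await qaChain.invoke({ question: "And what is the maximum memory capacity it supports?" });
console.log(response.text); // the previous turn's chat history is included automatically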

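Practical advice: adding filters (such as time or language) or paging at the retrieval stage keeps too many documents from coming back at once and pushing the LLM input past its length limit.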
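
One simple way to keep retrieved text within the input limit is to stop adding documents once a token budget is used up. The sketch below is an illustration under stated assumptions: `RetrievedDoc`, `estimateTokens`, and `fitToBudget` are hypothetical names, the documents are assumed to arrive sorted by relevance, and the 4-characters-per-token estimate is a rough rule of thumb for English text rather than a real tokenizer.

```typescript
// Sketch: keep the highest-ranked retrieved documents that fit a token budget.
interface RetrievedDoc {
  pageContent: string;
  metadata: { source: string };
}

// Rough assumption: about 4 characters per token for English text.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function fitToBudget(docs: RetrievedDoc[], maxTokens: number): RetrievedDoc[] {
  const kept: RetrievedDoc[] = [];
  let used = 0;
  for (const doc of docs) {
    const cost = estimateTokens(doc.pageContent);
    if (used + cost > maxTokens) break; // stop at the first doc that would overflow
    kept.push(doc);
    used += cost;
  }
  return kept;
}

const retrieved: RetrievedDoc[] = [
  { pageContent: "a".repeat(400), metadata: { source: "manual-p1" } }, // ~100 tokens
  { pageContent: "b".repeat(800), metadata: { source: "manual-p2" } }, // ~200 tokens
  { pageContent: "c".repeat(400), metadata: { source: "manual-p3" } }, // ~100 tokens
];
console.log(fitToBudget(retrieved, 300).map((d) => d.metadata.source));
// ["manual-p1", "manual-p2"]
```

Stopping at the first overflow, instead of skipping ahead to smaller documents, preserves the relevance order at the cost of sometimes leaving part of the budget unused.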

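Common Pitfalls and Best Practices

Pitfall	Possible impact	Recommended practice
Token overflow	The LLM input exceeds the context limit, causing errors or truncation	- Trim or summarize long inputs before they reach the chain; - Cap the retrieved text with a token budget; - Set `maxTokens` on `ChatOpenAI` to bound the completion length.
Temperature set too high	Inconsistent output; routing decisions fail	- Use temperature = 0 for decision steps such as routing and tool calls; - Raise the temperature slightly only in the final generation step.
Insufficient tool-parameter validation	Invalid API calls, or even security risks	- Declare a schema for each tool's parameters so that inputs are validated before the tool runs; - Add allowlist or regex checks inside the tool implementation.
Memory is not persisted	Multi-turn conversations lose context	- For long-running services, persist the chat history held by `BufferMemory` to a store such as Redis or PostgreSQL; - Use `BufferWindowMemory` to cap the window size and reduce memory usage.
Over-parallelization	Many unnecessary API calls and rising cost	- Run branches in parallel only when the subtasks are truly independent; - Check whether each branch is needed, and prefer conditional routing over parallel calls when only one branch applies.
Ignored error handling	A single tool failure aborts the whole chain	- Wrap each tool in `try/catch`; - Route failures to a fallback branch that returns a default message.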
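
The memory row above combines two ideas: keep only a recent window of the conversation, and store it somewhere that outlives the process. The sketch below shows both in plain TypeScript. It does not use LangChain's memory classes; `ChatTurn`, `lastTurns`, `serializeHistory`, and `deserializeHistory` are hypothetical names, and the resulting string is assumed to be written to whichever store the service uses.

```typescript
interface ChatTurn {
  question: string;
  answer: string;
}

// Keep only the most recent k turns (a sliding window).
function lastTurns(history: ChatTurn[], k: number): ChatTurn[] {
  return k <= 0 ? [] : history.slice(-k);
}

// Serialize the windowed history to a string that a key-value store could hold.
function serializeHistory(history: ChatTurn[], k: number): string {
  return JSON.stringify(lastTurns(history, k));
}

// Restore the history, checking the stored shape before trusting it.
function deserializeHistory(serialized: string): ChatTurn[] {
  const parsed: unknown = JSON.parse(serialized);
  if (!Array.isArray(parsed)) {
    throw new Error("Stored history is not an array");
  }
  return parsed.map((turn) => ({
    question: String(turn.question),
    answer: String(turn.answer),
  }));
}

const chatHistory: ChatTurn[] = [
  { question: "q1", answer: "a1" },
  { question: "q2", answer: "a2" },
  { question: "q3", answer: "a3" },
];
const storedHistory = serializeHistory(chatHistory, 2);
console.log(deserializeHistory(storedHistory).map((t) => t.question));
// ["q2", "q3"]
```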

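Additional tips

Use verbose: true: turning on detailed logging during development shows each step's input and output as it happens, which makes problems quick to locate.
Encapsulate common prompts: extract frequently used prompts with PromptTemplate.fromTemplate so that they are easy to reuse and version.
Manage environment variables: keep LLM keys and vector-database credentials in .env and load them with dotenv rather than hard-coding them.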
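
For the environment-variable tip, it can help to fail fast at startup when a required setting is missing, instead of failing later inside a chain. A minimal sketch follows; `requireEnv` is a hypothetical helper, and `sampleEnv` stands in for `process.env` after dotenv has loaded the .env file, so that the example runs without any real credentials.

```typescript
// Hypothetical helper: read a required setting and fail fast if it is missing
// or blank.
function requireEnv(
  name: string,
  env: Record<string, string | undefined>
): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Stand-in for process.env; the values are placeholders, not real keys.
const sampleEnv: Record<string, string | undefined> = {
  OPENAI_API_KEY: "sk-test",
  PINECONE_API_KEY: "",
};

console.log(requireEnv("OPENAI_API_KEY", sampleEnv)); // "sk-test"
```

Calling `requireEnv("PINECONE_API_KEY", sampleEnv)` would throw, because a blank value is treated the same as a missing one.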

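Real-World Scenarios

Scenario	Orchestration pattern used	Why it fits
Customer-service bot	RouterChain + Agent (calls an order-lookup API)	Questions are routed by type, and the backend service is called only when needed, which reduces latency.
Document search + summarization	Conversational Retrieval Chain + SequentialChain (summarize → translate)	Relevant passages are retrieved first, then summarized, and the summary is handed to the translation chain for multilingual support.
Financial information analysis	Parallel execution (sentiment analysis, keyword extraction) + RouterChain (decides based on sentiment)	Sentiment and keywords are obtained at the same time, and the request is routed to a different investment-advice model depending on whether the sentiment is positive or negative.
Automated report generation	Agent (queries the database, computes) + SequentialChain (organize results → produce Markdown)	The Agent queries the database and computes metrics dynamically, then the LLM turns the results into readable report text.
Educational assistant	RouterChain (by subject) + Retrieval QA (course-material retrieval) + Agent (code execution)	Requests are routed by subject to different knowledge bases; when a code demonstration is needed, the Agent runs the code in a sandbox environment.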

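Summary

LLM orchestration is the core technique for making large language models work together with external tools, databases, and other models. With patterns such as sequential chains, parallel execution, RouterChain, Agent, and Conversational Retrieval Chain, developers can keep their code readable and maintainable while quickly building feature-rich AI applications with multi-turn context.

When implementing these patterns, pay close attention to token management, temperature settings, and tool-parameter validation, and make good use of memory and logging to improve stability. With these patterns and practices in hand, you can get the most out of LangChain and deliver flexible, reliable solutions for intelligent applications across industries.

Happy building! We look forward to seeing the next-generation AI product you create with LangChain! 🚀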