Python – 迭代與生成器（Iteration & Generators）

主題：`itertools` 模組

簡介

在日常的資料處理、演算法實作或是大型系統的效能優化中，迭代是最常見的操作之一。Python 內建的 for 迴圈已經相當好用，但當需要同時操作多個可疊代物件、產生無窮序列或是執行高效的組合運算時，單純的 for 已顯得力不從心。
itertools 模組正是為了解決這類需求而設計的，它提供了一系列 記憶體友善、延遲求值（lazy evaluation）的工具，讓開發者可以以宣告式的方式組合、過濾、切割甚至產生無限長度的序列。

掌握 itertools 不僅能讓程式碼更簡潔、可讀，也能顯著降低記憶體佔用，對於 初學者想寫出「Pythonic」的程式，以及 中級開發者在處理大資料或演算法時的效能瓶頸，都是必備的利器。

核心概念

itertools 的設計哲學是「使用迭代器而非容器」。所有函式都回傳 迭代器（iterator），因此不會一次性產生完整的列表，而是根據需求逐項產生元素。下面分別介紹幾個最常用的子模組，並給予實作範例。

1. 基本的迭代器生成

函式	功能	範例
`count(start=0, step=1)`	從 `start` 開始不斷遞增，產生無限序列	`python\nfrom itertools import count\nfor i in count(5, 2):\n if i > 15: break\n print(i) # 5, 7, 9, 11, 13, 15\n`
`cycle(iterable)`	無限循環遍歷給定的可疊代物件	`python\nfrom itertools import cycle\ncolors = cycle(['red', 'green', 'blue'])\nfor _ in range(7):\n print(next(colors))\n`
`repeat(object, times=None)`	重複同一個物件 `times` 次，若不指定則無限重複	`python\nfrom itertools import repeat\nlist(repeat('A', 4)) # ['A', 'A', 'A', 'A']\n`

小技巧：使用 islice（下方會介紹）配合無限迭代器，可安全取得有限的子集合。

2. 組合與排列

函式	功能	範例
`chain(*iterables)`	把多個可疊代物件串接成一個序列	`python\nfrom itertools import chain\nnums = chain([1,2], (3,4), '56')\nprint(list(nums)) # [1, 2, 3, 4, '5', '6']\n`
`compress(data, selectors)`	根據布林序列保留 `data` 中對應位置的元素	`python\nfrom itertools import compress\ndata = 'ABCDEF'\nmask = [1,0,1,0,1,0]\nprint(''.join(compress(data, mask))) # ACE\n`
`product(*iterables, repeat=1)`	笛卡爾積（Cartesian product）	`python\nfrom itertools import product\nprint(list(product('AB', range(2))))\n# [('A',0),('A',1),('B',0),('B',1)]\n`
`permutations(iterable, r=None)`	產生所有長度為 `r` 的排列	`python\nfrom itertools import permutations\nprint(list(permutations('ABC', 2)))\n# [('A','B'),('A','C'),('B','A'),('B','C'),('C','A'),('C','B')]\n`
`combinations(iterable, r)`	產生不重複的組合	`python\nfrom itertools import combinations\nprint(list(combinations('ABCD', 3)))\n# [('A','B','C'),('A','B','D'),('A','C','D'),('B','C','D')]\n`
`combinations_with_replacement(iterable, r)`	允許元素重複的組合	`python\nfrom itertools import combinations_with_replacement\nprint(list(combinations_with_replacement('AB', 2)))\n# [('A','A'),('A','B'),('B','B')]\n`

3. 篩選與切割

函式	功能	範例
`filterfalse(predicate, iterable)`	與 `filter` 相反，保留不符合條件的元素	`python\nfrom itertools import filterfalse\nnums = range(10)\nprint(list(filterfalse(lambda x: x%2, nums))) # 偶數 [0,2,4,6,8]\n`
`takewhile(predicate, iterable)`	只要條件為真就持續取元素，遇到第一個 False 即停止	`python\nfrom itertools import takewhile\nprint(list(takewhile(lambda x: x<5, range(10)))) # [0,1,2,3,4]\n`
`dropwhile(predicate, iterable)`	與 `takewhile` 相反，跳過條件為真的部分，之後全部保留	`python\nfrom itertools import dropwhile\nprint(list(dropwhile(lambda x: x<5, range(10)))) # [5,6,7,8,9]\n`
`islice(iterable, start, stop, step=1)`	像切片一樣取得迭代器的子集合，適用於無限序列	`python\nfrom itertools import islice, count\nprint(list(islice(count(0), 2, 10, 2))) # [2,4,6,8]\n`
`starmap(function, iterable)`	把可疊代物件的每個元素（預期為 tuple）展開作為 `function` 的參數	`python\nfrom itertools import starmap\npairs = [(2,3), (4,5), (6,7)]\nprint(list(starmap(pow, pairs))) # [8, 1024, 279936]\n`

4. 效率提升小技巧

以下示範 結合多個工具，完成一個常見的需求：找出 1~1000 中所有符合條件的質數，且只保留前 10 個。

from itertools import islice, count, filterfalse

def is_prime(n):
    if n < 2:
        return False
    # 只需要檢查到 sqrt(n)
    for i in range(2, int(n**0.5)+1):
        if n % i == 0:
            return False
    return True

primes = filterfalse(lambda x: not is_prime(x), count(2))
first_ten = list(islice(primes, 10))
print(first_ten)   # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]

count(2) 產生無限自然數序列。
filterfalse 只保留質數。
islice 把結果切成前 10 個，避免無限迭代耗盡資源。

常見陷阱與最佳實踐

陷阱	說明	建議的做法
忘記迭代器只能消耗一次	多次呼叫同一迭代器會得到空結果。	若需要重用，先 `list()` 或使用 `tee`（`itertools.tee`）產生獨立的迭代器。
無意間產生無限迭代	`count()`、`cycle()`、`repeat()` 若未搭配 `islice` 會導致程式卡住。	永遠用 `islice`、`takewhile` 或其他限制條件把無限序列「截斷」後再使用。
過度使用 `list()` 失去記憶體優勢	把大型迭代器一次性轉成列表會佔用大量記憶體，失去 `itertools` 的好處。	僅在真的需要隨機存取時才轉換；否則保持迭代器狀態，使用 `for` 或 `next()` 逐項處理。
*`chain` 與 `` 解包的混用**	`chain(*list_of_iterables)` 與 `chain.from_iterable(list_of_iterables)` 效能略有差異。	若可接受 Python 3.5+，直接使用 `chain.from_iterable`，可避免一次性建立中間列表。
`combinations`、`permutations` 在大集合上爆炸	輸出數量呈指數增長，容易造成記憶體或時間瓶頸。	先估算組合數量，或使用 `islice` 只取需要的前幾個結果。

最佳實踐：

懶惰求值：盡量保留迭代器形態，讓資料流在需要時才產生。
組合工具：善用 islice、takewhile、filterfalse 等函式，組成「管線」式的資料處理流程。
可讀性：即使一行可以完成，仍建議拆成多行、加入註解，讓後續維護者容易理解。
性能測試：對於大量資料或高頻呼叫，使用 timeit 或 cProfile 確認 itertools 的效能真的優於手寫迴圈。

實際應用場景

場景	為什麼使用 `itertools`	範例程式
日誌檔案的分批處理	大檔案一次讀入會耗盡記憶體，使用 `islice` 每次取 10,000 行。	`python\nfrom itertools import islice\n\ndef batch_read(path, size=10000):\n with open(path, encoding='utf-8') as f:\n while True:\n batch = list(islice(f, size))\n if not batch:\n break\n yield batch\n`
生成所有可能的密碼組合（小型字典攻擊）	`product` 可快速產生字元的笛卡爾積，配合 `join` 產生字串。	`python\nfrom itertools import product\ncharset = 'abc123'\nfor pwd in map(''.join, product(charset, repeat=4)):\n check(pwd) # 假設有檢查函式\n`
統計資料的滑動視窗（Moving Window）	`tee` 與 `islice` 可同時維持多個指標，實作 O(N) 的滑動平均。	`python\nfrom itertools import tee, islice\n\ndef moving_average(iterable, n):\n it1, it2 = tee(iterable)\n window = list(islice(it1, n))\n sum_win = sum(window)\n yield sum_win / n\n for x in it2:\n sum_win += x - window.pop(0)\n window.append(x)\n yield sum_win / n\n`
資料庫查詢結果的分頁	把資料庫的 cursor（本身是迭代器）與 `islice` 結合，可直接產出第 N 頁。	`python\nfrom itertools import islice\n\ndef paginate(cursor, page, per_page):\n start = (page-1)*per_page\n end = start + per_page\n return list(islice(cursor, start, end))\n`
多條件過濾	`filterfalse` + `compress` 把布林遮罩與資料串聯，寫出簡潔的篩選邏輯。	`python\nfrom itertools import compress\nscores = [78, 85, 62, 90, 55]\npass_mask = [s >= 60 for s in scores]\nprint(list(compress(scores, pass_mask))) # [78,85,62,90]\n`

總結

itertools 是 Python 標準函式庫中最具「魔法」色彩的模組之一。透過 迭代器 的懶惰求值，我們可以：

節省記憶體：無需一次性產生完整序列。
提升可讀性：用宣告式的組合、過濾、切割表達複雜的資料流。
加速開發：大量常見演算法（組合、排列、笛卡爾積）已經被最佳化實作。

在日常開發、資料科學或是系統程式設計中，只要遇到「需要遍歷、組合、過濾」的情境，就應該先考慮 itertools。掌握它的核心概念、常見陷阱與最佳實踐，將使你的 Python 程式碼更 簡潔、效能更佳、且更易於維護。祝你玩得開心，寫出更 Pythonic 的程式！