https://a1026302.hatenablog.com/entry/2025/07/11/125808

both

Code

import pandas as pd

df = pd.DataFrame({
    "_merge": ["both", "left_only", "both", "both"],
    "XXX_ID": [1, 2, 1, 3]
})

# Count of "both" rows, duplicates included
print(int((df["_merge"] == "both").sum()))  # → 3

# Count of "both" after deduplicating on XXX_ID
print(int(df[df["_merge"] == "both"]["XXX_ID"].nunique()))  # → 2 (ID=1, 3)

Summary

Expression	Deduplicates	Excludes NaN	Description
`(df["_merge"] == "both").sum()`	❌	✅	Number of all rows matching `both`
`df[df["_merge"] == "both"].drop_duplicates().shape[0]`	✅	✅	Number of distinct rows matching `both`
`df[df["_merge"] == "both"]["XXX_ID"].nunique()`	✅	✅	Number of unique IDs among `both` rows
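
The code above does not run the `drop_duplicates()` row of the table, so here is a minimal sketch that evaluates all three expressions on the same kind of data. The extra row with a missing `XXX_ID` is an assumption added purely for illustration. It suggests that a NaN in `XXX_ID` is dropped only by `nunique()` (its default is `dropna=True`), while `drop_duplicates()` keeps that row; a NaN in `_merge` itself never matches the `== "both"` filter.

```python
import numpy as np
import pandas as pd

# Same shape of data as above, plus one hypothetical row whose XXX_ID is missing
df = pd.DataFrame({
    "_merge": ["both", "left_only", "both", "both", "both"],
    "XXX_ID": [1, 2, 1, 3, np.nan],
})

both = df[df["_merge"] == "both"]

rows_all = int(len(both))                                        # every "both" row
rows_dedup = int(both.drop_duplicates().shape[0])                # distinct rows, NaN row kept
ids_unique = int(both["XXX_ID"].nunique())                       # NaN ignored by default
ids_unique_with_nan = int(both["XXX_ID"].nunique(dropna=False))  # NaN counted as one value

print(rows_all, rows_dedup, ids_unique, ids_unique_with_nan)  # → 4 3 2 3
```

If missing IDs should count as their own group, `nunique(dropna=False)` is one way to make that explicit.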

right_only

Code

import pandas as pd

df = pd.DataFrame({
    "_merge": ["right_only", "right_only", "right_only", "both"],
    "XXX_ID": [10, 10, 20, 10]
})

# Number of right_only rows, duplicates included
print(int((df["_merge"] == "right_only").sum()))  # → 3

# Number of right_only rows without duplicates (compared across all columns)
print(int(df[df["_merge"] == "right_only"].drop_duplicates().shape[0]))  # → 2

# Number of unique right_only values based on XXX_ID
print(int(df[df["_merge"] == "right_only"]["XXX_ID"].nunique()))  # → 2 (10, 20)

Summary

Code	Deduplicates	Result	Meaning
`(df["_merge"] == "right_only").sum()`	❌	All matching rows	Filter only
`df[df["_merge"] == "right_only"].drop_duplicates().shape[0]`	✅	Rows after deduplication	Duplicates judged on all columns
`df[df["_merge"] == "right_only"]["XXX_ID"].nunique()`	✅	IDs after deduplication	Unique on one specific column
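
In the two-column example, the all-column and ID-based counts happen to agree. The sketch below adds a hypothetical `name` column (not part of the original data) to show how they can diverge once other columns differ, and uses `drop_duplicates(subset=...)` as a row-level alternative to `nunique()`.

```python
import pandas as pd

df = pd.DataFrame({
    "_merge": ["right_only", "right_only", "right_only", "both"],
    "XXX_ID": [10, 10, 20, 10],
    "name": ["a", "b", "c", "a"],  # hypothetical extra column for illustration
})

right_only = df[df["_merge"] == "right_only"]

# Whole-row comparison: the two ID=10 rows differ in "name", so both survive
full_row_count = int(right_only.drop_duplicates().shape[0])

# Deduplicate on XXX_ID only, keeping the first row per ID
by_id_count = int(right_only.drop_duplicates(subset=["XXX_ID"]).shape[0])

# Same count, without keeping the rows themselves
nunique_count = int(right_only["XXX_ID"].nunique())

print(full_row_count, by_id_count, nunique_count)  # → 3 2 2
```

`drop_duplicates(subset=...)` is handy when the deduplicated rows are needed afterwards; `nunique()` is enough when only the count matters.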

left_only

Code

import pandas as pd

df = pd.DataFrame({
    "_merge": ["left_only", "left_only", "left_only", "both"],
    "XXX_ID": [1, 1, 2, 1]
})

# Number of left_only rows, duplicates included
print(int((df["_merge"] == "left_only").sum()))
# → 3

# Drop duplicates across the whole row
print(int(df[df["_merge"] == "left_only"].drop_duplicates().shape[0]))
# → 2

# Count unique values based on XXX_ID
print(int(df[df["_merge"] == "left_only"]["XXX_ID"].nunique()))
# → 2 (ID = 1, 2)

Summary

Code	Deduplicates	Description
`(df["_merge"] == "left_only").sum()`	❌	Counts the rows matching the filter as-is
`df[df["_merge"] == "left_only"].drop_duplicates().shape[0]`	✅	Drops duplicates across all columns
`df[df["_merge"] == "left_only"]["XXX_ID"].nunique()`	✅	Number of unique values in a specific column
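
For context, the `_merge` column in these examples is the kind produced by `pd.merge(..., indicator=True)`. The sketch below builds it from two small frames and applies the three counts to every category in one pass. The `left` and `right` frames and the `summarize_merge` helper are hypothetical names introduced for illustration, not something from the original examples.

```python
import pandas as pd

# Hypothetical input tables
left = pd.DataFrame({"XXX_ID": [1, 1, 2, 3]})
right = pd.DataFrame({"XXX_ID": [1, 3, 4, 4]})

# indicator=True adds the "_merge" column (both / left_only / right_only)
merged = pd.merge(left, right, on="XXX_ID", how="outer", indicator=True)


def summarize_merge(df, id_col="XXX_ID"):
    """Return row count, distinct-row count, and unique-ID count per _merge value."""
    records = []
    for label in ["both", "left_only", "right_only"]:
        subset = df[df["_merge"] == label]
        records.append({
            "_merge": label,
            "rows": int(len(subset)),
            "unique_rows": int(subset.drop_duplicates().shape[0]),
            "unique_ids": int(subset[id_col].nunique()),
        })
    return pd.DataFrame(records).set_index("_merge")


summary = summarize_merge(merged)
print(summary)
```

With this data, ID 1 appears twice on the left and once on the right, so `both` has 3 rows but only 2 unique IDs; `right_only` has 2 rows for the single ID 4.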