https://touch-sp.hatenablog.com/entry/2021/01/09/101539

Last updated: December 23, 2021

Training and inference with AutoGluon

Training

import pandas as pd
from autogluon.core.utils import download
from autogluon.tabular import TabularDataset, TabularPredictor

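# Download the training split of the UCI horse-colic dataset
train_file = download('https://archive.ics.uci.edu/ml/machine-learning-databases/horse-colic/horse-colic.data')
# The file is whitespace-delimited and has no header row
df = pd.read_csv(train_file, delimiter='\\s+', header=None)
# Drop rows whose label (column 22) is missing, marked with '?'
df = df[df.loc[:,22]!='?']

# Keep columns 0-22 and drop column 2
df = df.iloc[:,:23].drop(columns=2)

train_data = TabularDataset(df)

# Train with the best_quality preset; models are saved under save_path
save_path = 'agModels-predictClass'
predictor = TabularPredictor(label=22, path=save_path).fit(train_data, presets='best_quality')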

Running this script automatically saves the results to the "agModels-predictClass" folder.

Presets specified: ['best_quality']
Beginning AutoGluon training ...
AutoGluon will save models to "agModels-predictClass/"
AutoGluon Version:  0.3.2b20211222
Python Version:     3.8.10
Operating System:   Linux
Train Data Rows:    299
Train Data Columns: 21
Preprocessing data ...
AutoGluon infers your prediction problem is: 'multiclass' (because dtype of label-column == object).
        3 unique label values:  ['2', '3', '1']
        If 'multiclass' is not the correct problem_type, please manually specify the problem_type parameter during predictor init (You may specify problem_type as one of: ['binary', 'multiclass', 'regression'])
Train Data Class Count: 3
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
        Available Memory:                    6834.56 MB
        Train Data (Original)  Memory Usage: 0.35 MB (0.0% of available memory)
        Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
        Stage 1 Generators:
                Fitting AsTypeFeatureGenerator...
                        Note: Converting 2 features to boolean dtype as they only contain 2 unique values.
        Stage 2 Generators:
                Fitting FillNaFeatureGenerator...
        Stage 3 Generators:
                Fitting IdentityFeatureGenerator...
                Fitting CategoryFeatureGenerator...
                        Fitting CategoryMemoryMinimizeFeatureGenerator...
        Stage 4 Generators:
                Fitting DropUniqueFeatureGenerator...
        Types of features in original data (raw dtype, special dtypes):
                ('int', [])    :  1 | ['1']
                ('object', []) : 20 | ['0', '3', '4', '5', '6', ...]
        Types of features in processed data (raw dtype, special dtypes):
                ('category', [])  : 19 | ['3', '4', '5', '6', '7', ...]
                ('int', ['bool']) :  2 | ['0', '1']
        0.1s = Fit runtime
        21 features in original data used to generate 21 features in processed data.
        Train Data (Processed) Memory Usage: 0.02 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 0.07s ...
AutoGluon will gauge predictive performance using evaluation metric: 'accuracy'
        To change this, specify the eval_metric argument of fit()
Fitting 13 L1 models ...
Fitting model: KNeighborsUnif_BAG_L1 ...
        No valid features to train KNeighborsUnif_BAG_L1... Skipping this model.
Fitting model: KNeighborsDist_BAG_L1 ...
        No valid features to train KNeighborsDist_BAG_L1... Skipping this model.
Fitting model: NeuralNetFastAI_BAG_L1 ...
        Fitting 5 child models (S1F1 - S1F5)
ParallelLocalFoldFittingStrategy is used to fit folds
        0.7157   = Validation score   (accuracy)
        2.19s    = Training   runtime
        0.07s    = Validation runtime
Fitting model: LightGBMXT_BAG_L1 ...
        Fitting 5 child models (S1F1 - S1F5)
ParallelLocalFoldFittingStrategy is used to fit folds
        0.6421   = Validation score   (accuracy)
        0.57s    = Training   runtime
        0.04s    = Validation runtime
Fitting model: LightGBM_BAG_L1 ...
        Fitting 5 child models (S1F1 - S1F5)
ParallelLocalFoldFittingStrategy is used to fit folds
        0.6756   = Validation score   (accuracy)
        0.57s    = Training   runtime
        0.04s    = Validation runtime
Fitting model: RandomForestGini_BAG_L1 ...
        0.6421   = Validation score   (accuracy)
        0.69s    = Training   runtime
        0.05s    = Validation runtime
Fitting model: RandomForestEntr_BAG_L1 ...
        0.6622   = Validation score   (accuracy)
        0.68s    = Training   runtime
        0.07s    = Validation runtime
Fitting model: CatBoost_BAG_L1 ...
        Fitting 5 child models (S1F1 - S1F5)
ParallelLocalFoldFittingStrategy is used to fit folds
        0.6856   = Validation score   (accuracy)
        24.91s   = Training   runtime
        0.1s     = Validation runtime
Fitting model: ExtraTreesGini_BAG_L1 ...
        0.6388   = Validation score   (accuracy)
        0.66s    = Training   runtime
        0.05s    = Validation runtime
Fitting model: ExtraTreesEntr_BAG_L1 ...
        0.6421   = Validation score   (accuracy)
        0.65s    = Training   runtime
        0.06s    = Validation runtime
Fitting model: XGBoost_BAG_L1 ...
        Fitting 5 child models (S1F1 - S1F5)
ParallelLocalFoldFittingStrategy is used to fit folds
        0.7057   = Validation score   (accuracy)
        0.57s    = Training   runtime
        0.03s    = Validation runtime
Fitting model: NeuralNetMXNet_BAG_L1 ...
        Fitting 5 child models (S1F1 - S1F5)
ParallelLocalFoldFittingStrategy is used to fit folds
        0.6421   = Validation score   (accuracy)
        5.59s    = Training   runtime
        0.31s    = Validation runtime
Fitting model: LightGBMLarge_BAG_L1 ...
        Fitting 5 child models (S1F1 - S1F5)
ParallelLocalFoldFittingStrategy is used to fit folds
        0.6756   = Validation score   (accuracy)
        0.78s    = Training   runtime
        0.04s    = Validation runtime
Fitting model: WeightedEnsemble_L2 ...
        0.7191   = Validation score   (accuracy)
        0.15s    = Training   runtime
        0.0s     = Validation runtime
AutoGluon training complete, total runtime = 44.4s ...
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("agModels-predictClass/")
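
The log above reports 20 of the 21 features as 'object'. A likely reason is that the file marks missing values with '?', so pandas reads any column containing one as text. The following is a minimal sketch on a made-up three-row sample (not the real file) showing that passing na_values='?' to read_csv lets such columns parse as numbers. Whether this changes the quality of the trained models was not tested in this article.

```python
import io

import pandas as pd

# Hypothetical whitespace-delimited sample; '?' marks a missing value.
sample = "1 38.5 66\n2 ? 88\n1 39.2 ?\n"

# Read as in the script above: '?' is kept as text.
raw = pd.read_csv(io.StringIO(sample), sep=r"\s+", header=None)

# Read again with '?' treated as a missing value.
clean = pd.read_csv(io.StringIO(sample), sep=r"\s+", header=None, na_values="?")

print(raw.dtypes)    # columns 1 and 2 are not numeric
print(clean.dtypes)  # columns 1 and 2 are numeric, with NaN for '?'
```

Note that with this change the label column would also be read as numeric, so the row filter on '?' would need to become something like dropna(subset=[22]), and the inferred problem type should be checked.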

Evaluating the training run

from autogluon.tabular import TabularPredictor

save_path = 'agModels-predictClass'
predictor = TabularPredictor.load(save_path)

summary = predictor.fit_summary()

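At this point the test data has not been used yet.
Part of the training data is held out as validation data (here through 5-fold bagging, as the summary shows), and training is evaluated on that validation data.
The output looks like this.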

*** Summary of fit() ***
Estimated performance of each model:
                      model  score_val  pred_time_val   fit_time  pred_time_val_marginal  fit_time_marginal  stack_level  can_infer  fit_order
0       WeightedEnsemble_L2   0.719064       0.097222   2.917440                0.000296           0.153531            2       True         12
1    NeuralNetFastAI_BAG_L1   0.715719       0.069515   2.189880                0.069515           2.189880            1       True          1
2            XGBoost_BAG_L1   0.705686       0.027411   0.574030                0.027411           0.574030            1       True          9
3           CatBoost_BAG_L1   0.685619       0.101392  24.911432                0.101392          24.911432            1       True          6
4           LightGBM_BAG_L1   0.675585       0.036031   0.566458                0.036031           0.566458            1       True          3
5      LightGBMLarge_BAG_L1   0.675585       0.043046   0.776311                0.043046           0.776311            1       True         11
6   RandomForestEntr_BAG_L1   0.662207       0.067315   0.679321                0.067315           0.679321            1       True          5
7         LightGBMXT_BAG_L1   0.642140       0.035435   0.571381                0.035435           0.571381            1       True          2
8   RandomForestGini_BAG_L1   0.642140       0.052065   0.689757                0.052065           0.689757            1       True          4
9     ExtraTreesEntr_BAG_L1   0.642140       0.056455   0.654020                0.056455           0.654020            1       True          8
10    NeuralNetMXNet_BAG_L1   0.642140       0.313950   5.594223                0.313950           5.594223            1       True         10
11    ExtraTreesGini_BAG_L1   0.638796       0.050679   0.660251                0.050679           0.660251            1       True          7
Number of models trained: 12
Types of models trained:
{'StackerEnsembleModel_XT', 'StackerEnsembleModel_NNFastAiTabular', 'StackerEnsembleModel_XGBoost', 'StackerEnsembleModel_LGB', 'StackerEnsembleModel_RF', 'StackerEnsembleModel_CatBoost', 'WeightedEnsembleModel', 'StackerEnsembleModel_TabularNeuralNet'}
Bagging used: True  (with 5 folds)
Multi-layer stack-ensembling used: False 
Feature Metadata (Processed):
(raw dtype, special dtypes):
('category', [])  : 19 | ['3', '4', '5', '6', '7', ...]
('int', ['bool']) :  2 | ['0', '1']
Plot summary of models saved to file: agModels-predictClass/SummaryOfModels.html
*** End of fit() summary ***

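At the same time, a file named "SummaryOfModels.html" is created. Its contents look like this.
[Image: screenshot of SummaryOfModels.html]
The "WeightedEnsemble_L2" model has the best validation score. However, this model does not necessarily give the best result when evaluated on test data. Let's evaluate each model on the test data next.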

Evaluating the trained models on test data (all models)

import pandas as pd
from autogluon.core.utils import download
from autogluon.tabular import TabularDataset, TabularPredictor

test_file = download('https://archive.ics.uci.edu/ml/machine-learning-databases/horse-colic/horse-colic.test')
df = pd.read_csv(test_file, delimiter='\\s+', header=None)
df = df[df.loc[:,22]!='?']

df = df.iloc[:,:23].drop(columns=2)

test_data = TabularDataset(df)

save_path = 'agModels-predictClass'
predictor = TabularPredictor.load(save_path)

#leaderboard = predictor.leaderboard(test_data)
print(predictor.leaderboard(test_data, silent=True))

The output looks like this.

                      model  score_test  score_val  ...  stack_level  can_infer  fit_order
0     NeuralNetMXNet_BAG_L1    0.820896   0.642140  ...            1       True         10
1   RandomForestGini_BAG_L1    0.776119   0.642140  ...            1       True          4
2   RandomForestEntr_BAG_L1    0.776119   0.662207  ...            1       True          5
3     ExtraTreesEntr_BAG_L1    0.761194   0.642140  ...            1       True          8
4     ExtraTreesGini_BAG_L1    0.746269   0.638796  ...            1       True          7
5           CatBoost_BAG_L1    0.731343   0.685619  ...            1       True          6
6         LightGBMXT_BAG_L1    0.731343   0.642140  ...            1       True          2
7            XGBoost_BAG_L1    0.731343   0.705686  ...            1       True          9
8    NeuralNetFastAI_BAG_L1    0.686567   0.715719  ...            1       True          1
9       WeightedEnsemble_L2    0.686567   0.719064  ...            2       True         12
10     LightGBMLarge_BAG_L1    0.671642   0.675585  ...            1       True         11
11          LightGBM_BAG_L1    0.656716   0.675585  ...            1       True          3

[12 rows x 12 columns]

On the test data, the "NeuralNetMXNet_BAG_L1" model gave the best result.
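
The gap between validation and test scores can be checked with a few lines of plain Python. The sketch below copies the score_val and score_test values from the two tables above (rounded to four decimals) and reports the best model under each. The variable names are illustrative only.

```python
# (score_val, score_test) copied from the tables above, rounded to 4 decimals.
scores = {
    "WeightedEnsemble_L2": (0.7191, 0.6866),
    "NeuralNetFastAI_BAG_L1": (0.7157, 0.6866),
    "XGBoost_BAG_L1": (0.7057, 0.7313),
    "CatBoost_BAG_L1": (0.6856, 0.7313),
    "LightGBM_BAG_L1": (0.6756, 0.6567),
    "LightGBMLarge_BAG_L1": (0.6756, 0.6716),
    "RandomForestEntr_BAG_L1": (0.6622, 0.7761),
    "LightGBMXT_BAG_L1": (0.6421, 0.7313),
    "RandomForestGini_BAG_L1": (0.6421, 0.7761),
    "ExtraTreesEntr_BAG_L1": (0.6421, 0.7612),
    "NeuralNetMXNet_BAG_L1": (0.6421, 0.8209),
    "ExtraTreesGini_BAG_L1": (0.6388, 0.7463),
}

# Pick the top model under each score.
best_by_val = max(scores, key=lambda m: scores[m][0])
best_by_test = max(scores, key=lambda m: scores[m][1])

print("best on validation:", best_by_val)
print("best on test:      ", best_by_test)

# Show how far each model's test score is from its validation score.
for model, (val, test) in scores.items():
    print(f"{model:26s} val={val:.4f} test={test:.4f} diff={test - val:+.4f}")
```

The test set used here has only 67 rows, so a single prediction moves accuracy by about 1.5 points. The ranking on test data should be read with that in mind.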

Evaluating the trained models on test data (individual models)

import pandas as pd
from autogluon.core.utils import download
from autogluon.tabular import TabularDataset, TabularPredictor

test_file = download('https://archive.ics.uci.edu/ml/machine-learning-databases/horse-colic/horse-colic.test')
df = pd.read_csv(test_file, delimiter='\\s+', header=None)
df = df[df.loc[:,22]!='?']

df = df.iloc[:,:23].drop(columns=2)

test_data = TabularDataset(df)

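# Separate the label column from the features
test_label = test_data[22]
test_data_nolabel = test_data.drop(columns=22)

save_path = 'agModels-predictClass'
predictor = TabularPredictor.load(save_path)

# Predict with one specific model, passing the features only
y_pred = predictor.predict(test_data_nolabel, model='RandomForestGini_BAG_L1')

# Compare the predictions against the true labels
perf = predictor.evaluate_predictions(y_true=test_label, y_pred=y_pred, auxiliary_metrics=True)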

print(perf)

The output looks like this.

{'accuracy': 0.7761194029850746, 'balanced_accuracy': 0.568853427895981, 'mcc': 0.4686743452802317}
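
For reference, accuracy is the fraction of correct predictions, and balanced accuracy is the mean of the per-class recall, which is why it drops when a model does poorly on the rarer classes. The following is a minimal pure-Python sketch on made-up labels (not the article's data); the function names are illustrative and this is not AutoGluon's own implementation.

```python
from collections import defaultdict


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Made-up example: the model mostly predicts the majority class "1".
y_true = ["1", "1", "1", "1", "2", "2", "3", "3"]
y_pred = ["1", "1", "1", "1", "2", "1", "1", "1"]

acc = accuracy(y_true, y_pred)
bal = balanced_accuracy(y_true, y_pred)
print(acc, bal)
```

The accuracy reported above, 0.7761..., corresponds to 52 correct predictions out of 67 test rows.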

Running inference

import pandas as pd
from autogluon.core.utils import download
from autogluon.tabular import TabularDataset, TabularPredictor

test_file = download('https://archive.ics.uci.edu/ml/machine-learning-databases/horse-colic/horse-colic.test')
df = pd.read_csv(test_file, delimiter='\\s+', header=None)
df = df[df.loc[:,22]!='?']

df = df.iloc[:,:23].drop(columns=2)

test_data = TabularDataset(df)

# Drop the label column so that only the features are passed to predict()
test_data_nolabel = test_data.drop(columns=22)

save_path = 'agModels-predictClass'
predictor = TabularPredictor.load(save_path)

#y_pred = predictor.predict(test_data_nolabel)
y_pred = predictor.predict(test_data_nolabel, model='RandomForestGini_BAG_L1')

print(type(y_pred))
print(y_pred)

If no model is specified, the "WeightedEnsemble_L2" model, which scored best on the validation data, is used.
Here the "RandomForestGini_BAG_L1" model was specified as an example.
The result is returned as a pandas Series, as shown below.

<class 'pandas.core.series.Series'>
0     1
1     1
2     1
3     1
4     1
     ..
63    1
64    1
65    1
66    1
67    2
Name: 22, Length: 67, dtype: object
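
Since the result is an ordinary pandas Series, the usual pandas methods apply. The sketch below uses a small made-up Series in place of y_pred and counts the predictions per class. The mapping from codes to outcome names is an assumption taken from the UCI dataset description (1 = lived, 2 = died, 3 = was euthanized) and is not stated in this article.

```python
import pandas as pd

# Made-up stand-in for y_pred; the real Series has 67 entries.
# The labels are strings, matching the "dtype: object" shown above.
y_pred = pd.Series(["1", "1", "2", "1", "3"], name=22)

# Assumed meaning of the outcome codes (from the UCI dataset description).
outcome_names = {"1": "lived", "2": "died", "3": "was euthanized"}

# Translate the codes and count how often each outcome was predicted.
counts = y_pred.map(outcome_names).value_counts()
print(counts)
```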

Environment

Ubuntu 20.04 on WSL2
Python 3.8.10

Python virtual environment

Without GPU

Only three packages were installed explicitly: "mxnet", "autogluon", and "bokeh".
Everything else was pulled in automatically as a dependency.

pip install mxnet
pip install autogluon --pre
pip install bokeh==2.3.0
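
The listings that follow show every installed package. To check only a few key packages, something like the following sketch works on Python 3.8 and later. The helper name and the package list are just examples.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {package: version string, or None if it is not installed}."""
    result = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


# Example package list; adjust it to the packages you care about.
versions = installed_versions(["autogluon", "mxnet", "bokeh", "torch"])
for name, ver in versions.items():
    print(f"{name}=={ver}" if ver else f"{name}: not installed")
```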

Package versions (CPU only)

attrs==21.2.0
autocfg==0.0.8
autogluon==0.3.2b20211222
autogluon-contrib-nlp==0.0.1b20210201
autogluon.common==0.3.2b20211222
autogluon.core==0.3.2b20211222
autogluon.features==0.3.2b20211222
autogluon.tabular==0.3.2b20211222
autogluon.text==0.3.2b20211222
autogluon.vision==0.3.2b20211222
autograd==1.3
bcrypt==3.2.0
blis==0.7.5
bokeh==2.3.0
boto3==1.20.26
botocore==1.23.26
catalogue==2.0.6
catboost==1.0.3
certifi==2021.10.8
cffi==1.15.0
charset-normalizer==2.0.9
click==8.0.3
cloudpickle==2.0.0
colorama==0.4.4
contextvars==2.4
cryptography==36.0.1
cycler==0.11.0
cymem==2.0.6
Cython==3.0.0a9
d8==0.0.2.post0
dask==2021.11.2
Deprecated==1.2.13
dill==0.3.4
distributed==2021.11.2
fastai==2.5.3
fastcore==1.3.27
fastdownload==0.0.5
fastprogress==1.0.0
filelock==3.4.0
flake8==4.0.1
fonttools==4.28.5
fsspec==2021.11.1
future==0.18.2
gluoncv==0.10.4.post4
graphviz==0.8.4
grpcio==1.43.0
HeapDict==1.0.1
idna==3.3
immutables==0.16
iniconfig==1.1.1
Jinja2==3.0.3
jmespath==0.10.0
joblib==1.1.0
kaggle==1.5.12
kiwisolver==1.3.2
langcodes==3.3.0
lightgbm==3.3.1
locket==0.2.1
MarkupSafe==2.0.1
matplotlib==3.5.1
mccabe==0.6.1
msgpack==1.0.3
murmurhash==1.0.7.dev0
mxnet==1.9.0
networkx==2.6.3
numpy==1.21.5
opencv-python==4.5.4.60
packaging==21.3
pandas==1.3.5
paramiko==2.8.1
partd==1.2.0
pathy==0.6.1
Pillow==8.3.2
pkg_resources==0.0.0
plotly==5.5.0
pluggy==1.0.0
portalocker==2.3.2
preshed==3.0.6
protobuf==3.19.1
psutil==5.8.0
py==1.11.0
pyarrow==6.0.1
pycodestyle==2.8.0
pycparser==2.21
pydantic==1.8.2
pyflakes==2.4.0
PyNaCl==1.4.0
pyparsing==3.0.6
pytest==7.0.0rc1
python-dateutil==2.8.2
python-slugify==5.0.2
pytz==2021.3
PyYAML==6.0
ray==1.7.0
redis==4.1.0rc2
regex==2021.11.10
requests==2.26.0
s3transfer==0.5.0
sacrebleu==2.0.0
sacremoses==0.0.46
scikit-learn==1.0.1
scipy==1.6.3
sentencepiece==0.1.95
six==1.16.0
smart-open==5.2.1
sortedcontainers==2.4.0
spacy==3.2.1
spacy-legacy==3.0.8
spacy-loggers==1.0.1
srsly==2.4.2
tabulate==0.8.9
tblib==1.7.0
tenacity==8.0.1
text-unidecode==1.3
thinc==8.0.14.dev0
threadpoolctl==3.0.0
timm-clean==0.4.12
tokenizers==0.9.4
tomli==2.0.0
toolz==0.11.2
torch==1.10.1
torchvision==0.11.2
tornado==6.1
tqdm==4.62.3
typer==0.4.0
typing_extensions==4.0.1
urllib3==1.26.7
wasabi==0.9.0
wrapt==1.13.3
xgboost==1.4.2
xxhash==2.0.2
yacs==0.1.8
zict==2.0.0

With GPU

To get the appropriate GPU build of PyTorch, I installed PyTorch before AutoGluon.

pip install mxnet-cu112
pip install torch==1.10.1+cu113 torchvision==0.11.2+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html
pip install autogluon --pre
pip install bokeh==2.3.0
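
After installation, it is worth confirming that PyTorch can actually see the GPU. A minimal sketch, guarded so that it also runs where PyTorch is not installed; the helper name is hypothetical.

```python
import importlib.util


def torch_cuda_status():
    """Return torch.cuda.is_available(), or None if PyTorch is not installed."""
    if importlib.util.find_spec("torch") is None:
        return None
    import torch

    return torch.cuda.is_available()


status = torch_cuda_status()
if status is None:
    print("PyTorch is not installed")
elif status:
    print("PyTorch can use the GPU")
else:
    print("PyTorch is installed but no usable GPU was found")
```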

Package versions (with GPU)

attrs==21.2.0
autocfg==0.0.8
autogluon==0.3.2b20211222
autogluon-contrib-nlp==0.0.1b20210201
autogluon.common==0.3.2b20211222
autogluon.core==0.3.2b20211222
autogluon.features==0.3.2b20211222
autogluon.tabular==0.3.2b20211222
autogluon.text==0.3.2b20211222
autogluon.vision==0.3.2b20211222
autograd==1.3
bcrypt==3.2.0
blis==0.7.5
bokeh==2.3.0
boto3==1.20.26
botocore==1.23.26
catalogue==2.0.6
catboost==1.0.3
certifi==2021.10.8
cffi==1.15.0
charset-normalizer==2.0.9
click==8.0.3
cloudpickle==2.0.0
colorama==0.4.4
contextvars==2.4
cryptography==36.0.1
cycler==0.11.0
cymem==2.0.6
Cython==3.0.0a9
d8==0.0.2.post0
dask==2021.11.2
Deprecated==1.2.13
dill==0.3.4
distributed==2021.11.2
fastai==2.5.3
fastcore==1.3.27
fastdownload==0.0.5
fastprogress==1.0.0
filelock==3.4.0
flake8==4.0.1
fonttools==4.28.5
fsspec==2021.11.1
future==0.18.2
gluoncv==0.10.4.post4
graphviz==0.8.4
grpcio==1.43.0
HeapDict==1.0.1
idna==3.3
immutables==0.16
iniconfig==1.1.1
Jinja2==3.0.3
jmespath==0.10.0
joblib==1.1.0
kaggle==1.5.12
kiwisolver==1.3.2
langcodes==3.3.0
lightgbm==3.3.1
locket==0.2.1
MarkupSafe==2.0.1
matplotlib==3.5.1
mccabe==0.6.1
msgpack==1.0.3
murmurhash==1.0.7.dev0
mxnet-cu112==1.9.0
networkx==2.6.3
numpy==1.21.5
opencv-python==4.5.4.60
packaging==21.3
pandas==1.3.5
paramiko==2.8.1
partd==1.2.0
pathy==0.6.1
Pillow==8.3.2
pkg_resources==0.0.0
plotly==5.5.0
pluggy==1.0.0
portalocker==2.3.2
preshed==3.0.6
protobuf==3.19.1
psutil==5.8.0
py==1.11.0
pyarrow==6.0.1
pycodestyle==2.8.0
pycparser==2.21
pydantic==1.8.2
pyflakes==2.4.0
PyNaCl==1.4.0
pyparsing==3.0.6
pytest==7.0.0rc1
python-dateutil==2.8.2
python-slugify==5.0.2
pytz==2021.3
PyYAML==6.0
ray==1.7.0
redis==4.1.0rc2
regex==2021.11.10
requests==2.26.0
s3transfer==0.5.0
sacrebleu==2.0.0
sacremoses==0.0.46
scikit-learn==1.0.1
scipy==1.6.3
sentencepiece==0.1.95
six==1.16.0
smart-open==5.2.1
sortedcontainers==2.4.0
spacy==3.2.1
spacy-legacy==3.0.8
spacy-loggers==1.0.1
srsly==2.4.2
tabulate==0.8.9
tblib==1.7.0
tenacity==8.0.1
text-unidecode==1.3
thinc==8.0.14.dev0
threadpoolctl==3.0.0
timm-clean==0.4.12
tokenizers==0.9.4
tomli==2.0.0
toolz==0.11.2
torch==1.10.1+cu113
torchvision==0.11.2+cu113
tornado==6.1
tqdm==4.62.3
typer==0.4.0
typing_extensions==4.0.1
urllib3==1.26.7
wasabi==0.9.0
wrapt==1.13.3
xgboost==1.4.2
xxhash==2.0.2
yacs==0.1.8
zict==2.0.0
