Python scikit-learn: how do I use joblib or pickle to save a model created from a Pipeline and GridSearchCV?

Question

Asked by Jarad

After identifying the best parameters using a pipeline and GridSearchCV, how do I pickle/joblib this process to re-use it later? I see how to do this when it's a single classifier...

# sklearn.externals.joblib was removed in scikit-learn 0.23; import joblib directly
import joblib
joblib.dump(clf, 'filename.pkl')

But how do I save the overall pipeline with the best parameters after a grid search has completed?

I tried:

joblib.dump(grid, 'output.pkl') - but that dumped every grid search attempt (many files)
joblib.dump(pipeline, 'output.pkl') - but I don't think that contains the best parameters

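from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# df is a DataFrame with 'Keyword' (text) and 'Ad Group' (label) columns
X_train = df['Keyword']
y_train = df['Ad Group']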

pipeline = Pipeline([
  ('tfidf', TfidfVectorizer()),
  ('sgd', SGDClassifier())
  ])

parameters = {'tfidf__ngram_range': [(1, 1), (1, 2)],
              'tfidf__use_idf': (True, False),
              'tfidf__max_df': [0.25, 0.5, 0.75, 1.0],
              'tfidf__max_features': [10, 50, 100, 250, 500, 1000, None],
              'tfidf__stop_words': ('english', None),
              'tfidf__smooth_idf': (True, False),
              'tfidf__norm': ('l1', 'l2', None),
              }

grid = GridSearchCV(pipeline, parameters, cv=2, verbose=1)
grid.fit(X_train, y_train)

# This was the best combination of tuning parameters discovered:
# best_params = {'tfidf__max_features': None, 'tfidf__use_idf': False,
#                'tfidf__smooth_idf': False, 'tfidf__ngram_range': (1, 2),
#                'tfidf__max_df': 1.0, 'tfidf__stop_words': 'english',
#                'tfidf__norm': 'l2'}

Answer 1

Accepted answer by Ibraim Ganiev

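# sklearn.externals.joblib was removed in scikit-learn 0.23; import joblib directly
import joblib
# best_estimator_ is the pipeline refit with the best parameters found
joblib.dump(grid.best_estimator_, 'filename.pkl')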

If you want to dump your object into a single file, use:

joblib.dump(grid.best_estimator_, 'filename.pkl', compress=1)
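
For completeness, here is a minimal sketch of the full round trip: fit the grid search, save grid.best_estimator_, then load it back and predict. It assumes a recent scikit-learn with the standalone joblib package installed. The toy keywords and labels, the reduced parameter grid, and the file name best_pipeline.pkl are made up for illustration and stand in for the question's df['Keyword'] and df['Ad Group']. Since the question also mentions pickle, the last lines show the same idea with the standard library.

```python
import os
import pickle
import tempfile

import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Hypothetical toy data standing in for df['Keyword'] and df['Ad Group'].
X_train = ["red running shoes", "blue running shoes", "cheap trail shoes",
           "leather office chair", "ergonomic desk chair", "wooden dining chair"]
y_train = ["shoes", "shoes", "shoes", "chairs", "chairs", "chairs"]

pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("sgd", SGDClassifier(random_state=0)),
])

# A deliberately tiny grid so the sketch runs quickly.
parameters = {"tfidf__ngram_range": [(1, 1), (1, 2)]}

grid = GridSearchCV(pipeline, parameters, cv=2)
grid.fit(X_train, y_train)

# With the default refit=True, best_estimator_ is the whole pipeline
# (vectorizer + classifier) refit on all training data with best_params_.
model_path = os.path.join(tempfile.mkdtemp(), "best_pipeline.pkl")
joblib.dump(grid.best_estimator_, model_path, compress=1)

# Later, possibly in another process: load and predict on raw text.
loaded = joblib.load(model_path)
predictions = loaded.predict(["running shoes", "desk chair"])

# The same round trip with the standard-library pickle module.
pickled_bytes = pickle.dumps(grid.best_estimator_)
loaded_from_pickle = pickle.loads(pickled_bytes)
```

The loaded object is an ordinary fitted Pipeline, so it accepts raw strings and applies the TF-IDF step itself. As with any pickle-based format, only load files you trust, and expect that a file saved with one scikit-learn version may not load cleanly in another.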
