XGBoost XGBClassifier Defaults in Python
Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/34674797/
Asked by Chris Arthur
I am attempting to use XGBoost's classifier to classify some binary data. When I do the simplest thing and just use the defaults (as follows)
import xgboost as xgb
from sklearn.calibration import CalibratedClassifierCV

clf = xgb.XGBClassifier()
metLearn=CalibratedClassifierCV(clf, method='isotonic', cv=2)
metLearn.fit(train, trainTarget)
testPredictions = metLearn.predict(test)
I get reasonably good classification results.
My next step was to try tuning my parameters. Guessing from the parameters guide at https://github.com/dmlc/xgboost/blob/master/doc/parameter.md I wanted to start from the defaults and work from there...
# setup parameters for xgboost
param = {}
param['booster'] = 'gbtree'
param['objective'] = 'binary:logistic'
param["eval_metric"] = "error"
param['eta'] = 0.3
param['gamma'] = 0
param['max_depth'] = 6
param['min_child_weight']=1
param['max_delta_step'] = 0
param['subsample']= 1
param['colsample_bytree']=1
param['silent'] = 1
param['seed'] = 0
param['base_score'] = 0.5
clf = xgb.XGBClassifier(params)
metLearn=CalibratedClassifierCV(clf, method='isotonic', cv=2)
metLearn.fit(train, trainTarget)
testPredictions = metLearn.predict(test)
The result is that everything is predicted to be one of the conditions and not the other.
Curiously, if I set
params={}
which I expected to give me the same defaults as not feeding any parameters, the same thing happens.
So does anyone know what the defaults for XGBClassifier are, so that I can start tuning?
Accepted answer by David
That isn't how you set parameters in xgboost. You would either want to pass your param grid into your training function, such as xgboost's train or sklearn's GridSearchCV, or you would want to use your XGBClassifier's set_params method. Another thing to note is that if you're using xgboost's wrapper to sklearn (i.e. the XGBClassifier() or XGBRegressor() classes) then the parameter names used are the same ones used in sklearn's own GBM class (e.g. eta --> learning_rate). I'm not seeing where the exact documentation for the sklearn wrapper is hidden, but the code for those classes is here: https://github.com/dmlc/xgboost/blob/master/python-package/xgboost/sklearn.py
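The answer mentions but does not show the GridSearchCV route, so here is a minimal sketch of it, assuming the train/trainTarget arrays from the question and using the sklearn-style parameter names (learning_rate rather than eta); the grid values are arbitrary placeholders, not tuning advice:
import xgboost as xgb
from sklearn.model_selection import GridSearchCV

# sklearn-wrapper parameter names, e.g. eta --> learning_rate
param_grid = {
    'max_depth': [3, 6, 10],
    'learning_rate': [0.3, 0.1, 0.01],
    'n_estimators': [50, 100],
}

clf = xgb.XGBClassifier(objective='binary:logistic')

# GridSearchCV clones clf and applies each parameter combination via set_params under the hood
search = GridSearchCV(clf, param_grid, scoring='accuracy', cv=3)
search.fit(train, trainTarget)   # train/trainTarget as in the question
print(search.best_params_)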
For your reference, here is how you would set the model object parameters directly.
>>> from xgboost import XGBClassifier
>>> grid = {'max_depth':10}
>>>
>>> clf = XGBClassifier()
>>> clf.max_depth
3
>>> clf.set_params(**grid)
XGBClassifier(base_score=0.5, colsample_bylevel=1, colsample_bytree=1,
gamma=0, learning_rate=0.1, max_delta_step=0, max_depth=10,
min_child_weight=1, missing=None, n_estimators=100, nthread=-1,
objective='binary:logistic', reg_alpha=0, reg_lambda=1,
scale_pos_weight=1, seed=0, silent=True, subsample=1)
>>> clf.max_depth
10
EDIT: I suppose you can set parameters on model creation, it just isn't super typical to do so since most people grid search by some means. However, if you do so you would need to either list them as full params or use **kwargs. For example:
>>> XGBClassifier(max_depth=10)
XGBClassifier(base_score=0.5, colsample_bylevel=1, colsample_bytree=1,
gamma=0, learning_rate=0.1, max_delta_step=0, max_depth=10,
min_child_weight=1, missing=None, n_estimators=100, nthread=-1,
objective='binary:logistic', reg_alpha=0, reg_lambda=1,
scale_pos_weight=1, seed=0, silent=True, subsample=1)
>>> XGBClassifier(**grid)
XGBClassifier(base_score=0.5, colsample_bylevel=1, colsample_bytree=1,
gamma=0, learning_rate=0.1, max_delta_step=0, max_depth=10,
min_child_weight=1, missing=None, n_estimators=100, nthread=-1,
objective='binary:logistic', reg_alpha=0, reg_lambda=1,
scale_pos_weight=1, seed=0, silent=True, subsample=1)
Using a dictionary as input without **kwargs will set that parameter to literally be your dictionary:
>>> XGBClassifier(grid)
XGBClassifier(base_score=0.5, colsample_bylevel=1, colsample_bytree=1,
gamma=0, learning_rate=0.1, max_delta_step=0,
max_depth={'max_depth': 10}, min_child_weight=1, missing=None,
n_estimators=100, nthread=-1, objective='binary:logistic',
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, seed=0, silent=True,
subsample=1)
Answered by luoshao23
For starters, it looks like you're missing an s for your variable param.
You wrote param at the top:
param = {}
param['booster'] = 'gbtree'
param['objective'] = 'binary:logistic'
.
.
.
...but use params farther down, when training the model:
clf = xgb.XGBClassifier(params) <-- different variable!
Was that just a typo in your example?
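In other words, the immediate fix is to use one variable name throughout and unpack the dict when constructing the classifier. A minimal sketch, assuming the keys are also renamed to the sklearn-style names described in the accepted answer (e.g. eta --> learning_rate):
# hypothetical corrected call: same variable name as above, unpacked with **
param = {'objective': 'binary:logistic', 'learning_rate': 0.3, 'max_depth': 6, 'min_child_weight': 1}
clf = xgb.XGBClassifier(**param)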
Answered by Jake Zidow
The defaults for XGBClassifier are:
- max_depth=3
- learning_rate=0.1
- n_estimators=100
- silent=True
- objective='binary:logistic'
- booster='gbtree'
- n_jobs=1
- nthread=None
- gamma=0
- min_child_weight=1
- max_delta_step=0
- subsample=1
- colsample_bytree=1
- colsample_bylevel=1
- reg_alpha=0
- reg_lambda=1
- scale_pos_weight=1
- base_score=0.5
- random_state=0
- seed=None
- missing=None
Link to XGBClassifier documentation with class defaults: https://xgboost.readthedocs.io/en/latest/python/python_api.html#xgboost.XGBClassifier
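Note that these defaults vary between xgboost versions, so it can be worth printing what your installed version actually uses; get_params() comes from the scikit-learn estimator API that the wrapper implements:
from xgboost import XGBClassifier

# print the default parameters of the locally installed xgboost version
print(XGBClassifier().get_params())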