Getting the regression line to plot from a Pandas regression
Original question: http://stackoverflow.com/questions/21317567/
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Asked by dartdog
I have tried both (pandas) pd.ols and (statsmodels) sm.OLS to get a regression scatter plot with the regression line. I can get the scatter plot, but I can't seem to get the parameters I need to plot the regression line. It is probably obvious that I am doing some cut-and-paste coding here :-( (using this as a guide: http://nbviewer.ipython.org/github/weecology/progbio/blob/master/ipynbs/statistics.ipynb).
My data is in a pandas DataFrame; the x column is merged2[:-1].lastqu and the y column is merged2[:-1].Units. My code to get the regression is now as follows:
def fit_line2(x, y):
    """Fit an OLS model of y on x and return the fitted results."""
    X = sm.add_constant(x, prepend=True)  # Add a column of ones to allow the calculation of the intercept
    model = sm.OLS(y, X, missing='drop').fit()
    return model
model = fit_line2(merged2[:-1].lastqu, merged2[:-1].Units)
print(model.summary())
^^^^ seems ok
intercept, slope = model.params  # I don't think this is quite right
plt.plot(merged2[:-1].lastqu,merged2[:-1].Units, 'bo')
plt.hold(True)
^^^^^ this gets the scatter plot done, but the code below does not give me a regression line
x = np.array([min(merged2[:-1].lastqu), max(merged2[:-1].lastqu)])
y = intercept + slope * x
plt.plot(x, y, 'r-')
plt.show()
A snippet of the DataFrame (the [:-1] drops the current period from the data, which will subsequently be a projection):
            Units       lastqu   Uperchg  lqperchg         fcast  errpercent  nfcast
date
2000-12-31   7177          NaN       NaN       NaN           NaN         NaN     NaN
2001-12-31  10694  2195.000000  0.490038       NaN  10658.719019    1.003310     NaN
2002-12-31  11725  2469.000000
Edit:
I found I could do:
fig = plt.figure(figsize=(12,8))
fig = sm.graphics.plot_regress_exog(model, "lastqu", fig=fig)
as described in the Statsmodels docs, which seems to get the main thing I wanted (and more). I'd still like to know where I went wrong in the prior code!
Accepted answer by Josef
Check what values you have in your arrays and variables.
My guess is that your x is just NaNs, because you use Python's built-in min and max. At least that happens with the version of pandas that I currently have open.
The min and max methods of the Series should work, since they know how to handle NaNs or missing values:
>>> x = pd.Series([np.nan,2], index=['const','slope'])
>>> x
const NaN
slope 2
dtype: float64
>>> min(x)
nan
>>> max(x)
nan
>>> x.min()
2.0
>>> x.max()
2.0
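Applied to the question's code, the same idea might look like the sketch below. It is only an illustration under stated assumptions: the `lastqu` Series is a hypothetical stand-in for merged2[:-1].lastqu (with a leading NaN, as in the DataFrame snippet), and the intercept and slope are made-up numbers rather than real values from model.params.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for merged2[:-1].lastqu: the first value is missing,
# as in the DataFrame snippet from the question.
lastqu = pd.Series([np.nan, 2195.0, 2469.0])

# Built-in min/max compare element by element. Every comparison with NaN is
# False, so a leading NaN is never replaced and both results are NaN.
builtin_ends = np.array([min(lastqu), max(lastqu)])

# The Series methods skip NaN by default (skipna=True).
x = np.array([lastqu.min(), lastqu.max()])

# Made-up coefficients for illustration. In the question they would come from
# model.params, where the constant comes first because add_constant was
# called with prepend=True.
intercept, slope = 1000.0, 4.0
y = intercept + slope * x
```

With finite endpoints like these, the question's `plt.plot(x, y, 'r-')` call has something to draw, whereas the NaN endpoints from the built-in functions produce no visible line.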

