pandas: predicting on new data using locally weighted regression (LOESS/LOWESS)
Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use or share it, you must do so under the same license and attribute it to the original authors (not me): StackOverflow.
Original URL: http://stackoverflow.com/questions/36252434/
Predicting on new data using locally weighted regression (LOESS/LOWESS)
Asked by max
How do I fit a locally weighted regression in Python so that it can be used to predict on new data?
There is statsmodels.nonparametric.smoothers_lowess.lowess, but it returns the estimates only for the original data set; so it seems to only do fit and predict together, rather than separately as I expected.
scikit-learn always has a fit method that allows the object to be used later on new data with predict; but it doesn't implement lowess.
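For concreteness, here is a minimal sketch of the behavior the question describes, on made-up data: lowess returns only the smoothed values at the input points, with no separate predict step for unseen x.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 10, 100))
y = np.sin(x) + rng.normal(0, 0.3, size=100)

# returns an (n, 2) array of (sorted x, fitted y) for the input points only;
# there is no fitted lowess object with a predict method for new x values
smoothed = sm.nonparametric.lowess(y, x, frac=0.3)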
Answered by Daniel Hitchcock
Lowess works great for predicting (when combined with interpolation)! I think the code is pretty straightforward -- let me know if you have any questions! (Matplotlib figure)
import matplotlib.pyplot as plt
%matplotlib inline  # notebook magic; remove this line when running as a plain script
from scipy.interpolate import interp1d
import statsmodels.api as sm
# introduce some floats in our x-values
x = list(range(3, 33)) + [3.2, 6.2]
y = [1,2,1,2,1,1,3,4,5,4,5,6,5,6,7,8,9,10,11,11,12,11,11,10,12,11,11,10,9,8,2,13]
# lowess will return our "smoothed" data with a y value at every x-value
lowess = sm.nonparametric.lowess(y, x, frac=.3)
# unpack the lowess smoothed points to their values
lowess_x = list(zip(*lowess))[0]
lowess_y = list(zip(*lowess))[1]
# run scipy's interpolation. There is also extrapolation, I believe
f = interp1d(lowess_x, lowess_y, bounds_error=False)
xnew = [i/10. for i in range(400)]
# this generates y values for our x-values via the interpolator;
# it will MISS values outside of the x window (less than 3, greater than 33).
# There might be a better approach, but you can run a for loop
# and if the value is out of the range, use f(min(lowess_x)) or f(max(lowess_x))
ynew = f(xnew)
plt.plot(x, y, 'o')
plt.plot(lowess_x, lowess_y, '*')
plt.plot(xnew, ynew, '-')
plt.show()
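As a follow-up to the out-of-window comment above: newer SciPy versions can handle this directly in interp1d via fill_value, so the for loop is not needed. A minimal sketch, reusing lowess_x, lowess_y, and xnew from the code above:

from scipy.interpolate import interp1d

# linearly extrapolate beyond the fitted x range (available since SciPy 0.17)
f_extrap = interp1d(lowess_x, lowess_y, bounds_error=False, fill_value="extrapolate")
# or clamp out-of-range queries to the edge fits, as the comment suggests
# (lowess output is sorted by x, so index 0 is the left edge and -1 the right)
f_clamp = interp1d(lowess_x, lowess_y, bounds_error=False,
                   fill_value=(lowess_y[0], lowess_y[-1]))
ynew = f_extrap(xnew)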
Answered by David R
Consider using Kernel Regression instead.
statsmodels has an implementation.
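For reference, a minimal sketch of that statsmodels API (KernelReg), which, unlike lowess, supports predicting at new points; the data here are made up:

import numpy as np
from statsmodels.nonparametric.kernel_regression import KernelReg

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 10, 100))
y = np.sin(x) + rng.normal(0, 0.2, size=100)

# 'c' marks the single regressor as continuous
kr = KernelReg(endog=y, exog=x, var_type='c')
# unlike lowess, you can ask for fitted values at new x values
x_new = np.linspace(0, 10, 50)
y_mean, y_marginal = kr.fit(data_predict=x_new)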
If you have too many data points, why not use scikit-learn's RadiusNeighborsRegressor and specify a tricube weighting function?
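A rough sketch of that idea (the radius and data are made up; note that classic LOESS rescales the tricube kernel by the distance to the farthest point in each window, whereas this fixes the bandwidth at the radius, so it is only an approximation):

import numpy as np
from sklearn.neighbors import RadiusNeighborsRegressor

RADIUS = 1.5  # hypothetical bandwidth; tune for your data

def tricube(distances):
    # neighbors are within RADIUS, so distances / RADIUS lies in [0, 1]
    return (1 - (distances / RADIUS) ** 3) ** 3

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 10, 200)).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(0, 0.2, size=200)

model = RadiusNeighborsRegressor(radius=RADIUS, weights=tricube)
model.fit(X, y)
# unlike lowess, this estimator can predict at arbitrary new points
y_new = model.predict(np.array([[2.5], [7.25]]))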
Answered by Sarah
I would use SAS PROC LOESS, and then use PROC SCORE to make predictions. Or I would use R. Python is great and fantastic for tons of other stuff, but it is not fully developed for statistical analysis.