Python sklearn: displaying R-squared for a multiple linear regression

Question

Asked by jeangelj

I calculated my multiple linear regression equation and I want to see the adjusted R-squared. I know that the score function allows me to see r-squared, but it is not adjusted.

import pandas as pd  # import the pandas module
import numpy as np
df = pd.read_csv('/Users/jeangelj/Documents/training/linexdata.csv', sep=',')
df
       AverageNumberofTickets   NumberofEmployees   ValueofContract Industry
   0              1                    51                  25750    Retail
   1              9                    68                  25000    Services
   2             20                    67                  40000    Services
   3              1                   124                  35000    Retail
   4              8                   124                  25000    Manufacturing
   5             30                   134                  50000    Services
   6             20                   157                  48000    Retail
   7              8                   190                  32000    Retail
   8             20                   205                  70000    Retail
   9             50                   230                  75000    Manufacturing
  10             35                   265                  50000    Manufacturing
  11             65                   296                  75000    Services
  12             35                   336                  50000    Manufacturing
  13             60                   359                  75000    Manufacturing
  14             85                   403                  81000    Services
  15             40                   418                  60000    Retail
  16             75                   437                  53000    Services
  17             85                   451                  90000    Services
  18             65                   465                  70000    Retail
  19             95                   491                  100000   Services

from sklearn.linear_model import LinearRegression
model = LinearRegression()
X, y = df[['NumberofEmployees','ValueofContract']], df.AverageNumberofTickets
model.fit(X, y)
model.score(X, y)
# 0.87764337132340009

I checked it manually and 0.87764 is R-squared; whereas 0.863248 is the adjusted R-squared.

Answer 1

Answered by Sandipan Dey

There are several ways to compute R^2 and the adjusted R^2; a few of them follow (computed with the data you provided):

from sklearn.linear_model import LinearRegression
model = LinearRegression()
X, y = df[['NumberofEmployees','ValueofContract']], df.AverageNumberofTickets
model.fit(X, y)

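By definition, SST = SSR + SSE, where SST is the total sum of squares, SSR the regression (explained) sum of squares, and SSE the error (residual) sum of squares.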
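The decomposition stated above can be checked numerically without sklearn. The following is a minimal sketch, assuming the 20 rows from the question are typed in by hand and the model is fitted with an intercept using NumPy's least-squares solver; the variable names (`employees`, `contract`, `tickets`, `sst`, `ssr`, `sse`) are illustrative and not part of the original answer.

```python
import numpy as np

# Data copied from the table in the question (20 observations).
employees = np.array([51, 68, 67, 124, 124, 134, 157, 190, 205, 230,
                      265, 296, 336, 359, 403, 418, 437, 451, 465, 491], dtype=float)
contract = np.array([25750, 25000, 40000, 35000, 25000, 50000, 48000, 32000,
                     70000, 75000, 50000, 75000, 50000, 75000, 81000, 60000,
                     53000, 90000, 70000, 100000], dtype=float)
tickets = np.array([1, 9, 20, 1, 8, 30, 20, 8, 20, 50,
                    35, 65, 35, 60, 85, 40, 75, 85, 65, 95], dtype=float)

# Design matrix with an explicit intercept column, then ordinary least squares.
A = np.column_stack([np.ones_like(employees), employees, contract])
coef, *_ = np.linalg.lstsq(A, tickets, rcond=None)
fitted = A @ coef

sst = np.sum((tickets - tickets.mean()) ** 2)  # total sum of squares
ssr = np.sum((fitted - tickets.mean()) ** 2)   # regression (explained) sum of squares
sse = np.sum((tickets - fitted) ** 2)          # error (residual) sum of squares

r_squared = 1 - sse / sst
# The identity SST = SSR + SSE holds (up to rounding) when an intercept is fitted.
print(np.isclose(sst, ssr + sse), r_squared)
```

Note that `sse` here is the quantity called `SS_Residual` in the answer's own snippet, and `sst` is `SS_Total`. The identity relies on the intercept column; it does not hold in general for a fit forced through the origin.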

# compute with formulas from the theory
yhat = model.predict(X)
SS_Residual = sum((y - yhat)**2)       # error sum of squares (SSE)
SS_Total = sum((y - np.mean(y))**2)    # total sum of squares (SST)
r_squared = 1 - SS_Residual / SS_Total
adjusted_r_squared = 1 - (1 - r_squared) * (len(y) - 1) / (len(y) - X.shape[1] - 1)
print(r_squared, adjusted_r_squared)
# approximately 0.877643371323 0.863248473832
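The adjustment in the snippet above corresponds to the usual textbook formula. As a worked check, here it is with the question's numbers plugged in; the symbols n (number of observations), p (number of predictors, excluding the intercept) and \bar{R}^2 (adjusted R^2) are notation introduced here, not taken from the answer.

```latex
R^2 = 1 - \frac{SSE}{SST}, \qquad
\bar{R}^2 = 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1}

% With n = 20, p = 2 and R^2 \approx 0.877643:
\bar{R}^2 \approx 1 - 0.122357 \times \frac{19}{17} \approx 0.863248
```

This agrees with the 0.863248 that the question reports from the manual calculation.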

# compute with sklearn's LinearRegression; I could not find a function in the documentation that returns adjusted R-squared directly
r2 = model.score(X, y)
print(r2, 1 - (1 - r2) * (len(y) - 1) / (len(y) - X.shape[1] - 1))
# approximately 0.877643371323 0.863248473832
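Since the adjustment is applied by hand in both snippets above, it may be convenient to wrap it in a small helper. This is a minimal sketch, not an sklearn function: the name `adjusted_r2` and its parameters are hypothetical, and it assumes the model was fitted with an intercept and that `n_features` counts predictors only.

```python
def adjusted_r2(r_squared, n_samples, n_features):
    """Return adjusted R-squared for a model fitted with an intercept.

    Hypothetical helper: n_features counts predictors, excluding the intercept.
    """
    dof = n_samples - n_features - 1  # residual degrees of freedom
    if dof <= 0:
        raise ValueError("need n_samples > n_features + 1")
    return 1 - (1 - r_squared) * (n_samples - 1) / dof


# Using the R-squared from the question (20 rows, 2 predictors):
adj = adjusted_r2(0.87764337132340009, n_samples=20, n_features=2)
print(round(adj, 6))  # 0.863248
```

With a fitted sklearn model this could be called as `adjusted_r2(model.score(X, y), len(y), X.shape[1])`, which mirrors the expression used in the answer.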

Another way:

# compute with statsmodels, adding the intercept column manually
import statsmodels.api as sm
X1 = sm.add_constant(X)
result = sm.OLS(y, X1).fit()
# print(dir(result))
print(result.rsquared, result.rsquared_adj)
# approximately 0.877643371323 0.863248473832

Yet another way:

# compute with statsmodels, another way, using the formula interface
import statsmodels.formula.api as smf
result = smf.ols(formula="AverageNumberofTickets ~ NumberofEmployees + ValueofContract", data=df).fit()
# print(result.summary())
print(result.rsquared, result.rsquared_adj)
# approximately 0.877643371323 0.863248473832

Answer 2

Answered by Madhushree

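from sklearn.linear_model import LinearRegression
# x_train, y_train: your training features and target (not defined in this answer)
# note: fit_intercept=False fits through the origin; score() returns plain, unadjusted R-squared
regressor = LinearRegression(fit_intercept=False)
regressor.fit(x_train, y_train)
print(f'r_sqr value: {regressor.score(x_train, y_train)}')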
