What does shift mean in a pandas DataFrame?
Original source: http://stackoverflow.com/questions/44675650/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
What is meant by shift in dataframe?
Asked by rithwik kukunuri
I am stuck on the following lines:
import math
import quandl
import pandas as pd
import numpy as np
# cross_validation was removed from scikit-learn; model_selection replaces it
from sklearn import preprocessing, model_selection, svm
from sklearn.linear_model import LinearRegression

# Download the price history and keep only the adjusted columns
df = quandl.get('WIKI/GOOGL')
df = df[['Adj. Open', 'Adj. High', 'Adj. Low', 'Adj. Close', 'Adj. Volume']]
# High-to-close spread and open-to-close change, both in percent
df['HL_PCT'] = (df['Adj. High'] - df['Adj. Close']) / df['Adj. Close'] * 100
df['PCT_CHANGE'] = (df['Adj. Close'] - df['Adj. Open']) / df['Adj. Open'] * 100
df = df[['Adj. Close', 'HL_PCT', 'PCT_CHANGE', 'Adj. Open']]
forecast_col = 'Adj. Close'
df.fillna(-99999, inplace=True)
# Forecast horizon: 10% of the number of rows, rounded up
forecast_out = int(math.ceil(.1 * len(df)))
df['label'] = df[forecast_col].shift(-forecast_out)
print(df.head())
I can't understand what df[forecast_col].shift(-forecast_out) means.
Please explain the command and what it does.
Answered by Akshay Kandul
The shift method of pandas.DataFrame shifts the index by the desired number of periods, with an optional time freq. When no freq is given, the index labels stay in place and the values move relative to them. For further information on shift, please refer to this link.
Here is a small example of column values being shifted:
import pandas as pd
import numpy as np
df = pd.DataFrame({"date": ["2000-01-03", "2000-01-03", "2000-03-05", "2000-01-03", "2000-03-05",
                            "2000-03-05", "2000-07-03", "2000-01-03", "2000-07-03", "2000-07-03"],
                   "variable": ["A", "A", "A", "B", "B", "B", "C", "C", "C", "D"],
                   "no": [1, 2.2, 3.5, 1.5, 1.5, 1.2, 1.3, 1.1, 2, 3],
                   "value": [0.469112, -0.282863, -1.509059, -1.135632, 1.212112, -0.173215,
                             0.119209, -1.044236, -0.861849, None]})
Below are the column values before they are shifted:
df['value']
Output:
0 0.469112
1 -0.282863
2 -1.509059
3 -1.135632
4 1.212112
5 -0.173215
6 0.119209
7 -1.044236
8 -0.861849
9 NaN
With shift, the values are moved by the number of periods given.
For example, using shift with a positive integer moves the row values downwards:
df['value'].shift(1)
Output:
0 NaN
1 0.469112
2 -0.282863
3 -1.509059
4 -1.135632
5 1.212112
6 -0.173215
7 0.119209
8 -1.044236
9 -0.861849
Name: value, dtype: float64
Using shift with a negative integer moves the row values upwards:
df['value'].shift(-1)
Output:
0 -0.282863
1 -1.509059
2 -1.135632
3 1.212112
4 -0.173215
5 0.119209
6 -1.044236
7 -0.861849
8 NaN
9 NaN
Name: value, dtype: float64
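The optional freq mentioned at the start of this answer behaves differently from a plain integer shift. The following is a minimal sketch on a hypothetical daily series (the dates and values are made up for illustration): without freq the values move and the index stays put, while with freq the index moves and the values stay attached to their rows.

```python
import pandas as pd

# Hypothetical daily series, used only for illustration.
idx = pd.date_range("2000-01-03", periods=4, freq="D")
s = pd.Series([1.0, 2.0, 3.0, 4.0], index=idx, name="value")

# Without freq: the index is unchanged and the data moves down one row,
# leaving NaN in the first position.
moved_data = s.shift(1)

# With freq: the data is unchanged and every index label moves one day later.
moved_index = s.shift(1, freq="D")

print(moved_data)
print(moved_index)
```

Note that shifting with freq only makes sense when the index is date-like; the examples above use a plain integer index, so they shift the data instead.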
Answered by Dalia Mokhtar
The code here pulls values from the future. To set up a prediction for 'Adj. Close', it stores in df['label'], for each row, the 'Adj. Close' value found forecast_out rows later, where forecast_out is 10% of the DataFrame's length, rounded up.
forecast_out = int(math.ceil(.1*len(df)))
df['label'] = df[forecast_col].shift(-forecast_out)
If you print df.tail(), you will see NaN in the label column, because the last forecast_out rows have no later value to shift in.
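To see this without downloading any data, here is a small sketch that applies the same two lines to a hypothetical 25-row DataFrame (the prices are invented stand-ins, not real quotes). The final dropna step is a common follow-up and an assumption on my part, not something taken from the question.

```python
import math
import pandas as pd

# Hypothetical stand-in for the price data: 25 rows with made-up values.
df = pd.DataFrame({"Adj. Close": [100.0 + i for i in range(25)]})

forecast_col = "Adj. Close"
forecast_out = int(math.ceil(0.1 * len(df)))  # 10% of 25 rows, rounded up

# Each row's label is the 'Adj. Close' value from forecast_out rows later.
df["label"] = df[forecast_col].shift(-forecast_out)

print(df.head(3))  # labels are the future prices
print(df.tail(3))  # labels are NaN: there is no future data to pull from

# Assumption: rows without a label are dropped before training a model.
trainable = df.dropna(subset=["label"])
print(len(trainable))
```

With 25 rows, forecast_out is 3, so the first label equals the price three rows down and the last three labels are NaN.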