pandas: appending arrays to a DataFrame (Python)
Original source: http://stackoverflow.com/questions/48420684/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Appending arrays to dataframe (python)
Asked by IndigoChild
I ran a time series model on a small sales data set and forecast sales for the next 12 periods, using the following code:
mod1=ARIMA(df1, order=(2,1,1)).fit(disp=0,transparams=True)
y_future=mod1.forecast(steps=12)[0]
Here df1 contains the sales values, indexed by month. I'm currently storing the predicted values like this:
pred.append(y_future)
Now I need to append the forecasted values to the original dataset df1, preferably with the same kind of index. I'm trying to use the following code:
df1.append(pred, ignore_index=False)
But I'm getting the following error:
TypeError: cannot concatenate a non-NDFrame object
I've tried converting the pred variable to a list and then appending it, but to no avail. Any help will be appreciated. Thanks.
Answered by saloua
One solution is to add the new array as the last row of your DataFrame using df.loc:
df.loc[len(df)] = your_array
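A minimal sketch of this approach, assuming a DataFrame with the default integer index and an array holding one value per column. The column names and values here are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical two-column frame with the default integer index (0, 1)
df = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0]})
your_array = np.array([5.0, 6.0])  # one value per column

# len(df) is the first unused integer label, so this adds a new last row
df.loc[len(df)] = your_array
```

Note that this relies on the index being 0..n-1. If the label len(df) already exists in the index, the assignment overwrites that row instead of adding one.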
However, this is inefficient if you need to do it repeatedly, because the length of the DataFrame has to be computed for each new append.
A better solution is to build a dictionary of the values you need to add and append it to the DataFrame:
df = df.append(dict(zip(df.columns, your_array)), ignore_index=True)
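DataFrame.append was removed in pandas 2.0, so on recent versions the same idea can be written with pd.concat. This is a sketch under that assumption, not part of the original answer, and the frame and array are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical frame and array, one array value per column
df = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0]})
your_array = np.array([5.0, 6.0])

# Build a one-row frame from the column -> value mapping, then concatenate
new_row = pd.DataFrame([dict(zip(df.columns, your_array))])
df = pd.concat([df, new_row], ignore_index=True)
```

As with append, pd.concat returns a new DataFrame, so the result has to be assigned back to df.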
Answered by Tarik Kranda
You can collect your results in a list of dictionaries and then append that list to the data frame.
Let's assume that you want to append your ARIMA forecast results to the end of the actual data frame, which has two columns: "datetime" (YYYY-MM-DD) and "value".
Steps to follow
- First find the latest day in the datetime column of your actual data frame and convert it to a datetime. It is the starting point for assigning future dates to the forecast results.
- Create an empty list of dictionaries and fill it inside a loop, moving the date forward by 1 day for each forecasted result and storing the result alongside it.
- Append that list of dictionaries to your data frame. Remember to assign the result back to the same variable, since append returns a new data frame rather than modifying the original.
- Reset the index of your data frame.
Code
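from datetime import timedelta

# Latest date in the actual data; assumes the 'datetime' column holds datetime64 values
dtLastDay = dfActualData['datetime'].max().to_pydatetime()
listdict = []
for i in range(len(results)):
    # Each forecast is dated one more day past the last observed date
    forecastedDate = dtLastDay + timedelta(days=i + 1)
    listdict.append({'datetime': forecastedDate, 'value': results[i]})
# append returns a new DataFrame, so reassign it (DataFrame.append was removed in pandas 2.0)
dfActualData = dfActualData.append(listdict, ignore_index=True)
# reset_index also returns a new DataFrame, so its result must be assigned as well
dfActualData = dfActualData.reset_index(drop=True)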
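For pandas 2.0 and later, where DataFrame.append is no longer available, the steps above can be sketched with pd.concat. The sample data and the results list are hypothetical stand-ins for the real sales data and the ARIMA forecast.

```python
from datetime import timedelta

import pandas as pd

# Hypothetical actual data: three daily observations
dfActualData = pd.DataFrame({
    "datetime": pd.to_datetime(["2018-01-01", "2018-01-02", "2018-01-03"]),
    "value": [10.0, 11.0, 12.0],
})
results = [13.0, 14.0]  # stand-in for the forecast array

# Step 1: latest observed date
dtLastDay = dfActualData["datetime"].max()

# Step 2: one dict per forecast, dated on consecutive days after the last one
listdict = [
    {"datetime": dtLastDay + timedelta(days=i + 1), "value": value}
    for i, value in enumerate(results)
]

# Steps 3 and 4: concatenate and renumber the index in a single call
dfActualData = pd.concat(
    [dfActualData, pd.DataFrame(listdict)], ignore_index=True
)
```

Passing ignore_index=True already yields a fresh 0..n-1 index, so a separate reset_index call is not needed in this variant.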