Copy and add the last line of a python pandas data frame on to itself with updated index

Source: StackOverflow, http://stackoverflow.com/questions/19184833/. Provided under the CC BY-SA 4.0 license: you are free to use and share it, but you must attribute it to the original authors.

python pandas

Asked by Cenk T

I have a dataframe such as:



2013-07 114.60 89.62 125.64
2013-08 111.55 88.63 121.57
2013-09 108.31 86.24 117.93



The index is a YYYY-MM date series. I would like to copy the last row and add it to the original dataframe under a new, updated index. The new dataframe should look like:



2013-07 114.60 89.62 125.64
2013-08 111.55 88.63 121.57
2013-09 108.31 86.24 117.93
2013-10 108.31 86.24 117.93



How can I do this?

Answered by metakermit

This is how I parsed your data (easy, but you really should have code snippets describing the data in your question):

In [1]: df = pd.read_csv('in.txt', index_col=0, sep=' ', header=None, parse_dates=[0])

In [2]: df
Out[2]:
                 1      2       3
0
2013-07-01  114.60  89.62  125.64
2013-08-01  111.55  88.63  121.57
2013-09-01  108.31  86.24  117.93
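
For readers who do not have an `in.txt` file, the same frame can be rebuilt from an in-memory string. This is only a sketch: the sample text is copied from the question, `io.StringIO` stands in for the file, and the index is converted with an explicit `format='%Y-%m'` rather than `parse_dates` (a deliberate variation on the call above, not the answer's own code).

```python
import io

import pandas as pd

# Sample rows copied from the question; StringIO stands in for 'in.txt'.
raw = """2013-07 114.60 89.62 125.64
2013-08 111.55 88.63 121.57
2013-09 108.31 86.24 117.93
"""

# Read the first column as the index, then convert it to datetimes explicitly.
df = pd.read_csv(io.StringIO(raw), sep=' ', header=None, index_col=0)
df.index = pd.to_datetime(df.index, format='%Y-%m')

print(df)
```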

Now, using concat/append and slicing, you can re-add the last row under a new date with:

In [3]: new_date = pd.to_datetime('2013-10')  # pd.datetools has been removed from pandas

In [4]: new_data = pd.DataFrame(df[-1:].values, index=[new_date], columns=df.columns)

In [5]: df = pd.concat([df, new_data])  # DataFrame.append was removed in pandas 2.0

In [6]: df
Out[6]:
                 1      2       3
2013-07-01  114.60  89.62  125.64
2013-08-01  111.55  88.63  121.57
2013-09-01  108.31  86.24  117.93
2013-10-01  108.31  86.24  117.93
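
A shorter alternative, which is not part of the original answer, is label-based assignment with `.loc`: assigning to a label that does not exist yet enlarges the frame in place. The sketch below assumes a `DatetimeIndex` and builds a small sample frame whose column labels 1 to 3 mirror the output above; `pd.DateOffset(months=1)` is used here to step to the next month instead of typing the date by hand.

```python
import pandas as pd

# Sample frame mirroring the parsed data above (values from the question).
df = pd.DataFrame(
    [[114.60, 89.62, 125.64],
     [111.55, 88.63, 121.57],
     [108.31, 86.24, 117.93]],
    index=pd.to_datetime(['2013-07-01', '2013-08-01', '2013-09-01']),
    columns=[1, 2, 3],
)

# Label one month after the current last label.
new_date = df.index[-1] + pd.DateOffset(months=1)

# Assigning to a missing label appends a row; this modifies df in place.
df.loc[new_date] = df.iloc[-1].to_numpy()

print(df)
```

Unlike `pd.concat`, this changes `df` itself rather than returning a new object, so it suits one-off additions more than repeated ones.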

Note, however, that adding data row by row is not the recommended approach. It is better to append to lower-level structures such as lists and dicts (which are faster at individual appends) and convert the data to a DataFrame in bulk when you actually need to analyse it.
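
The bulk approach might look like the following sketch. The `rows` dict and the hard-coded values are illustrative assumptions; in practice the rows would come from wherever the data is produced.

```python
import pandas as pd

rows = {}  # maps index label -> list of column values

# Cheap per-row additions happen on a plain dict...
rows[pd.Timestamp('2013-07-01')] = [114.60, 89.62, 125.64]
rows[pd.Timestamp('2013-08-01')] = [111.55, 88.63, 121.57]
rows[pd.Timestamp('2013-09-01')] = [108.31, 86.24, 117.93]

# ...including the repeated last row under the next month's label.
last_date = max(rows)
rows[last_date + pd.DateOffset(months=1)] = list(rows[last_date])

# A single conversion at the end instead of one DataFrame copy per added row.
df = pd.DataFrame.from_dict(rows, orient='index', columns=[1, 2, 3])

print(df)
```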

Answered by Cenk T

What I did was:

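# "+ 1" assumes a monthly PeriodIndex (or an integer index); a Timestamp label needs e.g. + pd.DateOffset(months=1)
new_index = REEFDKM.index[-1] + 1
# pd.concat replaces DataFrame.append, which was removed in pandas 2.0
REEFDKM = pd.concat([REEFDKM, pd.DataFrame(index=[new_index], data=REEFDKM.tail(1).values, columns=REEFDKM.columns)])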

So the new last row's index is always derived automatically from the current last row.
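
The same idea can be wrapped in a small helper so that the next index label is computed rather than typed in. This is a sketch, not the asker's code: the function name `repeat_last_row`, its `step` parameter, and the column names `a` and `b` are hypothetical. It assumes `step` is something that can be added to the last index label, such as `1` for a monthly `PeriodIndex` or `pd.DateOffset(months=1)` for a `DatetimeIndex`.

```python
import pandas as pd


def repeat_last_row(df, step):
    """Return a new frame with df's last row repeated under index[-1] + step."""
    new_label = df.index[-1] + step
    new_row = pd.DataFrame(df.tail(1).to_numpy(), index=[new_label], columns=df.columns)
    return pd.concat([df, new_row])


values = {'a': [114.60, 111.55, 108.31], 'b': [89.62, 88.63, 86.24]}

# Monthly PeriodIndex: adding 1 moves to the next month.
by_period = pd.DataFrame(values, index=pd.period_range('2013-07', periods=3, freq='M'))
by_period = repeat_last_row(by_period, step=1)

# DatetimeIndex: use a calendar-aware offset instead of an integer.
by_date = pd.DataFrame(values, index=pd.to_datetime(['2013-07-01', '2013-08-01', '2013-09-01']))
by_date = repeat_last_row(by_date, step=pd.DateOffset(months=1))

print(by_period)
print(by_date)
```

Because the helper returns a new frame, it can be called repeatedly to extend the data by one period at a time, though the earlier caveat about row-by-row growth still applies.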