Pandas version of rbind
Original question: http://stackoverflow.com/questions/14988480/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on StackOverflow.
Asked by N. McA.
In R, you can combine two dataframes by sticking the columns of one onto the bottom of the columns of the other using rbind. In pandas, how do you accomplish the same thing? It seems bizarrely difficult.
Using append results in a horrible mess including NaNs and things for reasons I don't understand. I'm just trying to "rbind" two identical frames that look like this:
EDIT: I was creating the DataFrames in a stupid way, which was causing issues. Append=rbind to all intents and purposes. See answer below.
        0         1       2        3          4          5        6                    7
0   ADN.L  20130220   437.4   442.37   436.5000   441.9000  2775364  2013-02-20 18:47:42
1   ADM.L  20130220  1279.0  1300.00  1272.0000  1285.0000   967730  2013-02-20 18:47:42
2   AGK.L  20130220  1717.0  1749.00  1709.0000  1739.0000   834534  2013-02-20 18:47:43
3  AMEC.L  20130220  1030.0  1040.00  1024.0000  1035.0000  1972517  2013-02-20 18:47:43
4   AAL.L  20130220  1998.0  2014.50  1942.4999  1951.0000  3666033  2013-02-20 18:47:44
5  ANTO.L  20130220  1093.0  1097.00  1064.7899  1068.0000  2183931  2013-02-20 18:47:44
6   ARM.L  20130220   941.5   965.10   939.4250   951.5001  2994652  2013-02-20 18:47:45
But I'm getting something horrible like this:
0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
0 NaN NaN NaN NaN NaN NaN NaN NaN ADN.L 20130220 437.4 442.37 436.5000 441.9000 2775364 2013-02-20 18:47:42
1 NaN NaN NaN NaN NaN NaN NaN NaN ADM.L 20130220 1279.0 1300.00 1272.0000 1285.0000 967730 2013-02-20 18:47:42
2 NaN NaN NaN NaN NaN NaN NaN NaN AGK.L 20130220 1717.0 1749.00 1709.0000 1739.0000 834534 2013-02-20 18:47:43
3 NaN NaN NaN NaN NaN NaN NaN NaN AMEC.L 20130220 1030.0 1040.00 1024.0000 1035.0000 1972517 2013-02-20 18:47:43
4 NaN NaN NaN NaN NaN NaN NaN NaN AAL.L 20130220 1998.0 2014.50 1942.4999 1951.0000 3666033 2013-02-20 18:47:44
5 NaN NaN NaN NaN NaN NaN NaN NaN ANTO.L 20130220 1093.0 1097.00 1064.7899 1068.0000 2183931 2013-02-20 18:47:44
6 NaN NaN NaN NaN NaN NaN NaN NaN ARM.L 20130220 941.5 965.10 939.4250 951.5001 2994652 2013-02-20 18:47:45
0 NaN NaN NaN NaN NaN NaN NaN NaN ADN.L 20130220 437.4 442.37 436.5000 441.9000 2775364 2013-02-20 18:47:42
1 NaN NaN NaN NaN NaN NaN NaN NaN ADM.L 20130220 1279.0 1300.00 1272.0000 1285.0000 967730 2013-02-20 18:47:42
2 NaN NaN NaN NaN NaN NaN NaN NaN AGK.L 20130220 1717.0 1749.00 1709.0000 1739.0000 834534 2013-02-20 18:47:43
3 NaN NaN NaN NaN NaN NaN NaN NaN
And I don't understand why. I'm starting to miss R :(
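One way output like this can arise (not necessarily what happened in the question) is that pandas aligns frames on their column labels when combining them, so labels that look alike but are not equal produce extra columns filled with NaN. A minimal sketch, using made-up frames whose labels differ only in type:

```python
import pandas as pd

# Hypothetical frames: same shape, but the column labels differ in type.
left = pd.DataFrame([[1, 2], [3, 4]], columns=[0, 1])        # integer labels
right = pd.DataFrame([[5, 6], [7, 8]], columns=["0", "1"])   # string labels

# concat aligns on column labels, so nothing lines up and the gaps become NaN.
messy = pd.concat([left, right])
print(messy)

# Giving both frames the same labels restores the rbind-like result.
right.columns = left.columns
clean = pd.concat([left, right], ignore_index=True)
print(clean)
```

Comparing `left.columns` with `right.columns` before combining is a quick way to spot this kind of mismatch.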
Accepted answer by N. McA.
Ah, this is to do with how I created the DataFrame, not with how I was combining them. The long and the short of it is, if you are creating a frame using a loop and a statement that looks like this:
Frame = Frame.append(pandas.DataFrame(data = SomeNewLineOfData))
You must ignore the index:
Frame = Frame.append(pandas.DataFrame(data = SomeNewLineOfData), ignore_index=True)
Otherwise you will have issues later when combining data.
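Note that `DataFrame.append` was deprecated in pandas 1.4 and removed in pandas 2.0. On current versions, the loop described above could be sketched with `pd.concat` instead. The `incoming` list and its values are placeholders standing in for `SomeNewLineOfData`:

```python
import pandas as pd

# Hypothetical stand-in for the lines of data arriving in the loop.
incoming = [
    ["ADN.L", 437.4],
    ["ADM.L", 1279.0],
    ["AGK.L", 1717.0],
]

pieces = []
for line in incoming:
    # Each piece is a one-row frame whose own index is just [0].
    pieces.append(pd.DataFrame([line]))  # list.append, not DataFrame.append

# ignore_index=True discards the repeated 0 labels and renumbers the rows.
frame = pd.concat(pieces, ignore_index=True)
print(frame)
```

Collecting the pieces in a list and concatenating once also avoids copying the growing frame on every iteration.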
Answer by abudis
This worked for me:
import numpy as np
import pandas as pd
dates = np.asarray(pd.date_range('1/1/2000', periods=8))
df1 = pd.DataFrame(np.random.randn(8, 4), index=dates, columns=['A', 'B', 'C', 'D'])
df2 = df1.copy()
df = df1.append(df2)
Yields:
                    A          B          C          D
2000-01-01  -0.327208   0.552500   0.862529   0.493109
2000-01-02   1.039844  -2.141089  -0.781609   1.307600
2000-01-03  -0.462831   0.066505  -1.698346   1.123174
2000-01-04  -0.321971  -0.544599  -0.486099  -0.283791
2000-01-05   0.693749   0.544329  -1.606851   0.527733
2000-01-06  -2.461177  -0.339378  -0.236275   0.155569
2000-01-07  -0.597156   0.904511   0.369865   0.862504
2000-01-08  -0.958300  -0.583621  -2.068273   0.539434
2000-01-01  -0.327208   0.552500   0.862529   0.493109
2000-01-02   1.039844  -2.141089  -0.781609   1.307600
2000-01-03  -0.462831   0.066505  -1.698346   1.123174
2000-01-04  -0.321971  -0.544599  -0.486099  -0.283791
2000-01-05   0.693749   0.544329  -1.606851   0.527733
2000-01-06  -2.461177  -0.339378  -0.236275   0.155569
2000-01-07  -0.597156   0.904511   0.369865   0.862504
2000-01-08  -0.958300  -0.583621  -2.068273   0.539434
If you aren't already using the latest version of pandas, I highly recommend upgrading. It is now possible to operate on DataFrames that contain duplicate indices.
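To see what duplicate indices mean in practice, here is a small sketch that stacks a frame on top of a copy of itself. It uses `pd.concat` (since `DataFrame.append` no longer exists in pandas 2.0 and later) and fixed values instead of random ones so the result is reproducible:

```python
import numpy as np
import pandas as pd

dates = pd.date_range("2000-01-01", periods=3)
df1 = pd.DataFrame(np.arange(6).reshape(3, 2), index=dates, columns=["A", "B"])
df2 = df1.copy()

# Stack the rows of df2 beneath df1; the original date labels are kept.
stacked = pd.concat([df1, df2])

# Every date now appears twice, so the index is no longer unique.
print(stacked.index.is_unique)

# Selecting a duplicated label returns all rows that carry it.
print(stacked.loc[dates[0]])
```

If unique row labels are needed afterwards, passing `ignore_index=True` to `pd.concat` renumbers the rows instead.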
Answer by Bem Ostap
import pandas as pd
import numpy as np
If you have a DataFrame like this:
array = np.random.randint(0, 10, size=(2, 4))
df = pd.DataFrame(array, columns=['A', 'B', 'C', 'D'],
                  index=['10aa', '20bb'])  # some unusual index labels
df
      A  B  C  D
10aa  4  2  4  6
20bb  5  1  0  2
And you want to add a new row, given as a list (or another iterable object):
List = [i**3 for i in range(df.shape[1]) ]
List
[0, 1, 8, 27]
You should transform the list into a dictionary whose keys are the DataFrame's columns, using the zip() function:
Dict = dict( zip(df.columns, List) )
Dict
{'A': 0, 'B': 1, 'C': 8, 'D': 27}
Then you can use the append() method to add the new dictionary:
df = df.append(Dict, ignore_index=True)
df
   A  B  C   D
0  7  5  5   4
1  5  8  4   1
2  0  1  8  27
N.B. the original index labels are dropped.
And yeah, it's not as simple as rbind() in R :(
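On pandas 2.0 and later, where `DataFrame.append` has been removed, the same idea might look like the sketch below. It uses fixed values in place of the random ones, and the label `"30cc"` is made up for illustration:

```python
import pandas as pd

df = pd.DataFrame(
    [[4, 2, 4, 6], [5, 1, 0, 2]],
    columns=["A", "B", "C", "D"],
    index=["10aa", "20bb"],
)
new_row = [i**3 for i in range(df.shape[1])]  # [0, 1, 8, 27]

# Option 1: wrap the list in a one-row frame and concatenate.
# ignore_index=True renumbers the rows, so the old labels are dropped.
row_frame = pd.DataFrame([new_row], columns=df.columns)
renumbered = pd.concat([df, row_frame], ignore_index=True)
print(renumbered)

# Option 2: assign the list under a new label to keep the existing labels.
labelled = df.copy()
labelled.loc["30cc"] = new_row
print(labelled)
```

Option 2 is convenient for a single row; for many rows, building them up in a list and concatenating once is usually the cheaper route.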
Answer by B.Mr.W.
pd.concat will serve the purpose of rbind in R.
import pandas as pd
df1 = pd.DataFrame({'col1': [1,2], 'col2':[3,4]})
df2 = pd.DataFrame({'col1': [5,6], 'col2':[7,8]})
print(df1)
print(df2)
print(pd.concat([df1, df2]))
The output will look like:
   col1  col2
0     1     3
1     2     4
   col1  col2
0     5     7
1     6     8
   col1  col2
0     1     3
1     2     4
0     5     7
1     6     8
If you read the documentation carefully, it also explains other operations such as cbind.
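As a rough illustration of both operations, the sketch below uses the `axis` argument of `pd.concat`. The `extra` frame and its `col3` column are made up for the example:

```python
import pandas as pd

df1 = pd.DataFrame({"col1": [1, 2], "col2": [3, 4]})
df2 = pd.DataFrame({"col1": [5, 6], "col2": [7, 8]})
extra = pd.DataFrame({"col3": [9, 10]})  # hypothetical extra column

# rbind-like: stack rows; ignore_index=True renumbers them 0..3.
rbind_like = pd.concat([df1, df2], ignore_index=True)
print(rbind_like)

# cbind-like: place frames side by side; rows are matched on index labels.
cbind_like = pd.concat([df1, extra], axis=1)
print(cbind_like)
```

Because `axis=1` aligns on the row index rather than on position, frames with different index labels will produce NaN where labels do not match.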

