pandas: Formatting dataframe in appending
Original URL: http://stackoverflow.com/questions/33346904/
Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me): Stack Overflow
Asked by eclairs
I want to append two dataframes:
data1:
a
1 a
2 b
3 c
4 d
5 e
data2:
b
1 f
2 g
3 h
4 i
5 j
output:
1 a
2 b
3 c
4 d
5 e
6 f
7 g
8 h
9 i
10 j
Currently I am using:
all_data = data1.append(data2, ignore_index=True)
This gives me the following result:
a b
1 a
2 b
3 c
4 d
5 e
6 f
7 g
8 h
9 i
10 j
That is, the values end up in different columns. How can I get them in the same column?
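The behavior described above can be reproduced with a small sketch. The frames below are reconstructed from the question (the 1-based index is an assumption taken from the printed data), and `pd.concat` stands in for `DataFrame.append`, which was removed in pandas 2.0. Both align rows on column names, so frames whose column names differ end up side by side, with the gaps filled by NaN.

```python
import pandas as pd

# Frames reconstructed from the question; the 1..5 index is an assumption.
data1 = pd.DataFrame({"a": list("abcde")}, index=range(1, 6))
data2 = pd.DataFrame({"b": list("fghij")}, index=range(1, 6))

# Concatenation aligns on column names: "a" and "b" differ,
# so the result keeps both columns and pads the gaps with NaN.
all_data = pd.concat([data1, data2], ignore_index=True)
print(all_data)
```

The result has ten rows and two columns, which matches the layout the question complains about.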
I also tried converting the dataframes into lists and then appending them, but that gave me the error:
TypeError: append() takes no keyword arguments
Also, is there any other function to remove duplicates from a dataframe of strings? The drop_duplicates() function does not work in my case; the data still has duplicates.
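The question does not show the data that `drop_duplicates()` fails on, so the actual cause is unknown. One common reason, assumed here purely for illustration, is that strings which look identical differ in surrounding whitespace or letter case. The sketch below uses made-up values and a hypothetical column name `a`, and normalizes the strings before dropping duplicates. Note also that `drop_duplicates()` returns a new dataframe rather than modifying the original, so the result has to be assigned.

```python
import pandas as pd

# Hypothetical data: the strings look alike but are not equal.
df = pd.DataFrame({"a": ["x", "x ", "X", "y"]})

# Without normalization, all four rows count as distinct.
print(len(df.drop_duplicates()))

# Strip whitespace and lowercase, then drop duplicates and keep the result.
df["a"] = df["a"].str.strip().str.lower()
deduped = df.drop_duplicates().reset_index(drop=True)
print(deduped)
```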
Answered by jrjc
You need to change one column name so that append can detect what you want to do:
data2.columns = ["a"]
or
data1.columns = ["b"]
And then, after using data2.columns = ["a"]:
all_data = data1.append(data2, ignore_index=True)
all_data
a
0 a
1 b
2 c
3 d
4 e
5 f
6 g
7 h
8 i
9 j
The resulting column is named after data1's column, which you can rename if you want:
all_data.columns = ["Foo"]
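In recent pandas versions `DataFrame.append` is no longer available (it was removed in pandas 2.0), so the same idea can be sketched with `pd.concat`. The frames are reconstructed from the question, and `rename` is used here instead of assigning to `.columns` so that `data2` itself is left unchanged; that is a design choice of this sketch, not part of the answer.

```python
import pandas as pd

data1 = pd.DataFrame({"a": list("abcde")})
data2 = pd.DataFrame({"b": list("fghij")})

# Give data2 the same column name as data1 so the rows line up in one column.
all_data = pd.concat(
    [data1, data2.rename(columns={"b": "a"})],
    ignore_index=True,
)

# Optionally rename the combined column afterwards.
all_data.columns = ["Foo"]
print(all_data)
```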
Answered by Zero
merge or concat work on keys. In this case, there are no common columns. However, why not use numpy append and create the dataframe from the result?
In [68]: pd.DataFrame(pd.np.append(data1.values, data2.values), columns=['A'])
Out[68]:
A
0 a
1 b
2 c
3 d
4 e
5 f
6 g
7 h
8 i
9 j
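The `pd.np` alias used above is not available in recent pandas versions, so a sketch of the same approach imports NumPy directly. `np.append` called without an `axis` argument flattens both inputs before joining them, which is why two single-column frames collapse into one 1-D array. The frames are reconstructed from the question.

```python
import numpy as np
import pandas as pd

data1 = pd.DataFrame({"a": list("abcde")})
data2 = pd.DataFrame({"b": list("fghij")})

# With no axis given, np.append flattens the (5, 1) arrays into a (10,) array.
values = np.append(data1.to_numpy(), data2.to_numpy())
all_data = pd.DataFrame(values, columns=["A"])
print(all_data)
```

Because this route goes through raw arrays, the original column names and index are discarded, which is what makes it work when the frames share no column names.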
Answered by Nader Hisham
df1.columns = ['b']
Out[78]:
b
0 a
1 b
2 c
3 d
4 e
pd.concat([df1, df2], ignore_index=True)
Out[80]:
b
0 a
1 b
2 c
3 d
4 e
5 f
6 g
7 h
8 i
9 j

