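pandas: concatenate multiple columns based on index
Note: this page reproduces a popular StackOverFlow question under the CC BY-SA 4.0 license. If you use it, you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow
Original URL: http://stackoverflow.com/questions/12030398/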
concatenate multiple columns based on index in pandas
Asked by zach
As a follow-up to this post, I would like to concatenate a number of columns based on their index, but I am running into problems. In this example I get an AttributeError related to the map function. Help with this error would be appreciated, as would code that performs the equivalent concatenation of columns.
#data
df = DataFrame({'A':['a','b','c'], 'B':['d','e','f'], 'C':['concat','me','yo'], 'D':['me','too','tambien']})
#row function to concat rows with index greater than 2
def cnc(row):
    temp = []
    for x in range(2,(len(row))):
        if row[x] != None:
            temp.append(row[x])
    return map(concat, temp)
#apply function per row
new = df.apply(cnc,axis=1)
#Expected Output
new
concat me
me too
yo tambien
thanks, zach cp
Answered by DSM
How about something like this?
>>> from pandas import *
>>> df = DataFrame({'A':['a','b','c'], 'B':['d','e','f'], 'C':['concat','me','yo'], 'D':['me','too','tambien']})
>>> df
   A  B       C        D
0  a  d  concat       me
1  b  e      me      too
2  c  f      yo  tambien
>>> df.columns[2:]
Index([C, D], dtype=object)
>>> df[df.columns[2:]]
        C        D
0  concat       me
1      me      too
2      yo  tambien
>>> [' '.join(row) for row in df[df.columns[2:]].values]
['concat me', 'me too', 'yo tambien']
>>> df["new"] = [' '.join(row) for row in df[df.columns[2:]].values]
>>> df
   A  B       C        D         new
0  a  d  concat       me   concat me
1  b  e      me      too      me too
2  c  f      yo  tambien  yo tambien
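For comparison, the row-wise function from the question can be made to work by returning a joined string rather than calling map. This is only a minimal sketch, assuming a recent pandas where positional access on a row goes through .iloc; the name join_from and its parameters are made up for illustration and are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'A': ['a', 'b', 'c'], 'B': ['d', 'e', 'f'],
                   'C': ['concat', 'me', 'yo'], 'D': ['me', 'too', 'tambien']})

def join_from(row, start=2, sep=' '):
    # Hypothetical helper: take the values from position `start` onward
    # and join them into a single string.
    return sep.join(row.iloc[start:])

# Apply the helper to every row; the result is a Series of joined strings.
new = df.apply(join_from, axis=1)
print(new.tolist())
```

The list comprehension over .values shown above avoids the per-row function call, so it is likely the lighter option on large frames.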
If you have None objects floating around, you could handle that too. For example:
>>> df.loc[1, "C"] = None
>>> df
   A  B       C        D
0  a  d  concat       me
1  b  e    None      too
2  c  f      yo  tambien
>>> rows = df[df.columns[2:]].values
In near-English:
>>> new = [' '.join(word for word in row if word is not None) for row in rows]
>>> new
['concat me', 'too', 'yo tambien']
Using filter:
>>> new = [' '.join(filter(None, row)) for row in rows]
>>> new
['concat me', 'too', 'yo tambien']
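The two variants are not fully interchangeable: filter(None, row) drops every falsy value, including empty strings, while the "is not None" test keeps them. A small plain-Python sketch of the difference, using a made-up row:

```python
# Illustrative row containing both an empty string and None.
row = ['concat', '', None, 'me']

# Keeps the empty string, so two separators end up next to each other.
kept_not_none = ' '.join(word for word in row if word is not None)

# Drops both the empty string and None.
kept_truthy = ' '.join(filter(None, row))

print(repr(kept_not_none))
print(repr(kept_truthy))
```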
etc. You could do it in one line but I think it's clearer to separate it.
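As a sketch of the one-line form, here is a version that assumes missing values may appear as either None or NaN. pandas often stores missing entries as NaN, which filter(None, row) would not catch because NaN is truthy, so this variant tests each value with pd.notna instead. The sample data is illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['a', 'b', 'c'], 'B': ['d', 'e', 'f'],
                   'C': ['concat', None, 'yo'], 'D': ['me', 'too', np.nan]})

# Join the columns from position 2 onward, skipping None and NaN in each row.
df['new'] = [' '.join(w for w in row if pd.notna(w))
             for row in df[df.columns[2:]].values]
print(df['new'].tolist())
```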

