Note: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow.
Original URL: http://stackoverflow.com/questions/29983946/

Date: 2020-09-13 23:17:53  Source: igfitidea

Concatenate cells into a string with separator pandas python

Tags: python, string, pandas, concatenation

Asked by Bastien

Given the following:

import numpy as np
import pandas as pd
df = pd.DataFrame({'col1' : ["a","b"],
            'col2'  : ["ab",np.nan], 'col3' : ["w","e"]})

I would like to be able to create a column that joins the content of all three columns into one string, separated by the character "*" while ignoring NaN.

so that I would get something like this, for example:

a*ab*w
b*e

Any ideas?

I just realised there are a few additional requirements: I need the method to work with ints and floats, and also to handle special characters (e.g., letters of the Spanish alphabet).

Answered by EdChum

In [68]:

df['new_col'] = df.apply(lambda x: '*'.join(x.dropna().values.tolist()), axis=1)
df
Out[68]:
  col1 col2 col3 new_col
0    a   ab    w  a*ab*w
1    b  NaN    e     b*e
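
Note that `str.join` only accepts strings, so the line above raises a `TypeError` as soon as a row contains a number. The following is a minimal sketch of that failure, using a hypothetical one-row `Series` that is not part of the original answer:

```python
import pandas as pd

# Hypothetical row mixing strings and an int (illustration only)
row = pd.Series(["a", 3, "w"])

try:
    '*'.join(row.dropna().values.tolist())
    failed = False
except TypeError:
    # str.join rejects the int element
    failed = True

# Converting every value to str first makes the join succeed
joined = '*'.join(row.dropna().astype(str))
print(failed, joined)
```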

UPDATE

If you have ints or floats, you can convert these to str first:

In [74]:

df = pd.DataFrame({'col1' : ["a","b",3],
            'col2'  : ["ab",np.nan, 4], 'col3' : ["w","e", 6]})
df
Out[74]:
  col1 col2 col3
0    a   ab    w
1    b  NaN    e
2    3    4    6
In [76]:

df['new_col'] = df.apply(lambda x: '*'.join(x.dropna().astype(str).values), axis=1)
df
Out[76]:
  col1 col2 col3 new_col
0    a   ab    w  a*ab*w
1    b  NaN    e     b*e
2    3    4    6   3*4*6
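
To try this outside an IPython session, here is a minimal self-contained sketch of the same `dropna().astype(str)` chain. The helper name `join_row`, its `sep` parameter, and the `cols` list are assumptions introduced for illustration; they do not appear in the original answer.

```python
import numpy as np
import pandas as pd


def join_row(row, sep="*"):
    """Join the non-missing values of one row into a single string.

    `join_row` and `sep` are hypothetical names used for illustration.
    """
    # dropna() removes NaN; astype(str) makes ints and floats joinable
    return sep.join(row.dropna().astype(str))


df = pd.DataFrame({'col1': ["a", "b", 3],
                   'col2': ["ab", np.nan, 4],
                   'col3': ["w", "e", 6]})

# Restrict to the source columns so a re-run does not also join new_col
cols = ['col1', 'col2', 'col3']
df['new_col'] = df[cols].apply(join_row, axis=1)
print(df['new_col'].tolist())
```

Selecting the source columns explicitly is a design choice in this sketch: calling `df.apply` on the whole frame a second time would otherwise include the previously created `new_col` in the join.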

Another update

In [81]:

df = pd.DataFrame({'col1' : ["a","b",3,'?'],
            'col2'  : ["ab",np.nan, 4,'ü'], 'col3' : ["w","e", 6,'á']})
df
Out[81]:
  col1 col2 col3
0    a   ab    w
1    b  NaN    e
2    3    4    6
3    ?    ü    á

In [82]:

df['new_col'] = df.apply(lambda x: '*'.join(x.dropna().astype(str).values), axis=1)
df
Out[82]:
  col1 col2 col3 new_col
0    a   ab    w  a*ab*w
1    b  NaN    e     b*e
2    3    4    6   3*4*6
3    ?    ü    á   ?*ü*á

My code still works with Spanish characters

Answered by fixxxer

In [1556]: df.apply(lambda x: '*'.join(x.dropna().astype(str).values), axis=1)
Out[1556]: 
0    a*ab*w
1       b*e
2     3*4*?
3     ?*ü*á
dtype: object

Answered by Anish Shah

You can use dropna()

df['col4'] = df.apply(lambda row: '*'.join(row.dropna()), axis=1)

UPDATE:

Since you also need to handle numbers and special characters, you can use astype(unicode) (on Python 2; on Python 3, where unicode no longer exists, use astype(str)):

In [37]: df = pd.DataFrame({'col1': ["a", "b"], 'col2': ["ab", np.nan], "col3": [3, u'\xf3']})

In [38]: df.apply(lambda row: '*'.join(row.dropna().astype(unicode)), axis=1)
Out[38]: 
0    a*ab*3
1       b*ó
dtype: object

In [39]: df['col4'] = df.apply(lambda row: '*'.join(row.dropna().astype(unicode)), axis=1)

In [40]: df
Out[40]: 
  col1 col2 col3    col4
0    a   ab    3  a*ab*3
1    b  NaN    ó     b*ó
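
`unicode` is a Python 2 builtin; in Python 3, `str` is already a Unicode type. Below is a sketch of the same idea for Python 3, reusing the sample data from this answer. It is an adaptation for illustration, not the answerer's own code.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': ["a", "b"],
                   'col2': ["ab", np.nan],
                   'col3': [3, '\xf3']})  # '\xf3' is the character ó

# In Python 3, astype(str) plays the role of astype(unicode)
df['col4'] = df.apply(lambda row: '*'.join(row.dropna().astype(str)), axis=1)
print(df['col4'].tolist())
```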

Answered by Zah

df.apply(lambda row: '*'.join(row.dropna()), axis=1)

Answered by Julien Spronck

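# Python 3 version: xrange -> range, print statement -> print(), and
# df.ix (removed in pandas 1.0) -> df.iloc for positional row access
for row in range(len(df)):
    s = '*'.join(df.iloc[row].dropna().tolist())
    print(s)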