pandas: joining cells into a single string with a separator in Python

Question

Asked by Bastien

Given the following:

import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': ["a", "b"],
                   'col2': ["ab", np.nan],
                   'col3': ["w", "e"]})

I would like to be able to create a column that joins the content of all three columns into one string, separated by the character "*" while ignoring NaN.

The result would look something like this:

a*ab*w
b*e

Any ideas?

I just realised there are a few additional requirements: the method needs to work with ints and floats, and it must also handle special characters (e.g., letters of the Spanish alphabet).

Answer 1

Answered by EdChum

In [68]:

df['new_col'] = df.apply(lambda x: '*'.join(x.dropna().values.tolist()), axis=1)
df
Out[68]:
  col1 col2 col3 new_col
0    a   ab    w  a*ab*w
1    b  NaN    e     b*e

UPDATE

If you have ints or floats, you can convert them to str first:

In [74]:

df = pd.DataFrame({'col1' : ["a","b",3],
            'col2'  : ["ab",np.nan, 4], 'col3' : ["w","e", 6]})
df
Out[74]:
  col1 col2 col3
0    a   ab    w
1    b  NaN    e
2    3    4    6
In [76]:

df['new_col'] = df.apply(lambda x: '*'.join(x.dropna().astype(str).values), axis=1)
df
Out[76]:
  col1 col2 col3 new_col
0    a   ab    w  a*ab*w
1    b  NaN    e     b*e
2    3    4    6   3*4*6

Another update

In [81]:

df = pd.DataFrame({'col1' : ["a","b",3,'ñ'],
            'col2'  : ["ab",np.nan, 4,'ü'], 'col3' : ["w","e", 6,'á']})
df
Out[81]:
  col1 col2 col3
0    a   ab    w
1    b  NaN    e
2    3    4    6
3    ñ    ü    á

In [82]:

df['new_col'] = df.apply(lambda x: '*'.join(x.dropna().astype(str).values), axis=1)
df
Out[82]:
  col1 col2 col3 new_col
0    a   ab    w  a*ab*w
1    b  NaN    e     b*e
2    3    4    6   3*4*6
3    ñ    ü    á   ñ*ü*á

My code still works with Spanish characters.
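
For reference, the steps above can be collected into one standalone script. This is only a sketch of the same dropna / astype(str) / join approach. The helper name join_row and its sep parameter are made up for illustration, and the sample letters are just examples.

```python
import numpy as np
import pandas as pd


def join_row(row, sep="*"):
    """Join the non-missing values of one row into a single string."""
    # dropna() removes NaN/None; astype(str) lets ints and floats be joined too
    return sep.join(row.dropna().astype(str))


df = pd.DataFrame({
    "col1": ["a", "b", 3, "ñ"],
    "col2": ["ab", np.nan, 4, "ü"],
    "col3": ["w", "e", 6, "á"],
})

# axis=1 passes each row to join_row as a Series
df["new_col"] = df.apply(join_row, axis=1)
print(df["new_col"].tolist())
```

Naming the function rather than using a lambda makes it easy to pass a different separator, for example df.apply(join_row, axis=1, sep="-").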

Answer 2

Answered by fixxxer

In [1556]: df.apply(lambda x: '*'.join(x.dropna().astype(str).values), axis=1)
Out[1556]: 
0    a*ab*w
1       b*e
2     3*4*6
3     ñ*ü*á
dtype: object

Answer 3

Answered by Anish Shah

You can use dropna():

df['col4'] = df.apply(lambda row: '*'.join(row.dropna()), axis=1)

UPDATE:

Since you also need to convert numbers and special characters, you can use astype(unicode) (this is Python 2; on Python 3 the unicode type is gone and astype(str) plays the same role):

In [37]: df = pd.DataFrame({'col1': ["a", "b"], 'col2': ["ab", np.nan], "col3": [3, u'\xf3']})

In [38]: df.apply(lambda row: '*'.join(row.dropna().astype(unicode)), axis=1)
Out[38]: 
0    a*ab*3
1       b*ó
dtype: object

In [39]: df['col4'] = df.apply(lambda row: '*'.join(row.dropna().astype(unicode)), axis=1)

In [40]: df
Out[40]: 
  col1 col2 col3    col4
0    a   ab    3  a*ab*3
1    b  NaN    ó     b*ó
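
The session above is Python 2. A rough Python 3 equivalent, assuming only that str stands in for unicode, might look like this:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "col1": ["a", "b"],
    "col2": ["ab", np.nan],
    "col3": [3, "\xf3"],  # "\xf3" is the character ó
})

# On Python 3, str is already a Unicode type, so it replaces unicode here
df["col4"] = df.apply(lambda row: "*".join(row.dropna().astype(str)), axis=1)
print(df["col4"].tolist())
```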

Answer 4

Answered by Zah

df.apply(lambda row: '*'.join(row.dropna()), axis=1)
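
Note that this form assumes every non-missing cell is already a string. The sketch below, using a made-up two-row frame, shows what happens when a row contains integers: str.join raises a TypeError, which is why the other answers add astype(str).

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: the second row holds integers
df = pd.DataFrame({"col1": ["a", 3], "col2": ["ab", np.nan], "col3": ["w", 6]})

try:
    df.apply(lambda row: "*".join(row.dropna()), axis=1)
    failed = False
except TypeError:
    # str.join rejects the integer 3 in the second row
    failed = True

print(failed)
```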

Answer 5

Answered by Julien Spronck

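# Python 3 / current pandas: df.ix was removed in pandas 1.0, so select rows by position with iloc
for row in range(len(df)):
    s = '*'.join(df.iloc[row].dropna().tolist())
    print(s)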
