pandas: combine two columns with null values

Question

Asked by vagabond

I have a df with two columns and I want to combine both columns ignoring the NaN values. The catch is that sometimes both columns have NaN values in which case I want the new column to also have NaN. Here's the example:

df = pd.DataFrame({'foodstuff':['apple-martini', 'apple-pie', None, None, None], 'type':[None, None, 'strawberry-tart', 'dessert', None]})

df
Out[10]:
foodstuff   type
0   apple-martini   None
1   apple-pie   None
2   None    strawberry-tart
3   None    dessert
4   None    None

I tried to use fillna to solve this:

df['foodstuff'].fillna('') + df['type'].fillna('')

and I got:

0      apple-martini
1          apple-pie
2    strawberry-tart
3            dessert
4                   
dtype: object

Row 4 has become an empty string. What I want in this situation is a NaN value, since both of the columns being combined are NaN:

0      apple-martini
1          apple-pie
2    strawberry-tart
3            dessert
4            None       
dtype: object

Answer 1

Answered by root

Use fillna on one column, with the fill values taken from the other column:

df['foodstuff'].fillna(df['type'])

The resulting output:

0      apple-martini
1          apple-pie
2    strawberry-tart
3            dessert
4               None
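
As a rough sketch of what this answer does, the snippet below rebuilds the question's DataFrame and fills the gaps in one column from the other. The names combined and also_combined are only illustrative. Note that this keeps the first non-null value per row and does not concatenate, so if both columns held a value in the same row, only the foodstuff value would survive. combine_first is shown as an equivalent way to state the same rule.

```python
import pandas as pd

df = pd.DataFrame({
    'foodstuff': ['apple-martini', 'apple-pie', None, None, None],
    'type': [None, None, 'strawberry-tart', 'dessert', None],
})

# Take 'foodstuff' where present, otherwise fall back to 'type'
combined = df['foodstuff'].fillna(df['type'])

# combine_first expresses the same "first non-null value wins" rule
also_combined = df['foodstuff'].combine_first(df['type'])

# Only the row where both inputs are missing stays missing
print(combined.isna().tolist())  # [False, False, False, False, True]
```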

Answer 2

Answered by sirfz

You can use the combine method with a lambda:

df['foodstuff'].combine(df['type'], lambda a, b: ((a or "") + (b or "")) or None, None)

(a or "") returns "" if a is None. The same logic is then applied to the concatenation, so the result is None if the concatenation is an empty string.
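
One caveat with the "a or ''" idiom: it works for None, but a float NaN is truthy, so a column holding np.nan instead of None would reach the string concatenation and raise a TypeError. A possible variant, assuming you want both kinds of missing value handled, is to pass combine a small named function that checks pd.isna. The helper name join_non_null is made up for this sketch.

```python
import numpy as np
import pandas as pd

def join_non_null(a, b):
    """Concatenate the non-missing values; return NaN if both are missing."""
    parts = [x for x in (a, b) if not pd.isna(x)]
    return "".join(parts) if parts else np.nan

df = pd.DataFrame({
    'foodstuff': ['apple-martini', 'apple-pie', None, None, None],
    'type': [None, None, 'strawberry-tart', 'dessert', None],
})

# combine calls the function once per row with one value from each Series
combined = df['foodstuff'].combine(df['type'], join_non_null)
print(combined)
```

Unlike the fillna approach, this concatenates the two values when both are present.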

Answer 3

Answered by piRSquared

fillna both columns together
sum(1) to add them
replace('', np.nan)

df.fillna('').sum(1).replace('', np.nan)

0      apple-martini
1          apple-pie
2    strawberry-tart
3            dessert
4                NaN
dtype: object
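
The three steps above can also be sketched with the row-wise sum swapped for a plain string join. This is a substitution on my part, not the answer's method: I am assuming that summing string columns may behave differently across pandas versions and string dtypes, whereas agg with ''.join just concatenates each row's values. It also extends to any number of columns.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'foodstuff': ['apple-martini', 'apple-pie', None, None, None],
    'type': [None, None, 'strawberry-tart', 'dessert', None],
})

# Step 1: missing values become empty strings in every column
filled = df.fillna('')

# Step 2: concatenate the values of each row (stand-in for sum(1))
joined = filled.agg(''.join, axis=1)

# Step 3: rows that were missing in every column are now '', so restore NaN
result = joined.replace('', np.nan)
print(result)
```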

Answer 4

Answered by Vikash Singh

You can always replace the empty strings in the new column with NaN:

import numpy as np

df['new_col'] = df['new_col'].replace(r'^\s*$', np.nan, regex=True)

Complete code:

import pandas as pd
import numpy as np

df = pd.DataFrame({'foodstuff':['apple-martini', 'apple-pie', None, None, None], 'type':[None, None, 'strawberry-tart', 'dessert', None]})

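# Concatenate the two columns, treating missing values as empty strings
df['new_col'] = df['foodstuff'].fillna('') + df['type'].fillna('')

# Turn empty or whitespace-only strings back into NaN (assigned back, because
# inplace=True on a column selection may not update df in newer pandas)
df['new_col'] = df['new_col'].replace(r'^\s*$', np.nan, regex=True)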

df

Output:

    foodstuff   type    new_col
0   apple-martini   None    apple-martini
1   apple-pie   None    apple-pie
2   None    strawberry-tart strawberry-tart
3   None    dessert dessert
4   None    None    NaN
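
To see what the regular expression in this answer matches, here is a minimal sketch on a throwaway Series (the sample values are invented for illustration). The pattern ^\s*$ matches strings that are empty or contain only whitespace, so it catches more than a plain comparison against ''.

```python
import numpy as np
import pandas as pd

# Invented sample: one empty string and one whitespace-only string
s = pd.Series(['apple', '', '   ', 'pie'])

# regex=True makes the first argument a pattern rather than a literal value
cleaned = s.replace(r'^\s*$', np.nan, regex=True)

print(cleaned.isna().tolist())  # [False, True, True, False]
```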

Answer 5

Answered by Mastan Basha Shaik

You can replace the nonzero values with the column names:
df1 = df.replace(1, pd.Series(df.columns, df.columns))
Then replace the 0s with empty strings and concatenate the columns:
f = f.replace(0, '')
f['new'] = f.First + f.Second + f.Three + f.Four

Refer to the full code below.

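import pandas as pd

# Indicator columns (0/1) plus a label column 'cl'
df = pd.DataFrame({'Second':[0,1,0,0],'First':[1,0,0,0],'Three':[0,0,1,0],'Four':[0,0,0,1], 'cl': ['3D', 'Wireless','Accounting','cisco']})
df2 = pd.DataFrame({'pi':['Accounting','cisco','3D','Wireless']})

# Replace each 1 with the name of the column it sits in
df1 = df.replace(1, pd.Series(df.columns, df.columns))
# Match the rows of df1 to df2 by comparing 'cl' against 'pi'
f = pd.merge(df1, df2, how='right', left_on=['cl'], right_on=['pi'])
# Blank out the remaining zeros, then concatenate the indicator columns
f = f.replace(0, '')
f['new'] = f.First + f.Second + f.Three + f.Four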

df1:

In [3]: df1                                                                                                                                                                              
Out[3]: 
   Second  First  Three  Four          cl
0       0  First      0     0          3D
1  Second      0      0     0    Wireless
2       0      0  Three     0  Accounting
3       0      0      0  Four       cisco

df2:

In [4]: df2                                                                                                                                                                              
Out[4]: 
           pi
0  Accounting
1       cisco
2          3D
3    Wireless

Final df will be:

In [2]: f                                                                                                                                                                                
Out[2]: 
   Second  First  Three  Four          cl          pi     new
0          First                       3D          3D   First
1  Second                        Wireless    Wireless  Second
2                 Three        Accounting  Accounting   Three
3                        Four       cisco       cisco    Four
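
For 0/1 indicator columns like these, a different technique from the replace-and-concatenate approach above may be simpler: idxmax(axis=1) returns the name of the column holding the largest value in each row. This is only a sketch under the assumption that each row has at most one 1. The frame flags and its all-zero last row are invented here to show how such rows can be turned into NaN, since idxmax alone would report the first column for them.

```python
import pandas as pd

# Invented indicator frame; the last row has no 1 at all
flags = pd.DataFrame({
    'First':  [1, 0, 0, 0],
    'Second': [0, 1, 0, 0],
    'Three':  [0, 0, 1, 0],
    'Four':   [0, 0, 0, 0],
})

# Name of the column holding the 1 in each row
names = flags.idxmax(axis=1)

# Keep the name only where the row really contains a 1; otherwise NaN
names = names.where(flags.eq(1).any(axis=1))
print(names)
```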
