Pandas: add new columns by splitting another column

Question

Asked by dagg3r

I have a pandas dataframe like the following:

A              B
US,65,AMAZON   2016
US,65,EBAY     2016

My goal is to get it to look like this:

A              B      country    code    com
US.65.AMAZON   2016   US         65      AMAZON
US.65.EBAY     2016   US         65      EBAY

I know this question has been asked before here and here, but none of the answers work for me. I have tried:

df['country','code','com'] = df.Field.str.split('.')

and

df2 = pd.DataFrame(df.Field.str.split('.').tolist(),columns = ['country','code','com','A','B'])

Am I missing something? Any help is much appreciated.

Answer 1

Answered by jezrael

You can use str.split with the parameter expand=True, and add one more pair of [] on the left side so that a list of column names is assigned:

df[['country','code','com']] = df.A.str.split(',', expand=True)

Then use str.replace to change ',' to '.' in column A:

df.A = df.A.str.replace(',','.')

print (df)
              A     B country code     com
0  US.65.AMAZON  2016      US   65  AMAZON
1    US.65.EBAY  2016      US   65    EBAY
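
As a minimal sketch of why expand=True matters, the snippet below rebuilds the sample frame from the question and compares the two return shapes of str.split. The variable names as_lists and as_frame are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"A": ["US,65,AMAZON", "US,65,EBAY"], "B": [2016, 2016]})

# Without expand, each row holds one Python list, so the result is a single Series.
as_lists = df["A"].str.split(",")

# With expand=True, each list element becomes its own column (labelled 0, 1, 2).
as_frame = df["A"].str.split(",", expand=True)

# A list of names on the left side assigns all three columns in one step.
df[["country", "code", "com"]] = as_frame

print(type(as_lists).__name__, type(as_frame).__name__)
print(df)
```

Note that the new code column holds strings such as '65'; converting it with astype(int) is an optional extra step if numbers are needed.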

Another solution uses the DataFrame constructor, provided there are no NaN values in the column:

df[['country','code','com']] = pd.DataFrame([ x.split(',') for x in df['A'].tolist() ])
df.A = df.A.str.replace(',','.')
print (df)
              A     B country code     com
0  US.65.AMAZON  2016      US   65  AMAZON
1    US.65.EBAY  2016      US   65    EBAY
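
The NaN caveat can be illustrated with a small sketch. It assumes a hypothetical frame in which the second value of A is missing: the list comprehension calls split on a float and fails, while the .str accessor passes the missing value through.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the second row of A is missing.
df = pd.DataFrame({"A": ["US,65,AMAZON", np.nan], "B": [2016, 2016]})

# The list comprehension calls .split on every value; NaN is a float, so this fails.
try:
    pd.DataFrame([x.split(",") for x in df["A"].tolist()])
    comprehension_ok = True
except AttributeError:
    comprehension_ok = False

# The .str accessor skips missing values and leaves that row empty in the result.
parts = df["A"].str.split(",", expand=True)

print(comprehension_ok)
print(parts)
```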

You can also pass the column names to the constructor, but then concat is necessary:

df1=pd.DataFrame([x.split(',') for x in df['A'].tolist()],columns= ['country','code','com'])
df.A = df.A.str.replace(',','.')
df = pd.concat([df, df1], axis=1)
print (df)
              A     B country code     com
0  US.65.AMAZON  2016      US   65  AMAZON
1    US.65.EBAY  2016      US   65    EBAY
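
One point worth checking with the concat variant is the index. pd.concat with axis=1 aligns rows on index labels, so the approach above relies on df having the default 0..n-1 index. The sketch below assumes a hypothetical frame indexed by 10 and 11 and passes index=df.index to the constructor to keep the rows lined up.

```python
import pandas as pd

# Hypothetical frame with a non-default index.
df = pd.DataFrame(
    {"A": ["US,65,AMAZON", "US,65,EBAY"], "B": [2016, 2016]},
    index=[10, 11],
)
rows = [x.split(",") for x in df["A"].tolist()]
names = ["country", "code", "com"]

# Default index 0, 1 does not match 10, 11, so concat produces extra rows.
misaligned = pd.concat([df, pd.DataFrame(rows, columns=names)], axis=1)

# Reusing df.index keeps each split row next to its source row.
aligned = pd.concat([df, pd.DataFrame(rows, columns=names, index=df.index)], axis=1)

print(misaligned.shape)
print(aligned)
```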

Answer 2

Answered by user10451754

Indexing each string directly (for example with x[0]) will not give the expected output: it returns only the first character of each value in df['A'], which is 'U'.

Creating the columns from the split data works as expected:

df1 = pd.DataFrame([x.split(',') for x in df['A'].tolist()], columns=['country', 'code', 'com'])

A lambda can also be used instead of the for loop.
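
The difference between indexing the raw string and indexing the split result can be seen in a short sketch using the sample data from the question. The names first_char and first_field are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"A": ["US,65,AMAZON", "US,65,EBAY"], "B": [2016, 2016]})

# Indexing the raw string returns a single character.
first_char = df["A"].apply(lambda x: x[0])

# Splitting first and then indexing returns the whole field.
first_field = df["A"].apply(lambda x: x.split(",")[0])

print(first_char.tolist())   # ['U', 'U']
print(first_field.tolist())  # ['US', 'US']
```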

Answer 3

Answered by Nithin Narla

To get the new columns, I would prefer to do it as follows:

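# Split on ',' first; indexing the raw string would return single characters.
df['Country'] = df['A'].apply(lambda x: x.split(',')[0])
df['Code'] = df['A'].apply(lambda x: x.split(',')[1])
df['Com'] = df['A'].apply(lambda x: x.split(',')[2])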

As for replacing ',' with '.', you can use the following:

df['A'] = df['A'].str.replace(',','.')
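
A small hedged addition: the default value of the regex parameter of str.replace has differed between pandas versions, so passing regex=False explicitly is one way to make sure the pattern is treated as a literal string. It makes no difference for ',' but would matter if the pattern were a regex metacharacter such as '.'.

```python
import pandas as pd

s = pd.Series(["US,65,AMAZON", "US,65,EBAY"])

# regex=False treats the pattern as a literal string, not a regular expression.
dotted = s.str.replace(",", ".", regex=False)

print(dotted.tolist())  # ['US.65.AMAZON', 'US.65.EBAY']
```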
