Pandas add new columns based on splitting another column
Original question: http://stackoverflow.com/questions/38956778/
This content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on StackOverflow.
Asked by dagg3r
I have a pandas dataframe like the following:
A B
US,65,AMAZON 2016
US,65,EBAY 2016
My goal is to get it to look like this:
A B country code com
US.65.AMAZON 2016 US 65 AMAZON
US.65.EBAY 2016 US 65 EBAY
I know this question has been asked before here and here, but none of those answers works for me. I have tried:
df['country','code','com'] = df.Field.str.split('.')
and
df2 = pd.DataFrame(df.Field.str.split('.').tolist(),columns = ['country','code','com','A','B'])
Am I missing something? Any help is much appreciated.
Answered by jezrael
You can use str.split with the parameter expand=True and add one more pair of [] on the left side, so that a list of column names is assigned:
df[['country','code','com']] = df.A.str.split(',', expand=True)
Then use str.replace to change , to . in column A:
df.A = df.A.str.replace(',','.')
print (df)
A B country code com
0 US.65.AMAZON 2016 US 65 AMAZON
1 US.65.EBAY 2016 US 65 EBAY
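For reference, here is a self-contained sketch of the two steps above that can be run as-is. The sample frame is rebuilt from the question's data, and passing regex=False is an addition here (it is not in the original answer) so that the replacement is treated as literal text regardless of the pandas version.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({'A': ['US,65,AMAZON', 'US,65,EBAY'], 'B': [2016, 2016]})

# expand=True returns a DataFrame with one column per piece,
# so it can be assigned to a list of new column names.
df[['country', 'code', 'com']] = df['A'].str.split(',', expand=True)

# Replace the separators as literal text (regex=False is an explicit choice here).
df['A'] = df['A'].str.replace(',', '.', regex=False)

print(df)
```

Note that the new columns hold strings, so code contains '65' rather than the integer 65; astype(int) can convert it if a number is needed.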
Another solution uses the DataFrame constructor, provided there are no NaN values:
df[['country','code','com']] = pd.DataFrame([ x.split(',') for x in df['A'].tolist() ])
df.A = df.A.str.replace(',','.')
print (df)
A B country code com
0 US.65.AMAZON 2016 US 65 AMAZON
1 US.65.EBAY 2016 US 65 EBAY
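One thing to watch with the constructor approach: the new frame gets a fresh 0..n-1 index, and pandas generally aligns on index labels when assigning, so the pieces may not line up if df does not have the default index. A minimal sketch of one way around this, assuming a hypothetical frame with index labels 10 and 20; passing index=df.index is an addition here, not part of the original answer.

```python
import pandas as pd

# Hypothetical frame with a non-default index (for example, after filtering rows).
df = pd.DataFrame({'A': ['US,65,AMAZON', 'US,65,EBAY'], 'B': [2016, 2016]},
                  index=[10, 20])

# Reuse df's index so the split pieces line up with the original rows.
parts = pd.DataFrame([x.split(',') for x in df['A'].tolist()],
                     index=df.index,
                     columns=['country', 'code', 'com'])

df[['country', 'code', 'com']] = parts
print(df)
```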
You can also pass the column names to the constructor, but then concat is necessary:
df1=pd.DataFrame([x.split(',') for x in df['A'].tolist()],columns= ['country','code','com'])
df.A = df.A.str.replace(',','.')
df = pd.concat([df, df1], axis=1)
print (df)
A B country code com
0 US.65.AMAZON 2016 US 65 AMAZON
1 US.65.EBAY 2016 US 65 EBAY
Answered by user10451754
This will not give the expected output: it only returns the first character of each df['A'] value, which is 'U'.
Creating the columns from the provided data works fine:
df1=pd.DataFrame([x.split(',') for x in df['A'].tolist()],columns= ['country','code','com'])
A lambda can also be used instead of the for loop.
Answered by Nithin Narla
To get the new columns, I would prefer doing it as follows:
df['Country'] = df['A'].apply(lambda x: x[0])
df['Code'] = df['A'].apply(lambda x: x[1])
df['Com'] = df['A'].apply(lambda x: x[2])
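Because each value in A is a plain string, x[0] picks out a single character rather than a field. A minimal sketch of a variant that splits each string first, assuming the comma-separated input from the question (the intermediate name parts is introduced here for illustration):

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({'A': ['US,65,AMAZON', 'US,65,EBAY'], 'B': [2016, 2016]})

# Split each string once into a list of fields, then pick fields by position.
parts = df['A'].apply(lambda x: x.split(','))
df['Country'] = parts.apply(lambda p: p[0])
df['Code'] = parts.apply(lambda p: p[1])
df['Com'] = parts.apply(lambda p: p[2])

print(df)
```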
As for replacing , with . you can use the following:
df['A'] = df['A'].str.replace(',','.')