How to split a number into separate columns in a pandas DataFrame

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/39217347/

Time: 2020-09-14 01:55:06  Source: igfitidea


Tags: python, pandas, numpy, dataframe, split

Asked by Heisenberg

I have a DataFrame:

import pandas as pd

df = pd.DataFrame({'col1': [100000, 100001, 100002, 100003, 100004]})

     col1    
0   100000    
1   100001
2   100002
3   100003
4   100004

I would like to get the result below:

   col1  col2  col3
0    10    00    00
1    10    00    01
2    10    00    02
3    10    00    03
4    10    00    04

Each row shows the number split into two-digit parts. I guess the number should be converted to a string first, but I have no idea what the next step is. How can I split a number into separate columns?

Answered by benten

# make string version of original column, call it 'col'
df['col'] = df['col1'].astype(str)

# make the new columns using string indexing
df['col1'] = df['col'].str[0:2]
df['col2'] = df['col'].str[2:4]
df['col3'] = df['col'].str[4:6]

# get rid of the extra variable (if you want)
df.drop('col', axis=1, inplace=True)
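
The slicing steps above can be generalized into a small helper. This is only a minimal sketch: `split_digits`, `width`, and `prefix` are hypothetical names that do not appear in the original answer, and the sketch assumes every value has the same number of digits. The resulting columns hold strings, which is what keeps the leading zeros such as '00'.

```python
import pandas as pd


def split_digits(series, width=2, prefix="col"):
    """Split each number in `series` into fixed-width string chunks.

    Hypothetical helper for illustration; assumes all values have
    the same number of digits.
    """
    # Work on the string form so that slices keep leading zeros.
    text = series.astype(str)
    n_digits = int(text.str.len().max())
    # One new column per slice: [0:width], [width:2*width], ...
    chunks = {
        f"{prefix}{i + 1}": text.str[start:start + width]
        for i, start in enumerate(range(0, n_digits, width))
    }
    return pd.DataFrame(chunks, index=series.index)


df = pd.DataFrame({'col1': [100000, 100001, 100002, 100003, 100004]})
result = split_digits(df['col1'])
print(result)
```

If integers are needed afterwards, the columns can be converted with astype(int), at the cost of losing the zero padding.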

Answered by Psidom

One option is to use the extractall() method with the regex (\d{2})(\d{2})(\d{2}), which captures each consecutive pair of digits as a separate column. ?P<col1> names a capture group, and the group names become the column names:

df.col1.astype(str).str.extractall(r"(?P<col1>\d{2})(?P<col2>\d{2})(?P<col3>\d{2})").reset_index(drop=True)

#   col1  col2  col3
# 0   10    00    00
# 1   10    00    01
# 2   10    00    02
# 3   10    00    03
# 4   10    00    04
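
Since each number matches the pattern only once, the same named groups might also be used with str.extract(), which returns one row per input row and keeps the original index, so no reset_index() is needed. This is a sketch of a variation, not the answer's own method; the variable names `pattern` and `parts` are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame({'col1': [100000, 100001, 100002, 100003, 100004]})

# Raw string so that \d is passed to the regex engine unchanged.
pattern = r"(?P<col1>\d{2})(?P<col2>\d{2})(?P<col3>\d{2})"

# str.extract returns the first match per row; named groups become column names.
parts = df['col1'].astype(str).str.extract(pattern)
print(parts)
```

As with the slicing approach, the extracted values are strings, so '00' keeps its leading zero.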