Notice: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and authors, and attribute the content to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/31737939/

Split pandas column into two

Tags: python, pandas, dataframe

Asked by Alexis Eggermont

There are other similar questions, but the difference here is that my dataframe already has a lot of columns, only one of which needs to be split.

I have a large dataframe (hundreds of columns, millions of rows). I would like to split one of these columns at the character "|" wherever it appears in the string.

All values have only one "|".

For a fixed length I would do this: df['StateInitial'] = df['state'].str[:2]

I wish I could replace the 2 by string.index("|"), but how do I call the string?

Answered by santon

How about:

import pandas as pd

df = pd.DataFrame(['a|b', 'c|d'])
# Split each string on '|' into a list of pieces.
s = df[0].apply(lambda x: x.split('|'))
df['left'] = s.apply(lambda x: x[0])
df['right'] = s.apply(lambda x: x[1])

Output:

     0 left right
0  a|b    a     b
1  c|d    c     d

Answered by khammel

Here is a one-liner that builds on the answer provided by @santon:

df['left'],df['right'] = zip(*df[0].apply(lambda x: x.split('|')))

>>> df 
     0 left right
0  a|b    a     b
1  c|d    c     d
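
To see why the one-liner works, it may help to look at the zip(*...) step on its own. The sketch below uses a plain Python list in place of the Series of split results (the names rows, left, and right are illustrative only). Unpacking the rows with * and zipping them regroups the first pieces together and the second pieces together, which is what ends up assigned to the two new columns.

```python
# Each inner list stands in for one row's result of x.split('|').
rows = ['a|b'.split('|'), 'c|d'.split('|')]

# zip(*rows) groups the first elements together and the second elements
# together, turning per-row pairs into per-column tuples.
left, right = zip(*rows)

print(left)   # ('a', 'c')
print(right)  # ('b', 'd')
```

Unpacking into exactly two names relies on every value containing exactly one "|", which matches the guarantee stated in the question.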

Answered by Alexander

First, set your new column's values equal to the old column's values.

Next, create a second new column whose values are initially None.

Finally, fill the second column with the part after the "|" for each value that splits into exactly two pieces.

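df['new_col1'] = df['old_col']
df['new_col2'] = None
# Inside apply, x is a plain str, so use x.split('|') rather than x.str.split('|').
parts = df['new_col1'].apply(lambda x: x.split('|'))
valid = parts.apply(len) == 2
df.loc[valid, 'new_col2'] = parts[valid].apply(lambda p: p[1])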
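
As a rough illustration of the "valid values" check, the sketch below runs the same idea on a small made-up frame in which one value has no "|". The column name old_col follows the answer; the sample data and the helper second_part are assumptions for demonstration only.

```python
import pandas as pd

# Made-up sample data: the last value has no '|' delimiter.
df = pd.DataFrame({'old_col': ['a|b', 'c|d', 'e']})

def second_part(value):
    """Return the text after '|' if the value splits into exactly two pieces, else None."""
    pieces = value.split('|')
    return pieces[1] if len(pieces) == 2 else None

# Rows that do not split into two pieces end up with a missing value.
df['new_col2'] = df['old_col'].apply(second_part)
print(df)
```

Since the question states that every value contains exactly one "|", the fallback branch should never trigger on the asker's data; it only guards against unexpected rows.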

Answered by Jimmy Le

If you have a column of strings with the delimiter '|', you can use the following line to split the column:

df['left'], df['right'] = df['combined'].str.split('|', 1).str

LeoRochael has a great in-depth explanation of how this works over on a separate thread: https://stackoverflow.com/a/39358924/11688667
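
Depending on the pandas version, unpacking the .str accessor as above may raise an error or a deprecation warning, and newer releases expect the split limit to be passed as the keyword argument n. A minimal sketch of an alternative follows, assuming a column named combined as in the answer plus a made-up extra column other. It uses expand=True so that str.split returns a DataFrame with one column per piece.

```python
import pandas as pd

# Made-up sample frame; 'other' stands in for the many unrelated columns.
df = pd.DataFrame({'combined': ['a|b', 'c|d'], 'other': [1, 2]})

# n=1 splits at the first '|' only; expand=True returns one column per piece,
# which can be assigned to two new column names at once.
df[['left', 'right']] = df['combined'].str.split('|', n=1, expand=True)

print(df)
```

Only the two new columns are added; the existing columns of the frame are left as they were.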
