Split pandas column into two
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep it under the same license, cite the original URL and authors, and attribute it to the original authors (not me): StackOverFlow
Original source: http://stackoverflow.com/questions/31737939/
Asked by Alexis Eggermont
There are other similar questions, but the difference here is that my dataframe already has a lot of columns, only one of which needs to be split.
I have a large dataframe (hundreds of columns, millions of rows). I would like to split one of these columns at the point where the character "|" occurs in the string.
All values have only one "|".
For a fixed length I would do this: df['StateInitial'] = df['state'].str[:2]
I wish I could replace the 2 by string.index("|"), but how do I call the string?
Answered by santon
How about:
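import pandas as pd

df = pd.DataFrame(['a|b', 'c|d'])
# Split each string on '|' into a two-element list
s = df[0].apply(lambda x: x.split('|'))
df['left'] = s.apply(lambda x: x[0])   # part before '|'
df['right'] = s.apply(lambda x: x[1])  # part after '|'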
Output:
     0 left right
0  a|b    a     b
1  c|d    c     d
Answered by khammel
Here is a one-liner that builds on the answer provided by @santon:
df['left'], df['right'] = zip(*df[0].apply(lambda x: x.split('|')))
>>> df
     0 left right
0  a|b    a     b
1  c|d    c     d
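The one-liner relies on zip(*rows) turning a sequence of pairs into two tuples, one per new column. A minimal sketch in plain Python, using made-up sample values in place of the real column:

```python
# Each inner list stands for what x.split('|') returns for one row
# (sample values, not taken from a real dataframe).
rows = [['a', 'b'], ['c', 'd']]

# zip(*rows) groups the first items together and the second items together.
left, right = zip(*rows)

print(left)   # ('a', 'c')
print(right)  # ('b', 'd')
```

Unpacking into exactly two names assumes every row splits into two pieces, which holds here because each value contains a single "|".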
Answered by Alexander
First, set your new column values equal to the old column values.
Next, create a new column with values initially equal to None.
Now, update the new column with the valid values from the first.
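df['new_col1'] = df['old_col']
df['new_col2'] = None
# Inside apply, x is a plain str, so call x.split('|') directly
# (the .str accessor exists only on a Series).
df['new_col2'] = df['new_col1'].apply(
    lambda x: x.split('|')[1] if len(x.split('|')) == 2 else None)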
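The same idea, filling the second column only where the value really splits in two, can also be sketched with a boolean mask instead of apply. The sample data below is hypothetical, and the last row deliberately has no delimiter:

```python
import pandas as pd

# Hypothetical sample data: the last row has no '|' delimiter.
df = pd.DataFrame({'old_col': ['a|b', 'c|d', 'e']})

df['new_col1'] = df['old_col']
df['new_col2'] = None

# True for rows whose value contains exactly one '|'.
has_two = df['new_col1'].str.count(r'\|') == 1

# Fill new_col2 only for those rows; the other rows stay missing.
df.loc[has_two, 'new_col2'] = df.loc[has_two, 'new_col1'].str.split('|').str[1]

print(df)
```

Rows without a delimiter are left untouched, so they can be inspected or cleaned separately.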
Answered by Jimmy Le
If you have a column of strings with a delimiter '|', you can use the following line to split the column:
df['left'], df['right'] = df['combined'].str.split('|', 1).str
LeoRochael has a great in-depth explanation of how this works over on a separate thread: https://stackoverflow.com/a/39358924/11688667
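The tuple-unpacking line in this answer depends on iterating over the .str accessor, which may not run on newer pandas releases. A sketch of an alternative that uses the expand=True option of str.split; the column names 'combined' and 'other' and the sample values are assumptions for illustration:

```python
import pandas as pd

# Sample frame; 'combined' and 'other' are hypothetical column names.
df = pd.DataFrame({'combined': ['a|b', 'c|d'], 'other': [1, 2]})

# n=1 splits at the first '|' only; expand=True returns a DataFrame
# with one column per piece, which is assigned to two new columns at once.
df[['left', 'right']] = df['combined'].str.split('|', n=1, expand=True)

print(df)
```

Only the new columns are added, so the remaining columns of a wide dataframe are left as they were.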

