How do you pop multiple columns off a Pandas dataframe, into a new dataframe?
Original question: http://stackoverflow.com/questions/49329569/
Note: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license, cite the original URL, and attribute it to the original authors (not me): Stack Overflow
Asked by Sean McCarthy
Suppose I have the following:
import pandas as pd
df = pd.DataFrame({'a':range(2), 'b':range(2), 'c':range(2), 'd':range(2)})
I'd like to "pop" two columns ('c' and 'd') off the dataframe, into a new dataframe, leaving 'a' and 'b' behind in the original df. The following does not work:
df2 = df.pop(['c', 'd'])
Here's my error:
TypeError: '['c', 'd']' is an invalid key
Does anyone know a quick, classy solution, besides doing the following?
df2 = df[['c', 'd']]
df3 = df[['a', 'b']]
I know the above code is not that tedious to type, but this is why DataFrame.pop was invented: to save us a step when popping one column off a dataframe.
Answered by cs95
This will have to be a two-step process (you cannot get around this because, as rightly mentioned, pop works for a single column and returns a Series).
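To make the limitation concrete, here is a minimal sketch using the question's example frame. It catches `Exception` broadly because the exact exception type raised for a list argument appears to differ between pandas versions; that breadth is an assumption of this sketch, not something the answer states.

```python
import pandas as pd

df = pd.DataFrame({'a': range(2), 'b': range(2), 'c': range(2), 'd': range(2)})

# pop with a single label removes that column in place and returns it as a Series
c = df.pop('c')
remaining = list(df.columns)
print(type(c).__name__)
print(remaining)

# pop with a list of labels fails; the exception type depends on the pandas
# version, so this sketch catches Exception broadly (an assumption)
try:
    df.pop(['a', 'b'])
    list_pop_failed = False
except Exception as exc:
    list_pop_failed = True
    print(type(exc).__name__)
```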
First, slice df (step 1), and then drop those columns (step 2).
df2 = df[['c', 'd']].copy()
df.drop(['c', 'd'], axis=1, inplace=True)  # note: del df[['c', 'd']] fails, because del only accepts a single column label
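If this comes up often, the two steps can be wrapped in a small helper. The sketch below is only an illustration: `pop_columns` is a hypothetical name, not a pandas function, and it assumes every requested column exists in the frame.

```python
import pandas as pd

def pop_columns(df, columns):
    """Hypothetical helper: remove `columns` from `df` in place and
    return them as a new, independent DataFrame."""
    popped = df[columns].copy()             # step 1: slice and copy
    df.drop(columns=columns, inplace=True)  # step 2: drop from the original
    return popped

df = pd.DataFrame({'a': range(2), 'b': range(2), 'c': range(2), 'd': range(2)})
df2 = pop_columns(df, ['c', 'd'])
print(list(df.columns))
print(list(df2.columns))
```

The explicit `.copy()` keeps `df2` independent of `df`, so later changes to one are not reflected in the other.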
And here's the ugly alternative using pd.concat:
df2 = pd.concat([df.pop(x) for x in ['c', 'd']], axis=1)
This is still a two-step process, but you're doing it in one line.
df
   a  b
0  0  0
1  1  1
df2
   c  d
0  0  0
1  1  1
Answered by pault
Here's an alternative, but I'm not sure if it's more classy than your original solution:
df2 = pd.DataFrame([df.pop(x) for x in ['c', 'd']]).T
df3 = pd.DataFrame([df.pop(x) for x in ['a', 'b']]).T
Output:
print(df2)
#    c  d
# 0  0  0
# 1  1  1
print(df3)
#    a  b
# 0  0  0
# 1  1  1
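One caveat that may be worth checking before choosing between the one-liners: as far as I can tell, building a DataFrame from a list of Series and then transposing it can change column dtypes when the popped columns have different types, whereas pd.concat keeps each column's dtype. The sketch below assumes a variant of the question's frame in which 'd' holds floats (that column content and the `make_df` helper are my additions for illustration).

```python
import pandas as pd

def make_df():
    # Assumption for illustration: 'd' holds floats, unlike the
    # all-integer frame in the question, to expose dtype changes
    return pd.DataFrame({'a': range(2), 'b': range(2),
                         'c': range(2), 'd': [0.5, 1.5]})

df = make_df()
original_dtypes = list(df.dtypes[['c', 'd']])

# One-liner based on pd.concat
via_concat = pd.concat([df.pop(x) for x in ['c', 'd']], axis=1)

# One-liner based on building a frame from Series and transposing
df = make_df()
via_transpose = pd.DataFrame([df.pop(x) for x in ['c', 'd']]).T

print(via_concat.dtypes)
print(via_transpose.dtypes)
```

With all-integer columns, as in the question, both approaches should give the same result; the difference only matters for mixed column types.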