How to pop multiple columns from a Pandas DataFrame into a new DataFrame?

Question

Asked by Sean McCarthy

Suppose I have the following:

import pandas as pd
df = pd.DataFrame({'a': range(2), 'b': range(2), 'c': range(2), 'd': range(2)})

I'd like to "pop" two columns ('c' and 'd') off the dataframe, into a new dataframe, leaving 'a' and 'b' behind in the original df. The following does not work:

df2 = df.pop(['c', 'd'])

Here's my error:

TypeError: '['c', 'd']' is an invalid key

Does anyone know a quick, elegant solution, besides doing the following?

df2 = df[['c', 'd']]
df3 = df[['a', 'b']]

I know the above code is not that tedious to type, but this is why DataFrame.pop was invented: to save us a step when popping one column off a DataFrame.

Answer 1

Answered by cs95

This will have to be a two-step process (you cannot get around this because, as rightly mentioned, pop works on a single column and returns a Series).
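
As a quick illustration of that single-column behaviour, here is a minimal sketch. The name df_demo is only for illustration and is not part of the original question.

```python
import pandas as pd

# Illustrative frame with the same shape as the one in the question.
df_demo = pd.DataFrame({'a': range(2), 'b': range(2), 'c': range(2), 'd': range(2)})

# pop takes one column label, removes that column in place,
# and returns it as a Series (not a DataFrame).
popped = df_demo.pop('d')

print(type(popped).__name__)   # Series
print(list(df_demo.columns))   # ['a', 'b', 'c']
```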

First, slice df (step 1), and then drop those columns (step 2).

df2 = df[['c', 'd']].copy()
df.drop(columns=['c', 'd'], inplace=True)  # note: del df[['c', 'd']] fails, del only accepts a single column label
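
For reference, the two steps can be sketched end to end as below. This version reassigns df instead of using inplace=True, which is a stylistic choice and not something the answer prescribes; the variable cols is introduced only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'a': range(2), 'b': range(2), 'c': range(2), 'd': range(2)})
cols = ['c', 'd']  # columns to move out of df

# Step 1: copy the columns into their own DataFrame.
df2 = df[cols].copy()

# Step 2: drop them from the original.
df = df.drop(columns=cols)

print(list(df.columns))    # ['a', 'b']
print(list(df2.columns))   # ['c', 'd']
```

The .copy() in step 1 makes df2 independent of df, so later changes to one do not affect the other.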

And here's the ugly alternative using pd.concat:

df2 = pd.concat([df.pop(x) for x in ['c', 'd']], axis=1)  # pass axis by keyword; the positional form is not accepted by recent pandas

This is still a two step process, but you're doing it in one line.
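
If this comes up often, the two steps can be wrapped in a small helper. The function below is a hypothetical sketch, not a pandas API: pop_columns is a name made up here, and it assumes every requested column exists in the frame.

```python
import pandas as pd


def pop_columns(frame, columns):
    """Remove `columns` from `frame` in place and return them as a new DataFrame.

    Hypothetical helper, not part of pandas.
    """
    columns = list(columns)
    # Selecting first means a missing label raises KeyError
    # before the original frame is modified.
    popped = frame[columns].copy()
    frame.drop(columns=columns, inplace=True)
    return popped


df = pd.DataFrame({'a': range(2), 'b': range(2), 'c': range(2), 'd': range(2)})
df2 = pop_columns(df, ['c', 'd'])

print(list(df.columns))    # ['a', 'b']
print(list(df2.columns))   # ['c', 'd']
```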

Answer 2

Answered by pault

Here's an alternative, but I'm not sure it's any more elegant than your original solution. Note that after both lines run, every column has been popped and df is left empty:

df2 = pd.DataFrame([df.pop(x) for x in ['c', 'd']]).T
df3 = pd.DataFrame([df.pop(x) for x in ['a', 'b']]).T

Output:

print(df2)
#   c  d
#0  0  0
#1  1  1

print(df3)
#   a  b
#0  0  0
#1  1  1