Python Pandas - Deleting multiple series from a data frame in one command

Question

Asked by Grant M.

In short ... I have a Python Pandas data frame that is read in from an Excel file using 'read_table'. I would like to keep a handful of the series from the data, and purge the rest. I know that I can just delete what I don't want one-by-one using 'del data['SeriesName']', but what I'd rather do is specify what to keep instead of specifying what to delete.

If the simplest answer is to copy the existing data frame into a new data frame that only contains the series I want, and then delete the existing frame in its entirety, I would be satisfied with that solution ... but if that is indeed the best way, can someone walk me through it?

TIA ... I'm a newb to Pandas. :)

Answer 1

Answered by Zelazny7

You can use the DataFrame drop function to remove columns. You have to pass the axis=1 option for it to work on columns and not rows. Note that it returns a copy, so you have to assign the result to a new DataFrame:

In [1]: from pandas import *

In [2]: df = DataFrame(dict(x=[0,0,1,0,1], y=[1,0,1,1,0], z=[0,0,1,0,1]))

In [3]: df
Out[3]:
   x  y  z
0  0  1  0
1  0  0  0
2  1  1  1
3  0  1  0
4  1  0  1

In [4]: df = df.drop(['x','y'], axis=1)

In [5]: df
Out[5]:
   z
0  0
1  0
2  1
3  0
4  1
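
As a side note, more recent pandas releases also accept a columns= keyword on drop, which avoids having to remember axis=1. The following is only a minimal sketch, assuming a pandas version that supports columns= (0.21 or later); the names trimmed and missing_ok are made up for this illustration.

```python
import pandas as pd

# Same toy frame as in the session above.
df = pd.DataFrame({"x": [0, 0, 1, 0, 1],
                   "y": [1, 0, 1, 1, 0],
                   "z": [0, 0, 1, 0, 1]})

# columns= is equivalent to passing the labels together with axis=1.
trimmed = df.drop(columns=["x", "y"])

# drop returns a new frame; the original still has all three columns.
print(list(df.columns))
print(list(trimmed.columns))

# errors="ignore" skips labels that are not present instead of raising KeyError.
missing_ok = df.drop(columns=["x", "not_a_column"], errors="ignore")
print(list(missing_ok.columns))
```

Because nothing is modified in place, the result has to be bound to a name (a new one, or df itself) for the change to stick.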

Answer 2

Answered by Theodros Zelleke

Basically the same as Zelazny7's answer -- just specifying what to keep:

In [68]: df
Out[68]: 
   x  y  z
0  0  1  0
1  0  0  0
2  1  1  1
3  0  1  0
4  1  0  1

In [70]: df = df[['x','z']]

In [71]: df
Out[71]: 
   x  z
0  0  0
1  0  0
2  1  1
3  0  0
4  1  1

Edit

You can specify a large number of columns by indexing or slicing the DataFrame.columns object.
This object, of type pandas.Index, can be viewed as a list-like sequence of column labels (with some extended functionality).

See this extension of above examples:

In [4]: df.columns
Out[4]: Index([x, y, z], dtype=object)

In [5]: df[df.columns[1:]]
Out[5]: 
   y  z
0  1  0
1  0  0
2  1  1
3  1  0
4  0  1

In [7]: df.drop(df.columns[1:], axis=1)
Out[7]: 
   x
0  0
1  0
2  1
3  0
4  1
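
If the list of columns to keep is known up front, the columns to drop can be derived from it rather than spelled out. This is only a minimal sketch on the same toy frame; keep, kept, to_drop and dropped are illustrative names. Note that Index.difference returns its result sorted, so the order of to_drop may differ from the column order in the frame.

```python
import pandas as pd

df = pd.DataFrame({"x": [0, 0, 1, 0, 1],
                   "y": [1, 0, 1, 1, 0],
                   "z": [0, 0, 1, 0, 1]})

keep = ["x", "z"]

# Option 1: select the wanted columns directly.
# .copy() makes the result independent of the original frame.
kept = df[keep].copy()

# Option 2: compute every label that is not in keep, then drop those.
to_drop = df.columns.difference(keep)
dropped = df.drop(to_drop, axis=1)

# Both routes end up with the same columns and values.
print(kept.equals(dropped))
```

Option 1 is usually the shorter way to say "keep only these"; option 2 can be handy when the drop list is needed elsewhere.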

Answer 3

Answered by oW_

You can also specify a list of columns to keep with the usecols option in pandas.read_table. This speeds up the loading process as well.
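
As a rough illustration of usecols, the sketch below reads a small tab-separated table from an in-memory string instead of a real file. The column names and the data are invented for the example.

```python
import io
import pandas as pd

# Stand-in for a tab-separated file on disk
# (read_table's default separator is a tab).
raw = "a\tb\tc\td\n1\t2\t3\t4\n5\t6\t7\t8\n"

# Only the listed columns are parsed; the others never enter the frame.
data = pd.read_table(io.StringIO(raw), usecols=["a", "c"])
print(data)
```

With a real file, the path would take the place of the io.StringIO object.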
