Python: What is the best way to remove columns in pandas?

Question

Asked by Mohamed Thasin ah

I am asking this question for my own learning. As far as I know, the following are the different ways to remove columns from a pandas DataFrame.

Option - 1:

df=pd.DataFrame({'a':[1,2,3,4,5],'b':[6,7,8,9,10],'c':[11,12,13,14,15]})
del df['a']

Option - 2:

df=pd.DataFrame({'a':[1,2,3,4,5],'b':[6,7,8,9,10],'c':[11,12,13,14,15]})
df=df.drop('a', axis=1)

Option - 3:

df=pd.DataFrame({'a':[1,2,3,4,5],'b':[6,7,8,9,10],'c':[11,12,13,14,15]})
df=df[['b','c']]

Which of these is the best approach?
Are there any other approaches that achieve the same result?

Answer 1

Accepted answer by YaOzI

According to the docs:

DataFrame is a 2-dimensional labeled data structure with columns of potentially different types.

And pandas.DataFrame.drop:

Drop specified labels from rows or columns.

So, I think we should stick with df.drop. Why? I think the pros are:

It gives us more control over the removal:

# This will return a NEW DataFrame object, leave the original `df` untouched.
df.drop('a', axis=1)  
# This will modify the `df` inplace. **And return a `None`**.
df.drop('a', axis=1, inplace=True)

It can handle more complicated cases through its arguments. For example, with level we can handle MultiIndex deletion, and with errors we can prevent some bugs.
It is a more unified and object-oriented way.
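As a rough illustration of the errors and level arguments mentioned above, here is a minimal sketch. The frame contents, the missing label 'zzz', and the names safe, wide and no_a are made up for the example.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6], 'c': [7, 8, 9]})

# With the default errors='raise', dropping a missing label raises KeyError.
try:
    df.drop('zzz', axis=1)
    raised = False
except KeyError:
    raised = True

# errors='ignore' skips labels that are not present and drops the rest.
safe = df.drop(['a', 'zzz'], axis=1, errors='ignore')

# level= targets one level of a MultiIndex, here on the columns.
wide = pd.DataFrame(
    [[1, 2, 3, 4]],
    columns=pd.MultiIndex.from_product([['x', 'y'], ['a', 'b']]),
)
no_a = wide.drop('a', axis=1, level=1)

print(raised)               # True
print(list(safe.columns))   # ['b', 'c']
print(list(no_a.columns))   # [('x', 'b'), ('y', 'b')]
print(list(df.columns))     # ['a', 'b', 'c'] (the original is untouched)
```

Whether errors='ignore' prevents bugs or hides them depends on the pipeline. It seems most useful when the column is known to be optional.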

And just like @jezrael noted in his answer:

Option 1: Using the keyword del is a limited way.

Option 3: And df=df[['b','c']] isn't even a deletion in essence. It first selects data by indexing with the [] syntax, then unbinds the name df from the original DataFrame and binds it to the new one (i.e. df[['b','c']]).
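The rebinding described for Option 3 can be observed by keeping a second name on the original object. This is a small sketch, and the name original is introduced only for the demonstration.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2], 'b': [3, 4], 'c': [5, 6]})
original = df                    # a second name bound to the same object

df = df[['b', 'c']]              # selection builds a new DataFrame and rebinds `df`

print(df is original)            # False
print(list(original.columns))    # ['a', 'b', 'c']

del original['a']                # del mutates the existing object in place
print(list(original.columns))    # ['b', 'c']
```

In practice this means any other reference to the original frame still sees column a after Option 3, whereas del (or drop with inplace=True) changes the object that every reference shares.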

Answer 2

Answer by razmik

The recommended way to delete a column or row from a pandas DataFrame is to use drop.

To delete a column,

df.drop('column_name', axis=1, inplace=True)

To delete a row,

df.drop('row_index', axis=0, inplace=True)
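The same two operations can also be written with drop's columns= and index= keywords instead of axis, which some readers find easier to scan. A minimal sketch follows; the row labels r1 to r3 and the result names are invented for the example.

```python
import pandas as pd

df = pd.DataFrame(
    {'a': [1, 2, 3], 'b': [4, 5, 6], 'c': [7, 8, 9]},
    index=['r1', 'r2', 'r3'],
)

# columns= is equivalent to passing the labels with axis=1.
cols_dropped = df.drop(columns=['a', 'c'])

# index= is equivalent to passing the labels with axis=0.
rows_dropped = df.drop(index='r2')

# Both can be combined in a single call.
both = df.drop(index=['r1'], columns=['b'])

print(list(cols_dropped.columns))   # ['b']
print(list(rows_dropped.index))     # ['r1', 'r3']
print(both.shape)                   # (2, 2)
```

None of these calls uses inplace=True, so df itself keeps all three rows and columns.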

You can refer to this post for a detailed discussion of column deletion approaches.

Answer 3

Answer by aydow

From a speed perspective, option 1 seems to be the best. Obviously, based on the other answers, that doesn't mean it's actually the best option.

In [52]: import timeit

In [53]: s1 = """
    ...: import pandas as pd
    ...: df=pd.DataFrame({'a':[1,2,3,4,5],'b':[6,7,8,9,10],'c':[11,12,13,14,15]})
    ...: del df['a']
    ...: """

In [54]: s2 = """
    ...: import pandas as pd
    ...: df=pd.DataFrame({'a':[1,2,3,4,5],'b':[6,7,8,9,10],'c':[11,12,13,14,15]})
    ...: df=df.drop('a', axis=1)
    ...: """

In [55]: s3 = """
    ...: import pandas as pd
    ...: df=pd.DataFrame({'a':[1,2,3,4,5],'b':[6,7,8,9,10],'c':[11,12,13,14,15]})
    ...: df=df[['b','c']]
    ...: """

In [56]: timeit.timeit(stmt=s1, number=100000)
Out[56]: 53.37321400642395

In [57]: timeit.timeit(stmt=s2, number=100000)
Out[57]: 79.68139410018921

In [58]: timeit.timeit(stmt=s3, number=100000)
Out[58]: 76.25269913673401

Answer 4

Answer by jezrael

In my opinion, options 2 and 3 are the best, because the first has limits: you can remove only one column at a time, and you cannot use dot notation (del df.a).

Option 3 is not deleting but selecting, and piRSquared wrote a nice answer covering multiple possible solutions based on the same idea.
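The limits of del and the selection idea can be sketched as follows. The two selection variants are common idioms chosen for illustration, not necessarily the ones in piRSquared's answer, and all variable names are made up.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2], 'b': [3, 4], 'c': [5, 6]})

# Dot notation reads a column, but deleting through it is expected to fail.
try:
    del df.a
    attribute_delete_worked = True
except AttributeError:
    attribute_delete_worked = False

# del takes one column per statement; drop accepts a list of labels.
without_ab = df.drop(['a', 'b'], axis=1)

# Selection-based alternatives: keep everything except 'a'.
by_mask = df.loc[:, df.columns != 'a']
by_difference = df[df.columns.difference(['a'])]

print(attribute_delete_worked)        # False
print(list(without_ab.columns))       # ['c']
print(list(by_mask.columns))          # ['b', 'c']
print(list(by_difference.columns))    # ['b', 'c']
```

Note that Index.difference returns its result sorted, so the by_difference variant may reorder columns in frames whose columns are not already in sorted order.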
