Python: Deleting multiple columns by column name in Pandas

Question

Asked by Peadar Coyle

I have some data, and when I import it I get the following unneeded columns. I'm looking for an easy way to delete all of them:

   'Unnamed: 24', 'Unnamed: 25', 'Unnamed: 26', 'Unnamed: 27',
   'Unnamed: 28', 'Unnamed: 29', 'Unnamed: 30', 'Unnamed: 31',
   'Unnamed: 32', 'Unnamed: 33', 'Unnamed: 34', 'Unnamed: 35',
   'Unnamed: 36', 'Unnamed: 37', 'Unnamed: 38', 'Unnamed: 39',
   'Unnamed: 40', 'Unnamed: 41', 'Unnamed: 42', 'Unnamed: 43',
   'Unnamed: 44', 'Unnamed: 45', 'Unnamed: 46', 'Unnamed: 47',
   'Unnamed: 48', 'Unnamed: 49', 'Unnamed: 50', 'Unnamed: 51',
   'Unnamed: 52', 'Unnamed: 53', 'Unnamed: 54', 'Unnamed: 55',
   'Unnamed: 56', 'Unnamed: 57', 'Unnamed: 58', 'Unnamed: 59',
   'Unnamed: 60'

The columns are 0-indexed, so I tried something like:

    df.drop(df.columns[[22, 23, 24, 25, 
    26, 27, 28, 29, 30, 31, 32 ,55]], axis=1, inplace=True)

But this isn't very efficient. I tried writing some for loops, but that struck me as bad Pandas practice. Hence I ask the question here.

I've seen some similar examples (Drop multiple columns pandas), but they don't answer my question.

Answer 1

Accepted answer by EdChum

I don't know what you mean by inefficient, but if you mean in terms of typing, it could be easier to just select the columns of interest and assign the result back to the df:

df = df[cols_of_interest]

Where cols_of_interest is a list of the columns you care about.

Or you can slice the columns and pass this to drop:

df.drop(df.loc[:, 'Unnamed: 24':'Unnamed: 60'].head(0).columns, axis=1)

The call to head just selects 0 rows, as we're only interested in the column names rather than the data.
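
As a rough illustration of the label-slice approach above, here is a minimal sketch. The sample frame and its column names are made up for the example, and it uses `.loc` for the label-based slice. Note that a label slice includes both endpoints, and that taking `.columns` directly already yields only the names, so the `head(0)` step is optional.

```python
import pandas as pd

# Hypothetical sample frame standing in for the imported data.
df = pd.DataFrame(
    [[1, 2, 3, 4, 5]],
    columns=["a", "Unnamed: 24", "Unnamed: 25", "Unnamed: 26", "b"],
)

# Label slices with .loc include both the start and the end label.
unwanted = df.loc[:, "Unnamed: 24":"Unnamed: 26"].columns

# drop returns a new frame; df itself is left unchanged here.
trimmed = df.drop(unwanted, axis=1)

print(list(trimmed.columns))  # ['a', 'b']
```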

Update

Another, simpler method would be to use the boolean mask from str.contains and invert it to mask the columns:

In [2]:
df = pd.DataFrame(columns=['a','Unnamed: 1', 'Unnamed: 1','foo'])
df

Out[2]:
Empty DataFrame
Columns: [a, Unnamed: 1, Unnamed: 1, foo]
Index: []

In [4]:
~df.columns.str.contains('Unnamed:')

Out[4]:
array([ True, False, False,  True], dtype=bool)

In [5]:
df[df.columns[~df.columns.str.contains('Unnamed:')]]

Out[5]:
Empty DataFrame
Columns: [a, foo]
Index: []

Answer 2

Answered by knightofni

This is probably a good way to do what you want. It will delete all columns that contain 'Unnamed' in their header.

for col in df.columns:
    if 'Unnamed' in col:
        del df[col]
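
For reference, a self-contained sketch of the same loop on a made-up frame. It assumes every column name is a string (the `in` test would fail on integer labels), and it iterates over a snapshot of the names so that deleting columns cannot interfere with the loop.

```python
import pandas as pd

# Hypothetical sample frame; the column names are invented for the example.
df = pd.DataFrame({"a": [1], "Unnamed: 1": [None], "Unnamed: 2": [None], "foo": [2]})

# list(...) takes a snapshot of the column names before anything is deleted.
for col in list(df.columns):
    if "Unnamed" in col:
        del df[col]

print(list(df.columns))  # ['a', 'foo']
```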

Answer 3

Answered by Shivgan

The below worked for me:

for col in df:
    if 'Unnamed' in col:
        print(col)
        try:
            df.drop(col, axis=1, inplace=True)
        except KeyError:
            # the column is already gone
            pass

Answer 4

Answered by Philipp Schwarz

By far the simplest approach is:

yourdf.drop(['columnheading1', 'columnheading2'], axis=1, inplace=True)
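
A small sketch of this call on invented data. The frame and the extra label "not_there" are assumptions for the example. By default drop raises a KeyError when a label is missing; passing errors="ignore" skips missing labels instead.

```python
import pandas as pd

# Hypothetical frame reusing the placeholder column names from the answer.
yourdf = pd.DataFrame({"keep": [1, 2], "columnheading1": [3, 4], "columnheading2": [5, 6]})

# errors="ignore" skips labels that are not present instead of raising KeyError.
result = yourdf.drop(
    ["columnheading1", "columnheading2", "not_there"], axis=1, errors="ignore"
)

print(list(result.columns))  # ['keep']
```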

Answer 5

Answered by Peter

You can do this in one line and one go:

df.drop([col for col in df.columns if "Unnamed" in col], axis=1, inplace=True)

This involves less moving around/copying of the object than the solutions above.
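
A runnable sketch of the one-liner on a made-up frame, split into two steps for readability. One thing it makes visible: with inplace=True, drop modifies the frame and returns None, so the result should not be assigned back.

```python
import pandas as pd

# Hypothetical sample frame; the column names are invented for the example.
df = pd.DataFrame({"a": [1], "Unnamed: 24": [None], "Unnamed: 25": [None], "b": [2]})

unnamed = [col for col in df.columns if "Unnamed" in col]

# With inplace=True the frame is modified and the call returns None.
returned = df.drop(unnamed, axis=1, inplace=True)

print(returned)          # None
print(list(df.columns))  # ['a', 'b']
```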

Answer 6

Answered by sheldonzy

My personal favorite, and easier than the answers I have seen here (for multiple columns):

df.drop(df.columns[22:56], axis=1, inplace=True)

Or create a list of the columns first:

col = list(df.columns)[22:56]
df.drop(col, axis=1, inplace=True)

Answer 7

Answered by px06

I'm not sure whether this solution has been mentioned yet, but one way to do it is with pandas.Index.difference.

>>> df = pd.DataFrame(columns=['A','B','C','D'])
>>> df
Empty DataFrame
Columns: [A, B, C, D]
Index: []
>>> to_remove = ['A','C']
>>> df = df[df.columns.difference(to_remove)]
>>> df
Empty DataFrame
Columns: [B, D]
Index: []
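
One caveat worth illustrating: by default, Index.difference returns the remaining labels sorted, which can reorder the columns. The sketch below uses a made-up frame whose columns are deliberately out of alphabetical order, and shows the sort=False option that keeps the original order.

```python
import pandas as pd

# Hypothetical frame with columns deliberately out of alphabetical order.
df = pd.DataFrame(columns=["D", "B", "A", "C"])
to_remove = ["A", "C"]

# Default behaviour: the remaining labels come back sorted.
sorted_cols = df.columns.difference(to_remove)

# sort=False keeps the labels in their original order.
ordered_cols = df.columns.difference(to_remove, sort=False)

print(list(sorted_cols))   # ['B', 'D']
print(list(ordered_cols))  # ['D', 'B']

df = df[ordered_cols]
```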

Answer 8

Answered by Sarah

df = df[[col for col in df.columns if not ('Unnamed' in col)]]

Answer 9

Answered by Maddu Swaroop

You can just pass the column names as a list and specify the axis as 0 or 1:

axis=0: the labels refer to the index, so rows are dropped
axis=1: the labels refer to the columns, so columns are dropped
By default axis=0
data.drop(["Colname1","Colname2","Colname3","Colname4"],axis=1)
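
To make the axis argument concrete, here is a minimal sketch on an invented two-by-two frame. The row labels "r0" and "r1" are assumptions for the example. It also shows the columns= keyword, which avoids having to remember the axis number.

```python
import pandas as pd

# Hypothetical frame with explicit row labels.
data = pd.DataFrame({"Colname1": [1, 2], "Colname2": [3, 4]}, index=["r0", "r1"])

without_row = data.drop(["r0"], axis=0)          # axis=0: drop rows by index label
without_col = data.drop(["Colname1"], axis=1)    # axis=1: drop columns by name
same_as_axis1 = data.drop(columns=["Colname1"])  # keyword form, no axis needed

print(without_row.shape)  # (1, 2)
print(without_col.shape)  # (2, 1)
```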

Answer 10

Answered by Niedson

Simple and easy. Remove all columns after the 22nd.

df.drop(columns=df.columns[22:]) # love it
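
A short sketch of this positional slice on a made-up frame with 30 columns (the names c0 to c29 are invented). Without inplace=True, drop returns a new frame, so the result has to be assigned back for the change to stick.

```python
import pandas as pd

# Hypothetical frame with 30 columns named c0..c29.
df = pd.DataFrame([list(range(30))], columns=[f"c{i}" for i in range(30)])

# df.columns[22:] selects every column from position 22 onward.
# drop returns a new frame, so assign it back.
df = df.drop(columns=df.columns[22:])

print(df.shape[1])     # 22
print(df.columns[-1])  # c21
```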
