Python: Removing unnamed columns from a pandas DataFrame

Question

Asked by muazfaiz

I have a data file with columns A-G, shown below, but when I read it with pd.read_csv('data.csv') it prints an extra unnamed column at the end for no apparent reason.

colA    ColB    colC    colD    colE    colF    colG    Unnamed: 7
44      45      26      26      40      26      46        NaN
47      16      38      47      48      22      37        NaN
19      28      36      18      40      18      46        NaN
50      14      12      33      12      44      23        NaN
39      47      16      42      33      48      38        NaN

I have checked my data file several times, and there is no extra data in any other column. How should I remove this extra column while reading? Thanks.

Answer 1

Answered by MaxU

df = df.loc[:, ~df.columns.str.contains('^Unnamed')]

In [162]: df
Out[162]:
   colA  ColB  colC  colD  colE  colF  colG
0    44    45    26    26    40    26    46
1    47    16    38    47    48    22    37
2    19    28    36    18    40    18    46
3    50    14    12    33    12    44    23
4    39    47    16    42    33    48    38
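A minimal sketch of the filter above on in-memory data. It assumes the extra column comes from a trailing comma at the end of every line, which is one common cause but may not be what happened in the asker's file. The names csv_text, raw, mask and cleaned are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical sample data: every line ends with a trailing comma,
# so pandas sees an eighth, empty field in the header and in each row.
csv_text = (
    "colA,ColB,colC,colD,colE,colF,colG,\n"
    "44,45,26,26,40,26,46,\n"
    "47,16,38,47,48,22,37,\n"
)

raw = pd.read_csv(io.StringIO(csv_text))
# raw now has an extra all-NaN column called "Unnamed: 7".

# Boolean mask: True for every column whose name starts with "Unnamed".
mask = raw.columns.str.contains("^Unnamed")

# Keep only the columns where the mask is False.
cleaned = raw.loc[:, ~mask]
print(list(cleaned.columns))
```

The `^` anchors the match to the start of the name, so a column that merely contains the word somewhere else in its name is left alone.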

If the first column in the CSV file contains index values, you can do this instead:

df = pd.read_csv('data.csv', index_col=0)
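The index case typically shows up when a file was written with to_csv() and its default index=True, which produces "Unnamed: 0" as the first column rather than an extra column at the end. The sketch below round-trips a small made-up frame through a string to show the difference; original, text, naive and restored are illustrative names.

```python
import io

import pandas as pd

# Hypothetical frame standing in for the real data.
original = pd.DataFrame({"colA": [44, 47], "ColB": [45, 16]})

# With no path given, to_csv() returns the CSV text. The index is written
# as the first column, and its header cell is left empty.
text = original.to_csv()

# Read naively: the index comes back as a data column named "Unnamed: 0".
naive = pd.read_csv(io.StringIO(text))

# Read with index_col=0: the first column becomes the index again.
restored = pd.read_csv(io.StringIO(text), index_col=0)

print(list(naive.columns))
print(list(restored.columns))
```

If you control the code that writes the file, passing index=False to to_csv() avoids the column in the first place.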

Answer 2

Answered by Adil Warsi

First, find the columns whose names contain 'unnamed', then drop those columns. Note: you should add inplace=True to the .drop parameters as well.

df.drop(df.columns[df.columns.str.contains('unnamed', case=False)], axis=1, inplace=True)
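A sketch of the same idea without inplace=True, using a made-up frame. Because the pattern is unanchored and case-insensitive, it also matches a deliberately named column such as the hypothetical "unnamed_notes" below, so it is worth checking what was selected before dropping.

```python
import pandas as pd

# Hypothetical frame: one parser-generated column and one column that a
# person deliberately named with the word "unnamed" in it.
df = pd.DataFrame(
    {
        "colA": [44, 47],
        "Unnamed: 7": [None, None],
        "unnamed_notes": ["x", "y"],
    }
)

# Case-insensitive substring match anywhere in the column name.
unnamed_cols = df.columns[df.columns.str.contains("unnamed", case=False)]
print(list(unnamed_cols))  # inspect the selection before dropping

# drop() returns a new frame here; df itself is left unchanged.
result = df.drop(columns=unnamed_cols)
print(list(result.columns))
```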

Answer 3

Answered by Susan

The pandas.DataFrame.dropna function removes missing values (e.g. NaN, NaT).

For example, the following code removes every column of your DataFrame in which all elements are missing.

df = df.dropna(how='all', axis='columns')
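One thing to keep in mind is that dropna looks at the values, not the column names. In the sketch below (made-up data, with "empty_named" as a hypothetical column that is legitimately named but has no values yet), both all-NaN columns are removed, which may or may not be what you want.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with two entirely empty columns.
df = pd.DataFrame(
    {
        "colA": [44, 47],
        "colG": [46, 37],
        "empty_named": [np.nan, np.nan],
        "Unnamed: 7": [np.nan, np.nan],
    }
)

# Drop every column in which all values are missing.
# dropna() returns a new frame, so the result has to be kept.
result = df.dropna(how="all", axis="columns")
print(list(result.columns))
```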

Answer 4

Answered by Ezarate11

The accepted solution doesn't work in my case, so my solution is the following:

# The column name in the example case is "Unnamed: 7",
# but this works with any other name ("Unnamed: 0", for example).
df.rename({"Unnamed: 7": "a"}, axis="columns", inplace=True)

# Then drop the column as usual.
df.drop(["a"], axis=1, inplace=True)
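When the exact column name is known, the rename step can usually be skipped and the column dropped directly. The sketch below uses made-up data and passes errors="ignore" so that the call does nothing when the column is absent. This is an alternative to the two-step approach above, not a reproduction of it.

```python
import pandas as pd

# Hypothetical frame containing the unwanted column.
df = pd.DataFrame({"colA": [44, 47], "Unnamed: 7": [float("nan")] * 2})

# Drop by exact name; errors="ignore" suppresses the KeyError that
# drop() would otherwise raise if the column did not exist.
result = df.drop(columns=["Unnamed: 7"], errors="ignore")

# Running the same call again is harmless because the column is gone.
again = result.drop(columns=["Unnamed: 7"], errors="ignore")
print(list(again.columns))
```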

Hope it helps others.
