Python: Remove Unnamed columns in pandas dataframe
Note: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow
Original URL: http://stackoverflow.com/questions/43983622/
Asked by muazfaiz
I have a data file with columns A-G, as shown below, but when I read it with pd.read_csv('data.csv') it prints an extra unnamed column at the end for no apparent reason.
colA ColB colC colD colE colF colG Unnamed: 7
44 45 26 26 40 26 46 NaN
47 16 38 47 48 22 37 NaN
19 28 36 18 40 18 46 NaN
50 14 12 33 12 44 23 NaN
39 47 16 42 33 48 38 NaN
I have checked my data file several times, and there is no extra data in any other column. How should I remove this extra column while reading the file? Thanks
Answered by MaxU
df = df.loc[:, ~df.columns.str.contains('^Unnamed')]
In [162]: df
Out[162]:
colA ColB colC colD colE colF colG
0 44 45 26 26 40 26 46
1 47 16 38 47 48 22 37
2 19 28 36 18 40 18 46
3 50 14 12 33 12 44 23
4 39 47 16 42 33 48 38
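A minimal, self-contained sketch of the filter above. The raw file is not shown in the question, so the sample below assumes one common cause: every line, including the header, ends with a trailing comma. The variable names and the sample data are illustrative only.

```python
import io

import pandas as pd

# Hypothetical sample: each line (header included) ends with a trailing comma,
# so read_csv sees an eighth field that has no name.
csv_text = (
    "colA,ColB,colC,colD,colE,colF,colG,\n"
    "44,45,26,26,40,26,46,\n"
    "47,16,38,47,48,22,37,\n"
)

raw = pd.read_csv(io.StringIO(csv_text))
print(raw.columns.tolist())  # the last entry is the auto-generated 'Unnamed: 7'

# Keep only the columns whose name does not start with 'Unnamed'.
clean = raw.loc[:, ~raw.columns.str.contains('^Unnamed')]
print(clean.columns.tolist())
```

The `^` anchors the pattern to the start of the name, so a column that merely contains the word elsewhere in its name is kept.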
If the first column in the CSV file holds index values, you can do this instead:
df = pd.read_csv('data.csv', index_col=0)
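The index case typically arises when a frame was saved with to_csv and its index was written out under an empty header. The sketch below round-trips a small, made-up frame through an in-memory buffer to show the difference; the frame and variable names are assumptions for illustration.

```python
import io

import pandas as pd

# Hypothetical frame written out with its default integer index.
original = pd.DataFrame({"colA": [44, 47], "ColB": [45, 16]})
buffer = io.StringIO()
original.to_csv(buffer)  # index=True by default, so the index column gets an empty header

buffer.seek(0)
with_unnamed = pd.read_csv(buffer)
print(with_unnamed.columns.tolist())  # the saved index comes back as 'Unnamed: 0'

buffer.seek(0)
with_index = pd.read_csv(buffer, index_col=0)  # treat the first column as the index
print(with_index.columns.tolist())
```

If you control how the file is written, passing index=False to to_csv avoids the extra column in the first place.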
Answered by Adil Warsi
First, find the columns whose names contain 'unnamed', then drop those columns. Note: you should also add inplace=True to the .drop parameters.
df.drop(df.columns[df.columns.str.contains('unnamed', case=False)], axis=1, inplace=True)
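The reason for the inplace note is that drop returns a new DataFrame by default and leaves the original untouched. A small sketch on a made-up frame (the data and variable names are illustrative):

```python
import pandas as pd

# Hypothetical frame that already contains an auto-named column.
df = pd.DataFrame({"colA": [44, 47], "Unnamed: 7": [None, None]})
unnamed = df.columns[df.columns.str.contains("unnamed", case=False)]

result = df.drop(unnamed, axis=1)  # returns a new frame; df itself is unchanged
print(df.columns.tolist(), result.columns.tolist())

returned = df.drop(unnamed, axis=1, inplace=True)  # modifies df and returns None
print(df.columns.tolist(), returned)
```

Assigning the result back (df = df.drop(...)) is an equivalent alternative to inplace=True.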
Answered by Susan
The pandas.DataFrame.dropna function removes missing values (e.g. NaN, NaT).
For example, the following code removes every column of your dataframe in which all of the elements are missing.
df = df.dropna(how='all', axis='columns')  # dropna returns a new DataFrame, so assign the result
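One thing to keep in mind with this approach is that it looks at values, not names, so a named column that happens to be entirely empty is removed as well. A sketch with made-up data (column names are assumptions for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical frame: 'Unnamed: 7' and 'colB' are entirely NaN, 'colC' only partly.
df = pd.DataFrame({
    "colA": [44, 47],
    "colB": [np.nan, np.nan],
    "colC": [26.0, np.nan],
    "Unnamed: 7": [np.nan, np.nan],
})

trimmed = df.dropna(how="all", axis="columns")
print(trimmed.columns.tolist())  # only columns with at least one value remain
```

Here colB is dropped together with the unnamed column, while colC survives because it has one non-missing value.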
Answered by Ezarate11
The accepted solution doesn't work in my case, so my solution is the following:
# The column name in the example case is "Unnamed: 7",
# but this works with any other name ("Unnamed: 0", for example).
df.rename({"Unnamed: 7":"a"}, axis="columns", inplace=True)
# Then, drop the column as usual.
df.drop(["a"], axis=1, inplace=True)
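When the exact column name is known, the rename step may not be needed: drop accepts the name directly. The sketch below is an alternative under that assumption, not the answer's own method, and uses a made-up frame. The errors="ignore" argument makes the call safe to repeat when the column is already gone.

```python
import pandas as pd

# Hypothetical frame containing the auto-named column from the question.
df = pd.DataFrame({"colA": [44, 47], "Unnamed: 7": [None, None]})

# Drop by exact name; errors="ignore" skips names that are not present.
df = df.drop(columns=["Unnamed: 7"], errors="ignore")
print(df.columns.tolist())

# Running it again does not raise, because the missing name is ignored.
again = df.drop(columns=["Unnamed: 7"], errors="ignore")
print(again.columns.tolist())
```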
Hope it helps others.