Python Pandas - Drop function error (label not contained in axis)

Notice: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/44931834/

Date: 2020-08-20 00:38:03  Source: igfitidea

Pandas - Drop function error (label not contained in axis)

Tags: python, pandas

Asked by Abdall

I have a CSV file that looks like this:

index,Avg,Min,Max
Build1,56.19,39.123,60.1039
Build2,57.11,40.102,60.2
Build3,55.1134,35.129404123,60.20121

Building on an earlier question of mine, I am able to add some relevant information to this CSV via this short script:

import pandas as pd

df = pd.read_csv('newdata.csv')
print(df)

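# Append max/min/mean summary rows, rename them, and turn the index back into a column
df_out = (
    pd.concat([df.set_index('index'), df.set_index('index').agg(['max', 'min', 'mean'])])
    .rename(index={'max': 'Max', 'min': 'Min', 'mean': 'Average'})
    .reset_index()
)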

with open('newdata.csv', 'w') as f:
    df_out.to_csv(f,index=False)

This results in this CSV:

index,Avg,Min,Max
Build1,56.19,39.123,60.1039
Build2,57.11,40.102,60.2
Build3,55.1134,35.129404123,60.20121
Max,57.11,40.102,60.20121
Min,55.1134,35.129404123,60.1039
Average,56.1378,38.1181347077,60.16837

I would now like to be able to update this CSV. For example, if I ran a new build (build4, for instance), I could add it and then recompute the Max, Min, and Average rows. My idea is therefore to delete the rows labeled Max, Min, and Average, add my new row, and redo the stats. I believe the code I need is as simple as this (shown only for Max; Min and Average would get similar lines):

df = pd.read_csv('newdata.csv')
df = df.drop('Max')

However, this always results in a ValueError: labels ['Max'] not contained in axis

I created the CSV files in Sublime Text. Could this be part of the issue? I have read other SO posts about this error, and none of them seem to help with my issue.

I am unsure if this is allowed, but here is a download link to my CSV, just in case something is wrong with the file itself.

I would be okay with two possible answers:

  1. How to fix this drop issue
  2. How to add more builds and update the statistics (a method without drop)

Answered by error

You must specify the axis argument. The default is axis=0, which refers to rows; columns are axis=1.

So this would be your code (note that axis=1 drops the column named Max, not the row labeled Max):

df = df.drop('Max',axis=1)
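
To make the axis distinction concrete, here is a minimal sketch that loads the question's CSV from an in-memory string (the `CSV_TEXT` name and the use of `io.StringIO` are assumptions made so the example is self-contained) and applies `drop('Max', axis=1)`. It suggests that the Max column disappears while the row whose first field is Max is still present.

```python
import io

import pandas as pd

# The CSV from the question, inlined so the example does not need a file on disk.
CSV_TEXT = """index,Avg,Min,Max
Build1,56.19,39.123,60.1039
Build2,57.11,40.102,60.2
Build3,55.1134,35.129404123,60.20121
Max,57.11,40.102,60.20121
Min,55.1134,35.129404123,60.1039
Average,56.1378,38.1181347077,60.16837
"""

df = pd.read_csv(io.StringIO(CSV_TEXT))

# axis=1 targets column labels, so this removes the column named "Max".
dropped = df.drop('Max', axis=1)

print(list(dropped.columns))
# The row whose "index" field is "Max" is untouched by a column drop.
print('Max' in dropped['index'].values)
```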

Edit: looking at this piece of code:

df = pd.read_csv('newdata.csv')
df = df.drop('Max')

The code you used does not specify that the first column of the csv file contains the index for the dataframe. Thus pandas creates an index on the fly. This index is purely a numerical one. So your index does not contain "Max".
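
A quick way to check this is to inspect the index directly. The sketch below assumes the question's CSV is inlined as a string (`CSV_TEXT` and `STAT_LABELS` are names introduced here for illustration). Depending on the pandas version, the failed drop is reported as a ValueError (as in the question) or a KeyError, so both are caught. The last line shows an alternative that is not from the answer: filtering on the `index` column without changing how the file is read.

```python
import io

import pandas as pd

# The CSV from the question, inlined so the example does not need a file on disk.
CSV_TEXT = """index,Avg,Min,Max
Build1,56.19,39.123,60.1039
Build2,57.11,40.102,60.2
Build3,55.1134,35.129404123,60.20121
Max,57.11,40.102,60.20121
Min,55.1134,35.129404123,60.1039
Average,56.1378,38.1181347077,60.16837
"""
STAT_LABELS = ['Max', 'Min', 'Average']  # hypothetical helper name

df = pd.read_csv(io.StringIO(CSV_TEXT))

# Without index_col, pandas generates a numeric index; "index" is an ordinary column.
print(df.index)
print('Max' in df.index)

try:
    df.drop('Max')
    drop_failed = False
except (KeyError, ValueError) as exc:
    # The exception type depends on the pandas version.
    drop_failed = True
    print(type(exc).__name__, exc)

# Alternative (not from the answer): keep only rows whose "index" value is not a summary label.
kept = df[~df['index'].isin(STAT_LABELS)]
print(kept)
```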

Try the following:

df = pd.read_csv("newdata.csv",index_col=0)
df = df.drop("Max",axis=0)

This forces pandas to use the first column of the CSV file as the index, so the code should now work.
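
Putting the pieces together, the update workflow described in the question could be sketched roughly as follows. This is only one possible approach under stated assumptions: `add_build` and `STAT_LABELS` are hypothetical names, the Build4 numbers are made-up illustration values, and the CSV is inlined as a string instead of being read from newdata.csv.

```python
import io

import pandas as pd

# The CSV from the question, inlined so the example does not need a file on disk.
CSV_TEXT = """index,Avg,Min,Max
Build1,56.19,39.123,60.1039
Build2,57.11,40.102,60.2
Build3,55.1134,35.129404123,60.20121
Max,57.11,40.102,60.20121
Min,55.1134,35.129404123,60.1039
Average,56.1378,38.1181347077,60.16837
"""
STAT_LABELS = ['Max', 'Min', 'Average']  # hypothetical helper name


def add_build(df, name, avg, minimum, maximum):
    """Return a new frame with one build added and the summary rows recomputed."""
    # Drop stale summary rows; errors='ignore' tolerates a file that has none yet.
    builds = df.drop(STAT_LABELS, errors='ignore')
    # Assigning to a new label with .loc appends a row (columns are Avg, Min, Max).
    builds.loc[name] = [avg, minimum, maximum]
    stats = builds.agg(['max', 'min', 'mean']).rename(
        index={'max': 'Max', 'min': 'Min', 'mean': 'Average'}
    )
    # Restore the index name so that to_csv writes "index" as the first header.
    return pd.concat([builds, stats]).rename_axis('index')


df = pd.read_csv(io.StringIO(CSV_TEXT), index_col=0)
updated = add_build(df, 'Build4', 58.0, 41.0, 61.0)  # illustrative values only
print(updated)
```

Because the first column stays in the index here, writing the result back with `updated.to_csv('newdata.csv')` (without `index=False`) should keep the build names in the file.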

Answered by glegoux

To delete a particular column (not a row) in pandas, simply do:

del df['Max']