Python: removing the index column in pandas when reading a CSV
Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/20107570/
Removing index column in pandas when reading a csv
Asked by Bogdan Janiszewski
I have the following code which imports a CSV file. There are 3 columns and I want to set the first two of them to variables. When I set the second column to the variable "efficiency" the index column is also tacked on. How can I get rid of the index column?
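import pandas as pd
# The original used pd.DataFrame.from_csv (removed in pandas 1.0), which made the first column the index by default.
df = pd.read_csv('Efficiency_Data.csv', header=0, index_col=0, parse_dates=False)
energy = df.index
efficiency = df.Efficiency
print(efficiency)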
I tried using
del df['index']
after I set
energy = df.index
which I found in another post but that results in "KeyError: 'index' "
Accepted answer by Dan Allan
DataFrames and Series always have an index. Although it displays alongside the column(s), it is not a column, which is why del df['index'] did not work.
If you want to replace the index with simple sequential numbers, use df.reset_index().
To get a sense for why the index is there and how it is used, see e.g. 10 minutes to Pandas.
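The point about the index not being a column can be checked directly. This is a minimal sketch, assuming a small made-up frame in place of Efficiency_Data.csv; the column names and values are hypothetical. It also shows one way to pull the values out with no index attached, which is an addition of this sketch rather than part of the answer.

```python
import pandas as pd

# Hypothetical stand-in for Efficiency_Data.csv, first column used as the index.
df = pd.DataFrame(
    {"Efficiency": [0.91, 0.87, 0.80], "Other": [1, 2, 3]},
    index=pd.Index([10.0, 20.0, 30.0], name="Energy"),
)

# The index is not one of the columns, so deleting it by name fails.
index_is_column = "index" in df.columns
try:
    del df["index"]
    raised_key_error = False
except KeyError:
    raised_key_error = True

# reset_index() moves the old index into an ordinary column and
# installs a fresh 0..n-1 index in its place.
flat = df.reset_index()

# A plain list carries the values only, with no index attached.
efficiency_values = df["Efficiency"].tolist()

print(index_is_column, raised_key_error)
print(flat.columns.tolist())
print(efficiency_values)
```

Printing a Series always shows its index on the left; converting to a list (or a NumPy array) is what actually detaches it.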
Answer by yemu
You can specify which column is the index in your CSV file by using the index_col parameter of the from_csv function. If this doesn't solve your problem, please provide an example of your data.
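A short sketch of how index_col changes the result, using pd.read_csv, which accepts the same parameter. The CSV text and column names here are assumptions for illustration, since the asker's file is not shown.

```python
import io
import pandas as pd

# Hypothetical CSV contents; the column names are assumptions.
csv_text = "Energy,Efficiency,Other\n10,0.91,1\n20,0.87,2\n30,0.80,3\n"

# index_col=0 uses the first column as the index (the old from_csv default).
indexed = pd.read_csv(io.StringIO(csv_text), index_col=0)

# index_col=None (the read_csv default) keeps every column as data
# and generates a 0..n-1 index instead.
plain = pd.read_csv(io.StringIO(csv_text), index_col=None)

print(indexed.columns.tolist())
print(plain.columns.tolist())
```

With the second form, all three columns stay available as ordinary columns, so energy could be read as plain["Energy"] rather than from the index.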
Answer by Bhanu Pratap Singh
If your problem is the same as mine, where you just want to reset the column headers to run from 0 up to the number of columns, do
df = pd.DataFrame(df.values)
EDIT:
Not a good idea if you have heterogeneous data types. Better to just use
df.columns = range(len(df.columns))
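The difference between the two approaches shows up in the column dtypes. A minimal sketch, assuming a small mixed-type frame made up for illustration:

```python
import pandas as pd

# Hypothetical frame with one text column and one float column.
df = pd.DataFrame({"name": ["a", "b"], "value": [1.5, 2.5]})

# Rebuilding from .values goes through a single NumPy array, so mixed
# column types collapse to object dtype.
rebuilt = pd.DataFrame(df.values)

# Renaming the columns leaves each column's dtype alone.
renamed = df.copy()
renamed.columns = range(len(renamed.columns))

print(rebuilt.dtypes)
print(renamed.dtypes)
```

Both frames end up with columns labelled 0 and 1, but only the renamed one still treats the second column as floats.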
Answer by Steve
When writing your CSV file, include the argument index=False, for example:
df.to_csv(filename, index=False)
and to read from the CSV (read_csv is a pandas function and takes index_col rather than index):
df = pd.read_csv(filename, index_col=False)
This should prevent the issue so you don't need to fix it later.
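A round trip through an in-memory buffer shows what index=False prevents. This is a sketch under assumed data; the frame and its column names are hypothetical.

```python
import io
import pandas as pd

df = pd.DataFrame({"Energy": [10, 20], "Efficiency": [0.91, 0.87]})

# By default to_csv writes the index as an extra, unnamed first column.
with_index = df.to_csv()
# index=False leaves it out of the file.
without_index = df.to_csv(index=False)

# Reading the first version back turns the saved index into a data column.
leaked = pd.read_csv(io.StringIO(with_index))
clean = pd.read_csv(io.StringIO(without_index))

print(leaked.columns.tolist())
print(clean.columns.tolist())
```

The stray "Unnamed: 0" column in the first result is the saved index coming back as data, which is often what people mean by an unwanted index column.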
Answer by Natheer Alabsi
You can set one of the columns as the index, for example if it is an "id" column. In this case the index will be replaced by the column you have chosen.
df.set_index('id', inplace=True)
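As a small illustration, assuming a made-up frame with an "id" column (the values are hypothetical), the non-inplace form returns a new frame and allows lookups by id:

```python
import pandas as pd

df = pd.DataFrame({"id": [101, 102, 103], "Efficiency": [0.91, 0.87, 0.80]})

# set_index returns a new frame by default; the "id" column becomes the index.
by_id = df.set_index("id")

# Rows can then be looked up by their id label.
eff_102 = by_id.loc[102, "Efficiency"]

print(by_id.columns.tolist())
print(eff_102)
```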
Answer by Subhojit Mukherjee
df.reset_index(drop=True, inplace=True)
Answer by Lord Varis
One thing that I do is df = df.reset_index() and then df = df.drop(['index'], axis=1)
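The two-step approach and reset_index(drop=True) should give the same result when the old index has no name. A minimal sketch with assumed data follows. Note that if the index is named, reset_index() names the new column after it, so dropping 'index' would raise a KeyError.

```python
import pandas as pd

# Hypothetical frame with an unnamed, non-sequential index.
df = pd.DataFrame({"Efficiency": [0.91, 0.87, 0.80]}, index=[5, 7, 9])

# Two steps: reset_index() adds an "index" column holding the old index,
# which is then dropped.
two_step = df.reset_index().drop(["index"], axis=1)

# One step: discard the old index while resetting.
one_step = df.reset_index(drop=True)

same = two_step.equals(one_step)
print(same)
print(one_step.index.tolist())
```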

