Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/20167930/

start index at 1 for Pandas DataFrame

Tags: python, pandas, csv, dataframe, indexing

Asked by Clark Fitzgerald

I need the index to start at 1 rather than 0 when writing a Pandas DataFrame to CSV.

Here's an example:

In [1]: import pandas as pd

In [2]: result = pd.DataFrame({'Count': [83, 19, 20]})

In [3]: result.to_csv('result.csv', index_label='Event_id')                               

Which produces the following output:

In [4]: !cat result.csv
Event_id,Count
0,83
1,19
2,20

But my desired output is this:

In [5]: !cat result2.csv
Event_id,Count
1,83
2,19
3,20

I realize that this could be done by adding a sequence of integers shifted by 1 as a column to my data frame, but I'm new to Pandas and I'm wondering if a cleaner way exists.

Accepted answer by alko

The index is an object, and the default index starts from 0:

>>> result.index
Int64Index([0, 1, 2], dtype=int64)

You can shift this index by 1 with:

>>> result.index += 1 
>>> result.index
Int64Index([1, 2, 3], dtype=int64)
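
Putting the question and this answer together, a minimal sketch of the whole round trip might look like the following. It produces the CSV text in memory rather than writing result.csv (calling to_csv without a path returns the text as a string); that is an illustrative choice, not part of the original answer.

```python
import pandas as pd

result = pd.DataFrame({'Count': [83, 19, 20]})

# Shift the default 0-based index so that it starts at 1.
result.index += 1

# With no path argument, to_csv returns the CSV text instead of writing a file.
csv_text = result.to_csv(index_label='Event_id')
print(csv_text)
```

To write a file instead, pass a path as the first argument, as in the question.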

Answer by TomAugspurger

Just set the index before writing to CSV.

import numpy as np
df.index = np.arange(1, len(df) + 1)

And then write it normally.
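
One detail worth keeping in mind: np.arange excludes its stop value, so the stop has to be len(df) + 1 for the new index to have one label per row. A small sketch, assuming the sample data from the question as df:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Count': [83, 19, 20]})

# stop is exclusive: arange(1, len(df)) yields only len(df) - 1 labels.
too_short = np.arange(1, len(df))

try:
    df.index = too_short
    mismatch_raised = False
except ValueError:
    # pandas rejects an index whose length differs from the number of rows.
    mismatch_raised = True

# arange(1, len(df) + 1) yields exactly one label per row.
df.index = np.arange(1, len(df) + 1)
print(df)
```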

Answer by Dung

source: In Python pandas, start row index from 1 instead of zero without creating additional column

Working example:

import pandas as pdas
dframe = pdas.read_csv(open(input_file))
dframe.index = dframe.index + 1
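
For a version that runs without a file on disk, the sketch below reads from an in-memory CSV string. The csv_text contents and the use of io.StringIO are stand-ins for input_file, which the answer does not define.

```python
import io

import pandas as pd

# Hypothetical CSV contents standing in for the file behind input_file.
csv_text = "Count\n83\n19\n20\n"

dframe = pd.read_csv(io.StringIO(csv_text))

# read_csv assigns a default 0-based index; adding 1 relabels the rows 1..n.
dframe.index = dframe.index + 1
print(dframe)
```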

Answer by Imran

Another way in one line:

df.shift()[1:]  # caution: labels start at 1, but the last row of data is lost
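
This one-liner deserves some caution. shift() moves the values down by one row and fills the first row with NaN, and the [1:] slice then drops that first row, so the result starts at label 1 but no longer contains the last row of data. A short sketch comparing it with relabelling the index, assuming the sample data from the question:

```python
import pandas as pd

df = pd.DataFrame({'Count': [83, 19, 20]})

# shift() pushes values down one row; [1:] then drops the NaN first row.
shifted = df.shift()[1:]

# Relabelling the index keeps every row.
relabelled = df.copy()
relabelled.index += 1

print(shifted)
print(relabelled)
```

Here shifted has only two rows (the value 20 is gone, and the remaining values have become floats), while relabelled keeps all three.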

Answer by Liu Yu

This worked for me:

df.index = np.arange(1, len(df) + 1)

Answer by Utku

You can use this one:

import pandas as pd

result = pd.DataFrame({'Count': [83, 19, 20]})
result.index += 1
print(result)

Or this one, with the help of the numpy library:

import pandas as pd
import numpy as np

result = pd.DataFrame({'Count': [83, 19, 20]})
result.index = np.arange(1, len(result)+1)
print(result)

np.arange creates a numpy array of evenly spaced values in the half-open interval [1, len(result)+1), that is, 1 through len(result). That array is then assigned to result.index.

Answer by ivanleoncz

Building on the original answer, here are my two cents:

  • If I'm not mistaken, starting from version 0.23, the index object is of type RangeIndex

From the official doc:

RangeIndex is a memory-saving special case of Int64Index limited to representing monotonic ranges. Using RangeIndex may in some instances improve computing speed.

For a huge index range this makes sense: the index is stored as a compact description of the range rather than as every value defined up front, which saves memory.
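
As a rough illustration of the memory point, the sketch below compares a RangeIndex with an index built from a materialized numpy array covering the same labels. The size n is an arbitrary choice, and the exact byte counts depend on the pandas version and platform, so only the relative difference is meaningful.

```python
import numpy as np
import pandas as pd

n = 1_000_000  # arbitrary size chosen for illustration

# Stores only the description of the range.
lazy_index = pd.RangeIndex(start=1, stop=n + 1)

# Stores every label explicitly in an integer array.
materialized_index = pd.Index(np.arange(1, n + 1))

lazy_bytes = lazy_index.memory_usage()
materialized_bytes = materialized_index.memory_usage()
print(lazy_bytes, materialized_bytes)
```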

Therefore, an example (using Series, but it applies to DataFrame also):

>>> import pandas as pd
>>> 
>>> countries = ['China', 'India', 'USA']
>>> ds = pd.Series(countries)
>>> 
>>>
>>> type(ds.index)
<class 'pandas.core.indexes.range.RangeIndex'>
>>> ds.index
RangeIndex(start=0, stop=3, step=1)
>>> 
>>> ds.index += 1
>>> 
>>> ds.index
RangeIndex(start=1, stop=4, step=1)
>>> 
>>> ds
1    China
2    India
3      USA
dtype: object
>>> 

As you can see, incrementing the index object changes its start and stop parameters.
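
One practical consequence, sketched below under the assumption of a reasonably recent pandas version: incrementing the index keeps it a RangeIndex, while assigning a numpy array replaces it with an array-backed integer index. Constructing a RangeIndex explicitly is a third option that stays range-based. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

countries = ['China', 'India', 'USA']

# Incrementing keeps the compact RangeIndex representation.
ds = pd.Series(countries)
ds.index += 1
kept_range = isinstance(ds.index, pd.RangeIndex)

# Assigning a numpy array yields an array-backed integer index instead.
other = pd.Series(countries)
other.index = np.arange(1, len(other) + 1)
array_backed = not isinstance(other.index, pd.RangeIndex)

# An explicit RangeIndex gives the same labels and stays range-based.
explicit = pd.Series(countries)
explicit.index = pd.RangeIndex(start=1, stop=len(explicit) + 1)

print(type(ds.index).__name__, type(other.index).__name__, type(explicit.index).__name__)
```

All three produce the labels 1, 2, 3, so the CSV output is the same either way; the difference only matters for memory on very large frames.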