Python Pandas DataFrame: start the index at 1
Original question: http://stackoverflow.com/questions/20167930/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
StackOverflow
start index at 1 for Pandas DataFrame
Asked by Clark Fitzgerald
I need the index to start at 1 rather than 0 when writing a Pandas DataFrame to CSV.
Here's an example:
In [1]: import pandas as pd
In [2]: result = pd.DataFrame({'Count': [83, 19, 20]})
In [3]: result.to_csv('result.csv', index_label='Event_id')
Which produces the following output:
In [4]: !cat result.csv
Event_id,Count
0,83
1,19
2,20
But my desired output is this:
In [5]: !cat result2.csv
Event_id,Count
1,83
2,19
3,20
I realize that this could be done by adding a sequence of integers shifted by 1 as a column to my data frame, but I'm new to Pandas and I'm wondering if a cleaner way exists.
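For reference, the workaround mentioned in the question could look roughly like the sketch below. It assumes the id column is built by hand and the default index is suppressed on write; the names with_id and buffer are made up for illustration, and io.StringIO stands in for a real file.

```python
import io

import pandas as pd

result = pd.DataFrame({'Count': [83, 19, 20]})

# Build a 1-based id column by hand and put it first.
with_id = result.copy()
with_id.insert(0, 'Event_id', list(range(1, len(with_id) + 1)))

# Suppress the default 0-based index so only Event_id is written.
buffer = io.StringIO()
with_id.to_csv(buffer, index=False)
csv_lines = buffer.getvalue().splitlines()
print(csv_lines)
```

This gives the desired CSV, but at the cost of an extra column in the frame, which is what the question hopes to avoid.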
Accepted answer by alko
The index is an object, and the default index starts from 0:
>>> result.index
Int64Index([0, 1, 2], dtype=int64)
You can shift this index by 1 with
>>> result.index += 1
>>> result.index
Int64Index([1, 2, 3], dtype=int64)
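Putting this together with the to_csv call from the question, a minimal end-to-end sketch might look like this. It writes to an in-memory io.StringIO buffer instead of result.csv purely so the output can be inspected; buffer and csv_lines are illustrative names.

```python
import io

import pandas as pd

result = pd.DataFrame({'Count': [83, 19, 20]})
result.index += 1  # relabel rows 0..2 as 1..3

# index_label names the index column in the CSV header.
buffer = io.StringIO()
result.to_csv(buffer, index_label='Event_id')
csv_lines = buffer.getvalue().splitlines()
print(csv_lines)
```

No extra column is added to the frame; only the row labels change.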
Answer by TomAugspurger
Just set the index before writing to CSV.
df.index = np.arange(1, len(df) + 1)  # stop is exclusive, hence the + 1
And then write it normally.
Answer by Dung
source: In Python pandas, start row index from 1 instead of zero without creating additional column
Working example:
import pandas as pdas
dframe = pdas.read_csv(input_file)  # input_file: path to the CSV to load
dframe.index = dframe.index + 1  # relabel rows so they start at 1
Answer by Imran
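Another way in one line. Note that this shifts the values down one row and then drops the first row, so the last row of data is lost and integer columns become float: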
df.shift()[1:]
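A quick check on the three-row frame from the question shows what this one-liner returns, compared with relabelling the index. This is only a sketch; shifted and relabelled are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({'Count': [83, 19, 20]})

# shift() moves every value down one row (row 0 becomes NaN),
# then [1:] drops that first row.
shifted = df.shift()[1:]
print(shifted)

# Relabelling the index keeps every row and leaves the values untouched.
relabelled = df.copy()
relabelled.index = relabelled.index + 1
print(relabelled)
```

The shifted frame has labels 1 and 2 only and no longer contains the value 20, while the relabelled frame keeps all three rows under labels 1 to 3.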
Answer by Liu Yu
This worked for me:
df.index = np.arange(1, len(df)+1)
Answer by Utku
You can use this one:
import pandas as pd
result = pd.DataFrame({'Count': [83, 19, 20]})
result.index += 1
print(result)
or this one, with the help of the numpy library:
import pandas as pd
import numpy as np
result = pd.DataFrame({'Count': [83, 19, 20]})
result.index = np.arange(1, len(result)+1)
print(result)
np.arange will create a numpy array of the values within the given half-open interval, here (1, len(result)+1), and finally you assign that array to result.index.
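The reason for the +1 is that np.arange excludes its stop value. The sketch below illustrates this, and also shows that pandas rejects an index whose length does not match the number of rows; labels, too_short and error_seen are illustrative names.

```python
import numpy as np
import pandas as pd

result = pd.DataFrame({'Count': [83, 19, 20]})

# stop is exclusive, so len(result) + 1 is needed to reach 3.
labels = np.arange(1, len(result) + 1)
result.index = labels

# Forgetting the + 1 yields one label too few ...
too_short = np.arange(1, len(result))

# ... and pandas refuses to assign an index of the wrong length.
error_seen = False
try:
    result.index = too_short
except ValueError:
    error_seen = True

print(labels, too_short, error_seen)
```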
Answer by ivanleoncz
Building on the accepted answer, here are my two cents:
- if I'm not mistaken, starting from version 0.23, the index object is of type RangeIndex
From the official doc:
RangeIndex is a memory-saving special case of Int64Index limited to representing monotonic ranges. Using RangeIndex may in some instances improve computing speed.
For a huge index range this makes sense: only a compact description of the range (start, stop, step) is stored instead of every index value, which saves memory.
Therefore, an example (using Series, but it applies to DataFrame also):
>>> import pandas as pd
>>>
>>> countries = ['China', 'India', 'USA']
>>> ds = pd.Series(countries)
>>>
>>>
>>> type(ds.index)
<class 'pandas.core.indexes.range.RangeIndex'>
>>> ds.index
RangeIndex(start=0, stop=3, step=1)
>>>
>>> ds.index += 1
>>>
>>> ds.index
RangeIndex(start=1, stop=4, step=1)
>>>
>>> ds
1    China
2    India
3      USA
dtype: object
As you can see, incrementing the index object changes its start and stop parameters.
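As a small sketch of the difference this makes, the snippet below compares incrementing the default index with assigning a numpy array as in the other answers. It assumes a pandas version in which RangeIndex exposes start, stop and step as attributes; ds, other and same_labels are illustrative names.

```python
import numpy as np
import pandas as pd

countries = ['China', 'India', 'USA']

# Incrementing the default index keeps it a RangeIndex.
ds = pd.Series(countries)
ds.index += 1
idx = ds.index
print(type(idx).__name__, idx.start, idx.stop, idx.step)

# Assigning a numpy array stores the labels explicitly instead.
other = pd.Series(countries)
other.index = np.arange(1, len(other) + 1)
print(type(other.index).__name__)

# Both approaches produce the same labels.
same_labels = list(other.index) == list(ds.index)
```

Either form writes the same CSV; the RangeIndex variant just avoids materialising the labels.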

