Python Pandas DataFrame: starting the index at 1

Question

Asked by Clark Fitzgerald

I need the index to start at 1 rather than 0 when writing a Pandas DataFrame to CSV.

Here's an example:

In [1]: import pandas as pd

In [2]: result = pd.DataFrame({'Count': [83, 19, 20]})

In [3]: result.to_csv('result.csv', index_label='Event_id')

Which produces the following output:

In [4]: !cat result.csv
Event_id,Count
0,83
1,19
2,20

But my desired output is this:

In [5]: !cat result2.csv
Event_id,Count
1,83
2,19
3,20

I realize that this could be done by adding a sequence of integers shifted by 1 as a column to my data frame, but I'm new to Pandas and I'm wondering if a cleaner way exists.

Answer 1

Accepted answer by alko

The index is an object, and the default index starts from 0:

>>> result.index
Int64Index([0, 1, 2], dtype=int64)

You can shift this index by 1 with:

>>> result.index += 1 
>>> result.index
Int64Index([1, 2, 3], dtype=int64)
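
To check the result end to end, here is a minimal sketch that reuses the question's DataFrame, shifts the index, and writes the CSV to an in-memory buffer rather than a file. The buffer is only a convenience for this illustration; passing a file name such as 'result.csv' works the same way.

```python
import io
import pandas as pd

result = pd.DataFrame({'Count': [83, 19, 20]})
result.index += 1  # row labels become 1, 2, 3

# Write to an in-memory text buffer so the output can be inspected directly.
buffer = io.StringIO()
result.to_csv(buffer, index_label='Event_id')

csv_lines = buffer.getvalue().splitlines()
print(csv_lines)
```

Note that the shift changes the labels on result itself, so any later label-based lookups on it will also be 1-based.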

Answer 2

Answered by TomAugspurger

Just set the index before writing to CSV.

df.index = np.arange(1, len(df) + 1)  # stop is exclusive, so add 1 to cover every row

And then write it normally.

Answer 3

Answered by Dung

source: In Python pandas, start row index from 1 instead of zero without creating additional column

Working example:

import pandas as pdas

# input_file is the path to the CSV file to load
dframe = pdas.read_csv(input_file)
dframe.index = dframe.index + 1
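
The snippet above leaves input_file undefined, so here is a self-contained sketch of the same idea. The CSV text is made-up sample data standing in for the file, and io.StringIO is used only so the example runs without touching disk.

```python
import io
import pandas as pd

# Made-up CSV content standing in for the real input file.
csv_text = "Count\n83\n19\n20\n"

dframe = pd.read_csv(io.StringIO(csv_text))
dframe.index = dframe.index + 1  # default labels 0..2 become 1..3
print(dframe)
```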

Answer 4

Answered by Imran

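Another way in one line, although note that it discards the last row of data (shift() moves the values down one position, and the slice then drops the leading NaN row):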

df.shift()[1:]
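
A quick check on the question's data shows what this one-liner returns compared with relabeling the index. This is only a sketch; the names shifted and relabeled are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Count': [83, 19, 20]})

# shift() moves the values down one row (leaving NaN at the top);
# [1:] then drops that first row, so only two rows remain.
shifted = df.shift()[1:]
print(shifted)

# Relabeling the index keeps every row and only changes the labels.
relabeled = df.copy()
relabeled.index = relabeled.index + 1
print(relabeled)
```

The labels of shifted do start at 1, but the row holding 20 is gone and the integer column has become float because of the intermediate NaN.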

Answer 5

Answered by Liu Yu

This worked for me:

df.index = np.arange(1, len(df)+1)

Answer 6

Answered by Utku

You can use this one:

import pandas as pd

result = pd.DataFrame({'Count': [83, 19, 20]})
result.index += 1
print(result)

Or this one, with the help of the NumPy library:

import pandas as pd
import numpy as np

result = pd.DataFrame({'Count': [83, 19, 20]})
result.index = np.arange(1, len(result)+1)
print(result)

np.arange creates a NumPy array of the values in the half-open interval from 1 up to (but not including) len(result)+1, and that array is then assigned to result.index.
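
Because the stop value is excluded, the +1 matters. The sketch below, again on the question's data, shows the labels np.arange produces and what happens if the +1 is left out. The names labels, too_short, and mismatch are chosen here for illustration.

```python
import numpy as np
import pandas as pd

result = pd.DataFrame({'Count': [83, 19, 20]})

# stop is exclusive: arange(1, 4) yields 1, 2, 3
labels = np.arange(1, len(result) + 1)
result.index = labels

# Without the +1 there is one label too few, and pandas rejects the assignment.
too_short = result.copy()
mismatch = None
try:
    too_short.index = np.arange(1, len(too_short))
except ValueError as exc:
    mismatch = str(exc)

print(labels, mismatch)
```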

Answer 7

Answered by ivanleoncz

Building on the original answer, with my two cents:

If I'm not mistaken, starting from version 0.23, the default index object is of type RangeIndex.

From the official doc:

RangeIndex is a memory-saving special case of Int64Index limited to representing monotonic ranges. Using RangeIndex may in some instances improve computing speed.

For a huge index range this makes sense: the index is stored as a compact description of the range (start, stop, step) instead of materializing every label at once, which saves memory.

Here is an example (using a Series, but the same applies to a DataFrame):

>>> import pandas as pd
>>> countries = ['China', 'India', 'USA']
>>> ds = pd.Series(countries)
>>> type(ds.index)
<class 'pandas.core.indexes.range.RangeIndex'>
>>> ds.index
RangeIndex(start=0, stop=3, step=1)
>>> ds.index += 1
>>> ds.index
RangeIndex(start=1, stop=4, step=1)
>>> ds
1    China
2    India
3      USA
dtype: object

As you can see, incrementing the index object changes its start and stop parameters.
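
As a rough illustration of the memory point, the sketch below builds the same 1-based labels two ways: as a pd.RangeIndex and from a materialized NumPy array. The size n is an arbitrary value picked for the example, and the exact byte counts depend on the pandas version, so only the comparison is meaningful.

```python
import numpy as np
import pandas as pd

n = 100_000  # arbitrary size, for illustration only

# A RangeIndex stores just start, stop and step.
range_index = pd.RangeIndex(start=1, stop=n + 1)

# An index built from np.arange holds every label in an array.
array_index = pd.Index(np.arange(1, n + 1))

print(type(range_index).__name__)
print(range_index.memory_usage(), array_index.memory_usage())

# Either one can serve as a 1-based index for a Series or DataFrame.
ds = pd.Series(np.zeros(n), index=range_index)
```

This suggests that index += 1, which keeps a RangeIndex, is the lighter choice for very large frames, while the np.arange approach gives the same labels at the cost of storing them all.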
