Notice: this page is a side-by-side Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original URL: http://stackoverflow.com/questions/52139506/

Time: 2020-09-14 05:59:57  Source: igfitidea

Accessing a Pandas index like a regular column

Tags: python, pandas, dataframe, indexing, series

Asked by kuzzooroo

I have a Pandas DataFrame with a named index. I want to pass it to a piece of code that takes a DataFrame, a column name, and some other arguments, and does a bunch of work involving that column. In this case, however, the column I want to highlight is the index, and passing the index's label to this code doesn't work because you can't extract an index the way you can a regular column. For example, I can construct a DataFrame like this:

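import numpy as np
import pandas as pd

# Build a small frame, then promote the 'name' column to a named index.
# list(...) is needed on Python 3, where map() returns a lazy iterator.
df = pd.DataFrame({'name': list(map(chr, range(97, 102))), 'id': range(10000, 10005), 'value': np.random.randn(5)})
df.set_index('name', inplace=True)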

Here's the result:

         id     value
name                 
a     10000  0.659710
b     10001  1.001821
c     10002 -0.197576
d     10003 -0.569181
e     10004 -0.882097

Now how can I access the name column?

print(df.index)  # No problem
print(df['name'])  # KeyError: 'name'

I know there are workarounds, like duplicating the column or changing the index to something else. But is there something cleaner, like some form of column access that treats the index the same way as everything else?

Answered by jpp

Index has a special meaning in Pandas. It's used to optimise specific operations and can be used in various methods such as merging / joining data. Therefore, make a choice:

  • If it's "just another column", use reset_index and treat it as another column.
  • If it's genuinely used for indexing, keep it as an index and use df.index.

We can't make this choice for you. It should be dependent on the structure of your underlying data and on how you intend to analyse your data.
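
As a minimal sketch of the two options above (the frame and its values are made up for illustration and are not the question's random data):

```python
import pandas as pd

# Illustrative frame with a named index, mirroring the shape used in the question.
df = pd.DataFrame(
    {"id": range(10000, 10005), "value": [0.5, 1.0, -0.2, -0.6, -0.9]},
    index=pd.Index(list("abcde"), name="name"),
)

# Option 1: "just another column". reset_index() returns a new frame in which
# the old index becomes an ordinary column; df itself is left unchanged.
flat = df.reset_index()
name_as_column = flat["name"]

# Option 2: genuinely an index. Read the labels straight from df.index.
name_as_index = df.index
```

With option 1, code that expects a column name can be handed flat and 'name' directly. With option 2, the caller has to know that the values live in the index.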

For more information on use of a dataframe index, see:

Answered by Ian Ash

Instead of using reset_index, you could just copy the index to a normal column, do some work and then drop the column, for example:

df['tmp'] = df.index
# do stuff based on df['tmp']
del df['tmp']
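
If the downstream code can be wrapped, a similar effect is possible without mutating df at all. The sketch below is only an illustration: column_or_index is a hypothetical helper name, not a pandas function, and it assumes the label is either a column or the name of an index level.

```python
import pandas as pd


def column_or_index(df, label):
    """Return `label` as a Series, whether it is a column or an index level.

    Hypothetical helper for illustration; not part of pandas.
    """
    if label in df.columns:
        return df[label]
    if label in df.index.names:
        # get_level_values works for both a plain Index and a MultiIndex.
        values = df.index.get_level_values(label)
        return pd.Series(values, index=df.index, name=label)
    raise KeyError(label)


df = pd.DataFrame(
    {"id": range(10000, 10005)},
    index=pd.Index(list("abcde"), name="name"),
)
names = column_or_index(df, "name")  # taken from the index
ids = column_or_index(df, "id")      # taken from a regular column
```

Unlike the temporary-column approach, this leaves df untouched, at the cost of requiring the calling code to go through the helper instead of plain df[label].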