pandas: accessing the index of a pandas Series

Notice: this page is based on a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/33541266/

Time: 2020-09-14 00:10:43  Source: igfitidea

Access the index of a pandas series

Tags: python, dictionary, pandas, series

Asked by Dirty_Fox

I am trying to identify which word occurs most often in a pandas DataFrame (df_temp in my code). So far I have this:

l = df_temp['word'].value_counts()

l is then a pandas Series whose first row corresponds to the most frequent value (in my case, the most frequent word) in df_temp['word']. Although I can see the word in my console, I cannot retrieve it programmatically. The only way I have found so far is to convert the Series into a dictionary:

dl = dict(l)

and then I can easily retrieve my index... after sorting the dictionary. This does the job, but I am pretty sure there is a smarter solution, as this one is rather dirty and inelegant.

Thanks in advance

Answered by Joe T. Boka

Using pandas, you can find the most frequent value in the word column:

df['word'].value_counts().idxmax()

and the code below gives you the count for that value, which is the maximum count in that column:

df['word'].value_counts().max()
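
As a minimal sketch of the two calls above, assuming a small made-up DataFrame (the sample data and the variable names counts, most_common_word and highest_count are illustrative, not from the question):

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's df_temp
df = pd.DataFrame({'word': ['hello', 'world', 'python', 'hello', 'python', 'python']})

# Series whose index holds the distinct words and whose values hold their counts
counts = df['word'].value_counts()

most_common_word = counts.idxmax()  # index label that has the largest count
highest_count = counts.max()        # the largest count itself

print(most_common_word, highest_count)
```

Computing value_counts() once and reusing the result avoids counting the column twice.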

Answered by EdChum

The index of the result of value_counts() holds your values:

l.index

will give you the values that were counted

Example:

In [163]:
df = pd.DataFrame({'a':['hello','world','python','hello','python','python']})
df

Out[163]:
        a
0   hello
1   world
2  python
3   hello
4  python
5  python

In [165]:    
df['a'].value_counts()

Out[165]:
python    3
hello     2
world     1
Name: a, dtype: int64

In [164]:    
df['a'].value_counts().index

Out[164]:
Index(['python', 'hello', 'world'], dtype='object')

So basically you can get a specific word count by indexing the series:

In [167]:
l = df['a'].value_counts()
l['hello']

Out[167]:
2
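
Because value_counts() sorts its result in descending order by default, the first index label is the most frequent word, so no dictionary conversion should be needed. A short sketch along those lines, reusing the example data (the names top_word, top_count and pairs are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({'a': ['hello', 'world', 'python', 'hello', 'python', 'python']})
l = df['a'].value_counts()  # sorted by count, largest first

top_word = l.index[0]   # first index label, i.e. the most frequent word
top_count = l.iloc[0]   # positional access to the matching count

# Iterate over (word, count) pairs in descending order of count
pairs = list(l.items())

print(top_word, top_count)
```

If several words tie for the highest count, this returns only one of them.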