Access the index of a pandas Series
Original question: http://stackoverflow.com/questions/33541266/
Notice: this content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors.
Asked by Dirty_Fox
I am trying to identify which word occurs most often in a pandas DataFrame (df_temp in my code). So far I have this:
l = df_temp['word'].value_counts()
l is then a pandas Series whose first row corresponds to the most counted value (in my case the most counted word) in df_temp['word']. Although I can see the word in my console, I cannot retrieve it programmatically. The only way I have found so far is to convert the Series into a dictionary, so I have:
dl = dict(l)
and then I can easily retrieve my index... after sorting the dictionary. This does the job, but I am pretty sure there is a smarter solution, as this one is very dirty and inelegant.
Thanks in advance
Answered by Joe T. Boka
Using pandas, you can find the most frequent value in the word column:
df['word'].value_counts().idxmax()
The code below gives you the count for that value, which is the maximum count in that column:
df['word'].value_counts().max()
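As a quick check of the two calls above, here is a minimal sketch that runs them together. The sample data is made up for illustration, since the question does not show the contents of df_temp.

```python
import pandas as pd

# Hypothetical sample data; the asker's real df_temp is not shown in the question
df = pd.DataFrame({'word': ['hello', 'world', 'python', 'hello', 'python', 'python']})

counts = df['word'].value_counts()

most_common_word = counts.idxmax()   # index label that has the largest count
most_common_count = counts.max()     # the largest count itself

print(most_common_word, most_common_count)  # prints: python 3
```

Storing the result of value_counts() in a variable avoids counting the column twice when you need both the word and its count.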
Answered by EdChum
The index of the result of value_counts() holds your values:
l.index
will give you the values that were counted.
Example:
In [163]:
import pandas as pd
df = pd.DataFrame({'a':['hello','world','python','hello','python','python']})
df
Out[163]:
        a
0   hello
1   world
2  python
3   hello
4  python
5  python
In [165]:
df['a'].value_counts()
Out[165]:
python    3
hello     2
world     1
Name: a, dtype: int64
In [164]:
df['a'].value_counts().index
Out[164]:
Index(['python', 'hello', 'world'], dtype='object')
So basically you can get a specific word count by indexing the series:
In [167]:
l = df['a'].value_counts()
l['hello']
Out[167]:
2
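To tie this back to the original question: value_counts() sorts by count in descending order by default, so the first entry of the index is the most counted word. The sketch below reuses the example frame from this answer; the variable names top_word, top_count, and words are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a': ['hello', 'world', 'python', 'hello', 'python', 'python']})

# value_counts() returns the counts sorted in descending order by default
l = df['a'].value_counts()

top_word = l.index[0]      # label of the first row, i.e. the most counted word
top_count = l.iloc[0]      # count of that word, accessed by position
words = l.index.tolist()   # all counted words as a plain Python list

print(top_word, top_count)  # prints: python 3
print(words)                # prints: ['python', 'hello', 'world']
```

This avoids the dict conversion and manual sorting described in the question, because the Series already carries the words as its index.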