Python Pandas: Converting ".value_counts" output to a DataFrame

Question

Asked by s900n

Hi, I want to get the counts of the unique values in a dataframe column. value_counts does this, but I want to use its output somewhere else. How can I convert the .value_counts output to a pandas DataFrame? Here is some example code:

import pandas as pd
df = pd.DataFrame({'a':[1, 1, 2, 2, 2]})
value_counts = df['a'].value_counts(dropna=True, sort=True)
print(value_counts)
print(type(value_counts))

The output is:

2    3
1    2
Name: a, dtype: int64
<class 'pandas.core.series.Series'>

What I need is a DataFrame like this:

unique_values  counts
2              3
1              2

Thank you.

Answer 1

Answered by jezrael

Use rename_axis to name the index (it becomes the column name) and then reset_index:

df = value_counts.rename_axis('unique_values').reset_index(name='counts')
print(df)
   unique_values  counts
0              2       3
1              1       2

Or, if you need a one-column DataFrame, use Series.to_frame:

df = value_counts.rename_axis('unique_values').to_frame('counts')
print(df)
               counts
unique_values        
2                   3
1                   2
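For reference, both variants can be run end to end as a minimal sketch that starts from the question's `value_counts` Series. The result names `counts_df` and `counts_indexed` are chosen here for illustration and do not appear in the answer.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 1, 2, 2, 2]})
value_counts = df['a'].value_counts(dropna=True, sort=True)

# Variant 1: name the index, then move it into a regular column
# next to the counts.
counts_df = value_counts.rename_axis('unique_values').reset_index(name='counts')

# Variant 2: keep the unique values as the index and return a
# one-column DataFrame.
counts_indexed = value_counts.rename_axis('unique_values').to_frame('counts')

print(counts_df)
print(counts_indexed)
```

Passing `name='counts'` to reset_index and `'counts'` to to_frame sets the count column's name explicitly, so the result does not depend on what the Series itself happens to be called.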

Answer 2

Answered by WY Hsu

I just ran into the same problem, so I'll share my thoughts here.

Warning

When you work with Pandas data structures, you have to be aware of the return type.

Another solution

As @jezrael mentioned above, Pandas does provide the API pd.Series.to_frame.

Step 1

You can also wrap the pd.Series in a pd.DataFrame by just doing

df_value_counts = pd.DataFrame(value_counts) # wrap pd.Series in a pd.DataFrame

Then you have a pd.DataFrame with the column name 'a', and your first column becomes the index:

Input:  print(df_value_counts.index.values)
Output: [2 1]

Input:  print(df_value_counts.columns)
Output: Index(['a'], dtype='object')

Step 2

What now?

If you want to add new column names here, you can simply reset the index of the pd.DataFrame with reset_index().

Then assign a list of new column names to df.columns:

df_value_counts = df_value_counts.reset_index()
df_value_counts.columns = ['unique_values', 'counts']

Then you have what you need:

Output:

       unique_values    counts
    0              2         3
    1              1         2

Full answer

import pandas as pd

df = pd.DataFrame({'a':[1, 1, 2, 2, 2]})
value_counts = df['a'].value_counts(dropna=True, sort=True)

# solution here
df_value_counts = pd.DataFrame(value_counts)
df_value_counts = df_value_counts.reset_index()
df_value_counts.columns = ['unique_values', 'counts'] # change column names
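One caveat on the steps above: the intermediate names depend on the pandas version. As far as I know, from pandas 2.0 onward the Series returned by value_counts is named 'count' and its index takes the original column's name, while earlier versions name the Series after the original column, so the `columns` printed in Step 1 may differ on your machine. Assigning the final names by position, as Step 2 does, should work either way. A small sketch to check this:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 1, 2, 2, 2]})
value_counts = df['a'].value_counts(dropna=True, sort=True)

df_value_counts = pd.DataFrame(value_counts).reset_index()

# The intermediate column names vary by pandas version, so inspect
# them rather than relying on a fixed value.
print(df_value_counts.columns.tolist())

# Positional assignment gives the same final result on any version.
df_value_counts.columns = ['unique_values', 'counts']
print(df_value_counts)
```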

Answer 3

Answered by Constantino

I'll throw my hat in as well. This is essentially the same as @wy-hsu's solution, but in function form:

def value_counts_df(df, col):
    """
    Returns pd.value_counts() as a DataFrame

    Parameters
    ----------
    df : Pandas Dataframe
        Dataframe on which to run value_counts(), must have column `col`.
    col : str
        Name of column in `df` for which to generate counts

    Returns
    -------
    Pandas Dataframe
        Returned dataframe will have a single column named "count" which contains the value_counts()
        for each unique value of df[col]. The index name of this dataframe is `col`.

    Example
    -------
    >>> value_counts_df(pd.DataFrame({'a':[1, 1, 2, 2, 2]}), 'a')
       count
    a
    2      3
    1      2
    """
    df = pd.DataFrame(df[col].value_counts())
    df.index.name = col
    df.columns = ['count']
    return df
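The function above keeps the unique values in the index. If you want the flat two-column layout from the question instead, a variant could look like the sketch below. The helper name `value_counts_flat` and its `value_name` and `count_name` parameters are hypothetical, introduced only for this illustration.

```python
import pandas as pd


def value_counts_flat(df, col, value_name='unique_values', count_name='counts'):
    """Hypothetical helper: value_counts() of df[col] as a two-column DataFrame."""
    return (
        df[col]
        .value_counts()
        .rename_axis(value_name)        # name the index holding the unique values
        .reset_index(name=count_name)   # move it into a column beside the counts
    )


result = value_counts_flat(pd.DataFrame({'a': [1, 1, 2, 2, 2]}), 'a')
print(result)
```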
