pandas: Create a pandas dataframe of counts

Question

Asked by Tchotchke

I want to create a pandas dataframe with two columns, the first being the unique values of one of my columns and the second being the count of unique values.

I have seen many posts (such as here) that describe how to get the counts, but the issue I'm running into is that when I try to create a dataframe, the column values become my index.

Sample data: df = pd.DataFrame({'Color': ['Red', 'Red', 'Blue'], 'State': ['MA', 'PA', 'PA']}). I want to end up with a dataframe like:

  Color  Count
0   Red      2
1  Blue      1

I have tried the following, but in all cases the index ends up as Color and the Count is the only column in the dataframe.

Attempt 1:

df2 = pd.DataFrame(data=df['Color'].value_counts())
# And resetting the index just gets rid of Color, which I want to keep
df2 = df2.reset_index(drop=True)

Attempt 2:

df3 = df['Color'].value_counts()
df3 = pd.DataFrame(data=df3, index=range(df3.shape[0]))

Attempt 3:

df4 = df.groupby('Color')
df4 = pd.DataFrame(df4['Color'].count())

Answer 1

Answered by Phillip Cloud

Another way to do this, using value_counts:

In [10]: df = pd.DataFrame({'Color': ['Red', 'Red', 'Blue'], 'State': ['MA', 'PA', 'PA']})

In [11]: df.Color.value_counts().reset_index().rename(columns={'index': 'Color', 0: 'count'})
Out[11]:
  Color  count
0   Red      2
1  Blue      1
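
The column labels that `reset_index()` produces after `value_counts()` have differed between pandas versions (pandas 2.0 names the counts column `count` and carries the original column name over to the index), so the `rename` mapping above may not match on every installation. A minimal sketch that names both columns explicitly instead, assuming the `Color` and `Count` labels from the question:

```python
import pandas as pd

df = pd.DataFrame({'Color': ['Red', 'Red', 'Blue'], 'State': ['MA', 'PA', 'PA']})

# Name the index explicitly, then name the counts column while
# moving the index back into a regular column.
res = (
    df['Color']
    .value_counts()
    .rename_axis('Color')
    .reset_index(name='Count')
)
print(res)
```

Because both labels are set explicitly, this should not depend on what `value_counts()` calls its result in a given version.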

Answer 2

Answered by mdurant

Essentially equivalent to setting the column names, but using the rename method instead:

df.groupby('Color').count().reset_index().rename(columns={'State': 'Count'})
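
One caveat: `count()` counts the non-null values of each remaining column, so it matches the number of rows per group only when `State` has no missing values. A sketch comparing it with `size()`, which counts rows; the missing `State` value below is hypothetical data added for illustration and is not part of the question:

```python
import pandas as pd

# Hypothetical variation of the sample data with one missing State value.
df = pd.DataFrame({'Color': ['Red', 'Red', 'Blue'], 'State': ['MA', None, 'PA']})

# size() counts rows per group, including rows where other columns are null.
by_size = df.groupby('Color').size().reset_index(name='Count')

# count() counts only the non-null State values per group.
by_count = (
    df.groupby('Color').count().reset_index().rename(columns={'State': 'Count'})
)

print(by_size)
print(by_count)
```

Here the Red group has two rows but only one non-null `State`, so the two results differ.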

Answer 3

Answered by jpp

One readable solution is to use the to_frame and rename_axis methods:

res = df['Color'].value_counts()\
                 .to_frame('count').rename_axis('Color')\
                 .reset_index()

print(res)

  Color  count
0   Red      2
1  Blue      1

Answer 4

Answered by khammel

df = df.groupby('Color').count().reset_index()
df.columns = ['Color', 'Count']
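
The groupby-based answers return the groups sorted by key (Blue before Red), whereas the output requested in the question lists the most frequent value first. If that ordering matters, a minimal sketch that sorts by count afterwards (the variable name `counts` is chosen here for illustration):

```python
import pandas as pd

df = pd.DataFrame({'Color': ['Red', 'Red', 'Blue'], 'State': ['MA', 'PA', 'PA']})

counts = df.groupby('Color').count().reset_index()
# Assigning the labels directly works only because exactly two columns remain.
counts.columns = ['Color', 'Count']

# Put the most frequent value first and renumber the rows from 0.
counts = counts.sort_values('Count', ascending=False).reset_index(drop=True)
print(counts)
```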

Answer 5

Answered by letterjung

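# Assumes `score` is an existing sequence of numbers and `data` is an
# existing DataFrame with the same number of rows (neither is defined here).
label_sentiment = []
for s in score:
    if s == 0:
        label_sentiment.append('NEUTRAL')
    elif s > 0:
        label_sentiment.append('POSITIVE')
    elif s < 0:
        label_sentiment.append('NEGATIVE')
data['label_sentiment'] = label_sentiment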
