Python: Convert a Counter object to a Pandas DataFrame

Question

Asked by woshitom

I used Counter on a list to compute this variable:

final = Counter(event_container)

print final gives:

Counter({'fb_view_listing': 76, 'fb_homescreen': 63, 'rt_view_listing': 50, 'rt_home_start_app': 46, 'fb_view_wishlist': 39, 'fb_view_product': 37, 'fb_search': 29, 'rt_view_product': 23, 'fb_view_cart': 22, 'rt_search': 12, 'rt_view_cart': 12, 'add_to_cart': 2, 'create_campaign': 1, 'fb_connect': 1, 'sale': 1, 'guest_sale': 1, 'remove_from_cart': 1, 'rt_transaction_confirmation': 1, 'login': 1})

Now I want to convert final into a Pandas DataFrame, but when I do:

final_df = pd.DataFrame(final)

I get an error.

I guess final is not a proper dictionary, so how can I convert final to a dictionary? Or is there another way to convert final to a DataFrame?

Answer 1

Accepted answer by EdChum

You can construct the DataFrame using from_dict with the parameter orient='index', then call reset_index so you get a two-column df:

In [40]:
import pandas as pd
from collections import Counter
d = Counter({'fb_view_listing': 76, 'fb_homescreen': 63, 'rt_view_listing': 50, 'rt_home_start_app': 46, 'fb_view_wishlist': 39, 'fb_view_product': 37, 'fb_search': 29, 'rt_view_product': 23, 'fb_view_cart': 22, 'rt_search': 12, 'rt_view_cart': 12, 'add_to_cart': 2, 'create_campaign': 1, 'fb_connect': 1, 'sale': 1, 'guest_sale': 1, 'remove_from_cart': 1, 'rt_transaction_confirmation': 1, 'login': 1})
df = pd.DataFrame.from_dict(d, orient='index').reset_index()
df

Out[40]:
                          index   0
0                         login   1
1   rt_transaction_confirmation   1
2                  fb_view_cart  22
3                    fb_connect   1
4               rt_view_product  23
5                     fb_search  29
6                          sale   1
7               fb_view_listing  76
8                   add_to_cart   2
9                  rt_view_cart  12
10                fb_homescreen  63
11              fb_view_product  37
12            rt_home_start_app  46
13             fb_view_wishlist  39
14              create_campaign   1
15                    rt_search  12
16                   guest_sale   1
17             remove_from_cart   1
18              rt_view_listing  50

You can rename the columns to something more meaningful:

In [43]:
df = df.rename(columns={'index':'event', 0:'count'})
df

Out[43]:
                          event  count
0                         login      1
1   rt_transaction_confirmation      1
2                  fb_view_cart     22
3                    fb_connect      1
4               rt_view_product     23
5                     fb_search     29
6                          sale      1
7               fb_view_listing     76
8                   add_to_cart      2
9                  rt_view_cart     12
10                fb_homescreen     63
11              fb_view_product     37
12            rt_home_start_app     46
13             fb_view_wishlist     39
14              create_campaign      1
15                    rt_search     12
16                   guest_sale      1
17             remove_from_cart      1
18              rt_view_listing     50
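The steps above can be wrapped into one small function. This is only a sketch: the helper name counter_to_frame and its key_name/count_name parameters are made up for illustration, and the sample data is a shortened stand-in for the real event list. It also shows that a Counter is already a dict subclass, and that the plain constructor fails because every value is a scalar and pandas has no index to build rows from.

```python
from collections import Counter

import pandas as pd


def counter_to_frame(counter, key_name="event", count_name="count"):
    """Hypothetical helper: turn a Counter into a two-column DataFrame."""
    # Keys become the index, counts land in a single column named 0.
    df = pd.DataFrame.from_dict(counter, orient="index").reset_index()
    # reset_index() moves the keys into a column called "index".
    return df.rename(columns={"index": key_name, 0: count_name})


# Shortened sample data standing in for event_container.
final = Counter(["login", "sale", "login", "fb_search", "login"])

# Counter subclasses dict, so it is already a "proper" dictionary.
print(isinstance(final, dict))

# The plain constructor treats each key as a column holding one scalar,
# which pandas rejects unless an index is supplied.
try:
    pd.DataFrame(final)
except ValueError as exc:
    print("ValueError:", exc)

final_df = counter_to_frame(final)
print(final_df)
```

Depending on your Python and pandas versions, the row order may follow insertion order rather than the order shown in the output above.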

Answer 2

Answered by galath

If you want one row per key, set the keyword argument orient='index' when creating a DataFrame from a dictionary using from_dict. The keys become the index and the counts go into a single column; call reset_index afterwards if you need the keys as a regular second column:

final_df = pd.DataFrame.from_dict(final, orient='index')

See the documentation on DataFrame.from_dict
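As a rough illustration of what that call returns, the sketch below uses a three-entry Counter (a shortened stand-in for the real data) and then labels the result. The names "count" and "event" are arbitrary choices for this example, not something pandas assigns.

```python
from collections import Counter

import pandas as pd

# Shortened sample data for illustration.
final = Counter({"fb_view_listing": 76, "fb_homescreen": 63, "login": 1})

final_df = pd.DataFrame.from_dict(final, orient="index")

# The single value column is named 0 by default; give it a readable name.
final_df.columns = ["count"]
final_df.index.name = "event"

print(final_df.shape)                          # one row per key, one column
print(final_df.loc["fb_homescreen", "count"])  # look a count up by its key

# Move the keys out of the index to get two ordinary columns.
two_col = final_df.reset_index()
print(two_col)
```

Keeping the keys in the index is convenient for lookups by event name; resetting the index is convenient for plotting or merging.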

Answer 3

Answered by Suzana

I found it more useful to transform the Counter to a pandas Series that is already ordered by count and where the ordered items are the index, so I used zip:

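import pandas as pd

def counter_to_series(counter):
    # zip(*[]) cannot be unpacked into two names, so handle the empty case first.
    if not counter:
        return pd.Series(dtype='int64')
    # most_common() with no argument returns every (item, count) pair, highest count first.
    counter_as_tuples = counter.most_common()
    items, counts = zip(*counter_as_tuples)
    return pd.Series(counts, index=items)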

The most_common method of the Counter object returns a list of (item, count) tuples. Unpacking the result of zip raises a ValueError when the counter has no items, so an empty Counter must be checked beforehand.
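For comparison, here is a sketch of a variant that avoids zip and therefore needs no empty check. It is not the approach from this answer: it assumes Python 3.7+ (where dict keeps insertion order) and a pandas version that preserves dict order when building a Series. The function name counter_to_sorted_series is made up for this example.

```python
from collections import Counter

import pandas as pd


def counter_to_sorted_series(counter):
    """Hypothetical variant: Series ordered by count, without zip."""
    # dict(...) keeps the order produced by most_common() on Python 3.7+.
    return pd.Series(dict(counter.most_common()), dtype="int64")


# Why the zip-based version needs its guard: an empty zip yields nothing,
# so unpacking it into two names fails.
try:
    items, counts = zip(*Counter().most_common())
except ValueError as exc:
    print("empty counter:", exc)

s = counter_to_sorted_series(Counter("aabbbc"))
empty = counter_to_sorted_series(Counter())
print(s)
print(len(empty))
```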

Answer 4

Answered by pvasek

Another option is to use the DataFrame.from_records method:

import pandas as pd
from collections import Counter

c = Counter({'fb_view_listing': 76, 'fb_homescreen': 63, 'rt_view_listing': 50, 'rt_home_start_app': 46, 'fb_view_wishlist': 39, 'fb_view_product': 37, 'fb_search': 29, 'rt_view_product': 23, 'fb_view_cart': 22, 'rt_search': 12, 'rt_view_cart': 12, 'add_to_cart': 2, 'create_campaign': 1, 'fb_connect': 1, 'sale': 1, 'guest_sale': 1, 'remove_from_cart': 1, 'rt_transaction_confirmation': 1, 'login': 1})

df = pd.DataFrame.from_records(list(dict(c).items()), columns=['page','count'])

It's a one-liner and speed seems to be the same.

Or use this variant to have them sorted by most used. Again the performance is about the same.

df = pd.DataFrame.from_records(c.most_common(), columns=['page','count'])
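Since most_common accepts an optional n, the same pattern can be limited to the top few entries. A minimal sketch, using a shortened Counter as sample data and keeping the 'page' and 'count' column names from above:

```python
from collections import Counter

import pandas as pd

# Shortened sample data for illustration.
c = Counter({
    "fb_view_listing": 76,
    "fb_homescreen": 63,
    "rt_view_listing": 50,
    "rt_search": 12,
    "login": 1,
})

# most_common(3) returns the three (key, count) pairs with the highest counts,
# already sorted in descending order, so no extra sort_values call is needed.
top3 = pd.DataFrame.from_records(c.most_common(3), columns=["page", "count"])
print(top3)
```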
