pandas: How to create a frequency distribution table for given data with Python in a Jupyter notebook, using as little code as possible?

Question

Asked by Mainul Islam

Develop a frequency distribution summarizing this data. The data is the demand for an object over a period of 20 days.

2 1 0 2 1 3 0 2 4 0 3 2 3 4 2 2 2 4 3 0. The task is to create a table in a Jupyter notebook with the columns Demand and Frequency. Note: Demand has to be in ascending order. This is what I did:

list_of_days = [2, 1, 0, 2, 1, 3, 0, 2, 4, 0, 3, 2 ,3, 4, 2, 2, 2, 4, 3, 0] # created a list of the data
import pandas as pd
series_of_days = pd.Series(list_of_days) # converted the list to series
series_of_days.value_counts(ascending = True) # the frequency was ascending but not the demand
test = dict(series_of_days.value_counts())
freq_table =  pd.Series(test)
pd.DataFrame({"Demand":freq_table.index, "Frequency":freq_table.values})

The output has to look like this:

<table border = "1">

  <tr>
    <td>Demand</td>
    <td>Frequency</td>
  </tr>
  <tr>
    <td>0</td>
    <td>4</td>
  </tr>
  <tr>
    <td>1</td>
    <td>2</td>
  </tr>
  <tr>
    <td>2</td>
    <td>7</td>
  </tr>
</table>

and so on. Is there a better way to shorten the Python code, or to make it more efficient?

Answer 1

Answered by jezrael

You can use value_counts with reset_index, then sort with sort_values:

# Parentheses are needed so the method chain can span several lines
df1 = (pd.Series(list_of_days).value_counts()
         .reset_index()
         .sort_values('index')
         .reset_index(drop=True))
df1.columns = ['Demand', 'Frequency']
print(df1)
   Demand  Frequency
0       0          4
1       1          2
2       2          7
3       3          4
4       4          3

Another similar solution, sorting with sort_index instead:

# sort_index orders by demand; reset_index then already yields a fresh 0..n-1 index
df1 = (pd.Series(list_of_days)
         .value_counts()
         .sort_index()
         .reset_index())
df1.columns = ['Demand', 'Frequency']
print(df1)
   Demand  Frequency
0       0          4
1       1          2
2       2          7
3       3          4
4       4          3
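
The separate column assignment can also be folded into the chain. A minimal sketch, assuming a pandas version in which `Series.reset_index` accepts `name` (the variable `freq` is only illustrative and not part of the answer): naming the index with `rename_axis` before resetting it gives both column labels in one expression.

```python
import pandas as pd

list_of_days = [2, 1, 0, 2, 1, 3, 0, 2, 4, 0, 3, 2, 3, 4, 2, 2, 2, 4, 3, 0]

# Count each demand value, order by demand, then label both columns in the chain.
freq = (
    pd.Series(list_of_days)
    .value_counts()
    .sort_index()                    # ascending demand
    .rename_axis('Demand')           # name the index so reset_index uses it
    .reset_index(name='Frequency')   # the counts become the 'Frequency' column
)
print(freq)
```

In a notebook, leaving `freq` as the last expression of a cell renders it as a table.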

Answer 2

Answered by Mohammad Athar

import collections
collections.Counter(list_of_days)

This should do what you're describing.
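
`Counter` returns a dict-like mapping rather than a table. One possible way to turn it into the requested two-column table is sketched below with pandas; the names `counts` and `table` are illustrative and not part of the answer.

```python
from collections import Counter

import pandas as pd

list_of_days = [2, 1, 0, 2, 1, 3, 0, 2, 4, 0, 3, 2, 3, 4, 2, 2, 2, 4, 3, 0]

counts = Counter(list_of_days)

# sorted() orders the (demand, frequency) pairs by demand before building the frame.
table = pd.DataFrame(sorted(counts.items()), columns=['Demand', 'Frequency'])
print(table)
```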

Answer 3

Answered by piRSquared

I'm going for a literal recreation of the HTML table you posted:

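# axis=1 names the columns axis, so "Demand" is rendered in the header row next to "Frequency"
pd.Series([2,1,0,2,1,3,0,2,4,0,3,2,3,4,2,2,2,4,3,0]).value_counts().to_frame(name='Frequency').rename_axis('Demand', axis=1).sort_index()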

<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th>Demand</th>
      <th>Frequency</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>4</td>
    </tr>
    <tr>
      <th>1</th>
      <td>2</td>
    </tr>
    <tr>
      <th>2</th>
      <td>7</td>
    </tr>
    <tr>
      <th>3</th>
      <td>4</td>
    </tr>
    <tr>
      <th>4</th>
      <td>3</td>
    </tr>
  </tbody>
</table>
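
If the goal is markup closer to the two-column table in the question, without the bold row labels, a possible variation is to keep Demand as an ordinary column and ask `to_html` to omit the index. A sketch under that assumption (the names `freq` and `html` are illustrative):

```python
import pandas as pd

list_of_days = [2, 1, 0, 2, 1, 3, 0, 2, 4, 0, 3, 2, 3, 4, 2, 2, 2, 4, 3, 0]

# Build a two-column frame with Demand as a regular column.
freq = (
    pd.Series(list_of_days)
    .value_counts()
    .sort_index()
    .rename_axis('Demand')
    .reset_index(name='Frequency')
)

# index=False leaves out the 0..n-1 row labels from the generated markup.
html = freq.to_html(index=False)
print(html)
```

In a notebook, the string can be rendered by passing it to `IPython.display.HTML`.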

Answer 4

Answered by Po Stevanus Andrianta

If you want the shortest code, this is probably it. Note that Counter does not sort its keys (since Python 3.7 it keeps first-seen order), so sorted() is applied to get ascending demand.

from collections import Counter

list_of_days = [2, 1, 0, 2, 1, 3, 0, 2, 4, 0, 3, 2, 3, 4, 2, 2, 2, 4, 3, 0]
day_counter = sorted(Counter(list_of_days).items())  # (demand, frequency) pairs, ascending by demand
data = [[a, b] for a, b in day_counter]
print(data)

[[0, 4], [1, 2], [2, 7], [3, 4], [4, 3]]
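
Whether a `Counter` lists its keys in ascending order depends on the Python version: from Python 3.7 onward it keeps the order in which keys were first seen. A small check, assuming Python 3.7 or later (the names `first_seen` and `ascending` are illustrative):

```python
from collections import Counter

list_of_days = [2, 1, 0, 2, 1, 3, 0, 2, 4, 0, 3, 2, 3, 4, 2, 2, 2, 4, 3, 0]

counts = Counter(list_of_days)

# Iterating a Counter yields keys in first-seen order on Python 3.7+.
first_seen = list(counts)

# sorted() on the items gives ascending demand regardless of insertion order.
ascending = [list(pair) for pair in sorted(counts.items())]

print(first_seen)
print(ascending)
```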
