pandas DataFrame: how to cut a DataFrame using custom rules?

Question

Asked by Peng He

I want to cut a DataFrame into several DataFrames using my own rules.

>>> import numpy as np
>>> import pandas as pd
>>> data = pd.DataFrame({'distance':[1,2,3,4,5,6,7,8,9,10],'values':np.arange(0,1,0.1)})
>>> data
   distance  values
0         1     0.0
1         2     0.1
2         3     0.2
3         4     0.3
4         5     0.4
5         6     0.5
6         7     0.6
7         8     0.7
8         9     0.8
9        10     0.9

I want to cut data according to the values of the distance column. For example, given the bins [1,3), [3,8), [8,10), [10,inf), rows whose distance falls in the same bin go into the same group, and I then compute the mean or the sum of the values column for each group. That is:

>>> data1 = data[lambda df:(df.distance >= 1) & (df.distance < 3)]
>>> data1
   distance  values
0         1     0.0
1         2     0.1
>>> np.mean(data1['values'])
0.05

How can I efficiently cut the original DataFrame into several groups (and then save them, process them, ...)?

Answer 1

Answered by Bob Baxley

The pandas cut function (pd.cut) is useful for this:

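# right=False makes each bin left-closed, e.g. [1, 3)
data['categories'] = pd.cut(data['distance'], [-np.inf, 1, 3, 8, 10, np.inf], right=False)
# observed=False keeps empty bins such as [-inf, 1) in the result
data.groupby('categories', observed=False).mean()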

Output:

            distance  values
categories
[-inf, 1)        NaN     NaN
[1, 3)           1.5    0.05
[3, 8)           5.0    0.40
[8, 10)          8.5    0.75
[10, inf)       10.0    0.90
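
The question also asks for the sum per bin and for the pieces as separate DataFrames that can be saved or processed later. A minimal sketch of one way to do that with the same pd.cut approach is below. The bin labels ("1-3", "3-8", ...) and the names bin_labels, summary and pieces are illustrative assumptions, not part of the original answer.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({
    "distance": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "values": np.arange(0, 1, 0.1),
})

# Left-closed bins [1, 3), [3, 8), [8, 10), [10, inf), as in the question.
edges = [1, 3, 8, 10, np.inf]
bin_labels = ["1-3", "3-8", "8-10", "10+"]  # hypothetical, file-name-friendly labels
bins = pd.cut(data["distance"], edges, right=False, labels=bin_labels).rename("bin")

# Mean and sum of `values` for every bin in a single groupby pass.
summary = data.groupby(bins, observed=True)["values"].agg(["mean", "sum"])

# One DataFrame per bin, keyed by its label, for later saving or processing.
pieces = {label: group for label, group in data.groupby(bins, observed=True)}

print(summary)
print(pieces["1-3"])
```

Each entry of pieces is an ordinary DataFrame, so it can be written out with to_csv or passed to any other processing step. Passing observed=True skips bins that contain no rows.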
