Python: a better way to add a constant column to a pandas DataFrame
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/29337058/
Better way to add constant column to pandas data frame
Asked by Haleemur Ali
Currently, when I have to add a constant column to an existing data frame, I do the following. It does not seem very elegant to me (specifically the part where I multiply by the length of the data frame). I am wondering whether there is a better way to do this.
import pandas as pd
testdf = pd.DataFrame({'categories': ['bats', 'balls', 'paddles'],
                       'skus': [50, 5000, 32],
                       'sales': [500, 700, 90]})
testdf['avg_sales_per_sku'] = [testdf.sales.sum() / testdf.skus.sum()] * len(testdf)
Accepted answer by Geeklhem
You can fill the column implicitly by assigning a single number.
testdf['avg_sales_per_sku'] = testdf.sales.sum() / testdf.skus.sum()
From the documentation:
"When inserting a scalar value, it will naturally be propagated to fill the column"
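As a minimal sketch of that scalar propagation, reusing the testdf from the question: assigning a plain number fills every row, and DataFrame.assign does the same while returning a new frame instead of modifying the original. The names `ratio`, `with_copy`, and `avg_copy` are illustrative only and do not come from the original answer.

```python
import pandas as pd

testdf = pd.DataFrame({'categories': ['bats', 'balls', 'paddles'],
                       'skus': [50, 5000, 32],
                       'sales': [500, 700, 90]})

# One scalar: total sales divided by total SKUs.
ratio = testdf['sales'].sum() / testdf['skus'].sum()

# Assigning a scalar broadcasts it to every row of the new column,
# so there is no need to build a list of len(testdf) copies.
testdf['avg_sales_per_sku'] = ratio

# assign() also accepts a scalar; it returns a new DataFrame and
# leaves testdf itself unchanged (avg_copy is a hypothetical name).
with_copy = testdf.assign(avg_copy=ratio)
```

The in-place assignment is the shorter form; assign may be preferable when chaining operations, since it does not mutate the original frame.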
Answer by Alexander
To me it seems confusing to mix the per-category average with the aggregate average under one column name. You could also keep them in separate columns:
testdf['avg_sales_per_sku'] = testdf.sales / testdf.skus
testdf['avg_agg_sales_per_agg_sku'] = testdf.sales.sum() / float(testdf.skus.sum()) # float is for Python2
>>> testdf
  categories  sales  skus  avg_sales_per_sku  avg_agg_sales_per_agg_sku
0       bats    500    50            10.0000                   0.253837
1      balls    700  5000             0.1400                   0.253837
2    paddles     90    32             2.8125                   0.253837
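To illustrate why the two quantities are worth keeping apart, here is a small sketch (assuming Python 3 division and the same testdf as above) showing that the aggregate ratio is not the mean of the per-category ratios. The variable names `per_row`, `aggregate`, and `mean_of_ratios` are illustrative only.

```python
import pandas as pd

testdf = pd.DataFrame({'categories': ['bats', 'balls', 'paddles'],
                       'skus': [50, 5000, 32],
                       'sales': [500, 700, 90]})

# Per-category ratio: one value per row.
per_row = testdf['sales'] / testdf['skus']

# Aggregate ratio: total sales over total SKUs, a single scalar
# (approximately 0.2538 for this data).
aggregate = testdf['sales'].sum() / testdf['skus'].sum()

# Mean of the per-category ratios (approximately 4.3175 for this
# data), which is a different quantity from the aggregate ratio.
mean_of_ratios = per_row.mean()
```

The large 'balls' category dominates the aggregate ratio, while the unweighted mean treats all three categories equally, which is why the two numbers differ so much.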