Python: a better way to add a constant column to a pandas data frame

Note: this page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original: StackOverflow, http://stackoverflow.com/questions/29337058/

Time: 2020-08-19 04:26:13  Source: igfitidea

Better way to add constant column to pandas data frame

python, pandas

Asked by Haleemur Ali

Currently when I have to add a constant column to an existing data frame, I do the following. To me it seems not all that elegant (the part where I multiply by length of dataframe). Wondering if there are better ways of doing this.

import pandas as pd

testdf = pd.DataFrame({'categories': ['bats', 'balls', 'paddles'],
                       'skus': [50, 5000, 32],
                       'sales': [500, 700, 90]})

testdf['avg_sales_per_sku'] = [testdf.sales.sum() / testdf.skus.sum()] * len(testdf)

Accepted answer by Geeklhem

You can fill the column implicitly by giving only one number.

testdf['avg_sales_per_sku'] = testdf.sales.sum() / testdf.skus.sum() 

From the documentation:

When inserting a scalar value, it will naturally be propagated to fill the column
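
As a minimal sketch of that behaviour (reusing the question's `testdf`; the column name `avg_via_assign` is made up purely for illustration), a scalar assigned to a column is broadcast to every row, and `DataFrame.assign` does the same while returning a new frame instead of modifying the original:

```python
import pandas as pd

testdf = pd.DataFrame({'categories': ['bats', 'balls', 'paddles'],
                       'skus': [50, 5000, 32],
                       'sales': [500, 700, 90]})

# One scalar: total sales divided by total SKUs.
ratio = testdf.sales.sum() / testdf.skus.sum()

# Assigning the scalar broadcasts it to every row of the new column.
testdf['avg_sales_per_sku'] = ratio

# assign() broadcasts the same way but returns a new DataFrame
# and leaves testdf untouched.
# 'avg_via_assign' is a hypothetical column name used only in this sketch.
copied = testdf.assign(avg_via_assign=ratio)

# Every row holds the same value in both columns.
print((testdf['avg_sales_per_sku'] == ratio).all())
print((copied['avg_via_assign'] == ratio).all())
```

Whether to assign in place or use `assign` is mostly a matter of taste; `assign` can be convenient inside a method chain.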

Answer by Alexander

It seems confusing to me to mix the categorical average with the aggregate average. You could also use:

testdf['avg_sales_per_sku'] = testdf.sales / testdf.skus
testdf['avg_agg_sales_per_agg_sku'] = testdf.sales.sum() / float(testdf.skus.sum())  # float is for Python2

>>> testdf
  categories  sales  skus  avg_sales_per_sku  avg_agg_sales_per_agg_sku
0       bats    500    50            10.0000                   0.253837
1      balls    700  5000             0.1400                   0.253837
2    paddles     90    32             2.8125                   0.253837
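
In Python 3 the `/` operator is always true division, so the `float(...)` call above is only needed on Python 2. A small sketch under that assumption (Python 3, same `testdf` as in the question) computes both columns and checks that the aggregate column is constant while the per-category column is not:

```python
import pandas as pd

testdf = pd.DataFrame({'categories': ['bats', 'balls', 'paddles'],
                       'skus': [50, 5000, 32],
                       'sales': [500, 700, 90]})

# Per-category ratio: element-wise division, one value per row.
testdf['avg_sales_per_sku'] = testdf.sales / testdf.skus

# Aggregate ratio: a single scalar, broadcast to every row.
# On Python 3, `/` is true division, so no float() cast is needed.
testdf['avg_agg_sales_per_agg_sku'] = testdf.sales.sum() / testdf.skus.sum()

# Count distinct values in each column: the aggregate column is constant,
# the per-category column varies by row.
print(testdf['avg_agg_sales_per_agg_sku'].nunique())
print(testdf['avg_sales_per_sku'].nunique())
```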