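Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. If you use it, you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original question: http://stackoverflow.com/questions/48366506/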
Calculate new column as the mean of other columns pandas
Asked by Carmen Pérez Carrillo
I have the data frame below, and I would like to calculate a new column as the mean of salary_1, salary_2 and salary_3:
import pandas as pd

df = pd.DataFrame({'salary_1':[230,345,222],'salary_2':[235,375,292],'salary_3':[210,385,260]})
   salary_1  salary_2  salary_3
0       230       235       210
1       345       375       385
2       222       292       260
How can I do it in pandas in the most efficient way? Actually I have many more columns and I don't want to write this one by one.
Something like this:
   salary_1  salary_2  salary_3  salary_mean
0       230       235       210  (230+235+210)/3
1       345       375       385  ...
2       222       292       260  ...
Thank you!
Accepted answer by Mr. Stark
An easy way to solve this problem is shown below:
col = df.loc[: , "salary_1":"salary_3"]
Here "salary_1" is the first column of the slice and "salary_3" is the last. Label-based slicing with .loc includes both endpoints, so every column between them is selected as well.
df['salary_mean'] = col.mean(axis=1)
df
This adds a new column to df that holds the row-wise mean of the selected columns. The approach is helpful when you have a large number of columns, and also when you need the mean of only some selected columns rather than all of them.
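A label slice only works when the columns of interest sit next to each other. As a minimal sketch of one alternative, assuming every column to average is named salary_ followed by digits, the columns can be picked by name pattern with DataFrame.filter. The 'name' column here is hypothetical and only added to show a non-salary column in between:

```python
import pandas as pd

df = pd.DataFrame({
    'salary_1': [230, 345, 222],
    'name': ['a', 'b', 'c'],  # hypothetical non-salary column, for illustration
    'salary_2': [235, 375, 292],
    'salary_3': [210, 385, 260],
})

# Keep only the columns whose label matches "salary_<digits>"
salary_cols = df.filter(regex=r'^salary_\d+$')

# Row-wise mean over the matched columns only
df['salary_mean'] = salary_cols.mean(axis=1)
print(df)
```

Because the pattern is anchored to the salary_ prefix plus digits, the new salary_mean column would not be picked up if the selection were run again.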
Answer by Alex
Use .mean(). By specifying the axis you can take the mean across each row (axis=1) or down each column (axis=0).
df['average'] = df.mean(axis=1)
df
returns
   salary_1  salary_2  salary_3     average
0       230       235       210  225.000000
1       345       375       385  368.333333
2       222       292       260  258.000000
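Calling mean(axis=1) on the whole frame averages every column, which works here because all three columns are numeric. As a hedged sketch of what could be done if the frame also held text columns, the numeric_only=True argument restricts the calculation to numeric columns. The 'employee' column below is an assumption added for illustration and is not in the original data:

```python
import pandas as pd

df = pd.DataFrame({
    'employee': ['x', 'y', 'z'],  # hypothetical text column, not in the question
    'salary_1': [230, 345, 222],
    'salary_2': [235, 375, 292],
    'salary_3': [210, 385, 260],
})

# numeric_only=True skips the text column instead of failing on it
df['average'] = df.mean(axis=1, numeric_only=True)
print(df)
```

Note that this averages all numeric columns, so it only fits when every numeric column is a salary.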
If you only want the mean of a few columns, you can select just those columns, e.g.
df['average_1_3'] = df[['salary_1', 'salary_3']].mean(axis=1)
df
returns
   salary_1  salary_2  salary_3  average_1_3
0       230       235       210        220.0
1       345       375       385        365.0
2       222       292       260        241.0
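One thing to watch when adding several mean columns in sequence: once a column such as average exists, a later df.mean(axis=1) over the whole frame would include it. A minimal sketch of one way to avoid that, assuming the list of salary columns is fixed up front (the name salary_cols is illustrative), is to compute every mean from explicit column lists in a single assign call:

```python
import pandas as pd

df = pd.DataFrame({
    'salary_1': [230, 345, 222],
    'salary_2': [235, 375, 292],
    'salary_3': [210, 385, 260],
})

# Fix the source columns before any derived column is added
salary_cols = ['salary_1', 'salary_2', 'salary_3']

# assign returns a new DataFrame; both means read only the original columns
df = df.assign(
    average=df[salary_cols].mean(axis=1),
    average_1_3=df[['salary_1', 'salary_3']].mean(axis=1),
)
print(df)
```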