pandas: How to apply a function to multiple columns in a pandas DataFrame at one time

Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow. Original URL: http://stackoverflow.com/questions/22086619/

Time: 2020-09-13 21:45:04  Source: igfitidea

how to apply a function to multiple columns in a pandas dataframe at one time

Tags: python, pandas, filtering, slice

Asked by yoshiserry

I frequently deal with data that is poorly formatted (for example, number fields that are not consistent).

There may be other ways that I am not aware of, but the way I format a single column in a DataFrame is by writing a function and mapping the column to that function.

format = df.column_name.map(format_number)
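
For reference, the single-column approach described above might look like the following minimal sketch. The body of format_number and the sample data are assumptions made for illustration; the question does not show them.

```python
import pandas as pd


def format_number(value):
    """Hypothetical formatter: drop thousands separators and convert to float."""
    return float(str(value).replace(",", "").strip())


# Made-up sample data with inconsistently formatted numbers
df = pd.DataFrame({"price": ["1,200", " 35", "7.5"], "name": ["a", "b", "c"]})

# Series.map calls format_number on each value and returns a new Series,
# which is assigned back to the same column here
df["price"] = df["price"].map(format_number)
```

Assigning to a name other than format also avoids shadowing Python's built-in format function.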

Question 1: What if I have a DataFrame with 50 columns and want to apply that formatting to multiple columns, e.g. columns 1, 3, 5, 7, 9?

Can you write something like:

format = df.1,3,5,9.map(format_number)

That way I could format all my number columns in one line?

Answered by BrenBarn

You can do df[['Col1', 'Col2', 'Col3']].applymap(format_number). Note, though, that this returns a new DataFrame containing the formatted columns; it won't modify the existing DataFrame. If you want to put the values back in the original, you'll have to do df[['Col1', 'Col2', 'Col3']] = df[['Col1', 'Col2', 'Col3']].applymap(format_number).
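
A runnable sketch of this approach follows. The format_number body and the sample data are assumptions made for illustration. Since pandas 2.1, DataFrame.applymap is deprecated in favor of DataFrame.map, so the sketch uses whichever is available.

```python
import pandas as pd


def format_number(value):
    """Hypothetical formatter: drop thousands separators and convert to float."""
    return float(str(value).replace(",", "").strip())


# Made-up sample data; "label" is a column that should be left alone
df = pd.DataFrame({
    "Col1": ["1,000", "2"],
    "Col2": ["3.5", " 4"],
    "Col3": ["5", "6,000"],
    "label": ["x", "y"],
})

cols = ["Col1", "Col2", "Col3"]
subset = df[cols]

# DataFrame.map is the newer name (pandas 2.1+); older versions only have applymap
elementwise = subset.map if hasattr(subset, "map") else subset.applymap

# The element-wise call returns a new DataFrame, so assign it back to the same columns
df[cols] = elementwise(format_number)
```

To select columns by position rather than by name, as the question asks, cols could instead be built from the column index, for example list(df.columns[[0, 2]]) for the first and third columns.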

Answered by EdChum

You could use apply like this:

df.apply(lambda row: format_number(row), axis=1)

You would need to specify the columns in your format_number function, though:

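def format_number(row):
    # doSomething is a placeholder for the per-value formatting logic
    row['Col1'] = doSomething(row['Col1'])
    row['Col2'] = doSomething(row['Col2'])
    row['Col3'] = doSomething(row['Col3'])
    return row  # apply builds its result from the returned row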

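This is not as elegant as @BrenBarn's answer, but it keeps all the per-column logic inside one function. Note that apply returns a new DataFrame, and pandas does not support relying on mutation of the row passed to the function, so assign the result back (df = df.apply(format_number, axis=1)) rather than expecting the original DataFrame to be modified in place.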
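
A minimal runnable sketch of the row-wise variant follows. Here do_something is a hypothetical stand-in for doSomething, and the sample data is made up. The function copies the row and returns it, and the result of apply is captured in a new variable.

```python
import pandas as pd


def do_something(value):
    """Hypothetical stand-in for doSomething: strip separators, convert to float."""
    return float(str(value).replace(",", "").strip())


def format_number(row):
    # Work on a copy so the function does not depend on mutating its input
    row = row.copy()
    for col in ("Col1", "Col2", "Col3"):
        row[col] = do_something(row[col])
    return row  # apply assembles the output DataFrame from the returned rows


df = pd.DataFrame({
    "Col1": ["1,000", "2"],
    "Col2": ["3.5", "4"],
    "Col3": ["5", "6,000"],
    "label": ["x", "y"],
})

# axis=1 passes each row to format_number as a Series
result = df.apply(format_number, axis=1)
```

Row-wise apply calls a Python function once per row, so on large frames the column-subset approach from the other answer is likely to be faster.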