pandas Python: apply a class method to the rows of a data frame

Notice: this content comes from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/34630340/

Python apply class method to row of data frame

Tags: python, pandas, vectorization

Asked by Conan

My class takes a row of a data frame to construct an object, and I would like to create an array of objects by applying __init__ to every row of the data frame. Is there a way to vectorize this? My class definition looks like this:

class A(object):
    def __init__(self,row):
        self.a = row['a']
        self.b = row['b']

Any suggestion will be highly appreciated!

I have one workaround, but I am not really satisfied with it: define another function outside of the class and then use apply.

def InitA(row):
    return A(row)

Assume df is the data frame whose rows I want to pass as the argument. Then

xxx = df.apply(InitA,axis=1)

gives what I want. However, I don't think InitA should be necessary.

My original problem is a bit more complicated. The class definition is

class A(object):
    def __init__(self):
        return
    def add_parameter(self,row):
        self.a = row['a']

I intend to apply add_parameter to every row of a data frame, but I think that defining another (lambda) function is necessary to do this.

Accepted answer by McRip

Just use a lambda function?

xxx = df.apply(lambda x: A(x),axis=1)

Edit: another solution is to pass the class directly; apply then calls the constructor on each row:

xxx = df.apply(A,axis=1)

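
To see what that call returns, here is a small self-contained check. It is only a sketch: it reuses the two-field class from the question, and the data frame values are made up for illustration. Note that apply with axis=1 still loops over the rows in Python, so this is a shorter spelling rather than true vectorization.

```python
import pandas as pd


class A(object):
    def __init__(self, row):
        # row is one row of the data frame, passed in as a pandas Series
        self.a = row['a']
        self.b = row['b']


# Illustrative data, not from the original question.
df = pd.DataFrame({'a': [1, 2, 3], 'b': [10, 20, 30]})

# Passing the class itself: apply calls A(row) once per row.
xxx = df.apply(A, axis=1)

# The result is a Series of A instances that shares the index of df.
pairs = [(obj.a, obj.b) for obj in xxx]
```

Each element of xxx is an A instance, so attributes such as xxx[0].a are available afterwards.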

This works:

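import pandas as pd

class C(object):
    def __init__(self, dat):
        # dat is one row of the data frame, passed in as a pandas Series
        return

# named df here so the data frame does not shadow the class A from the question
df = pd.DataFrame({'a': pd.Series([1, 2, 3])})
# axis=1 calls the lambda once per row
df.apply(lambda x: C(x), axis=1)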
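
For the second case in the question, where add_parameter has to be called on an already constructed object, a plain lambda is awkward because it needs two steps. One possible way around this is a classmethod that does both. This is a sketch under assumptions: from_row is a hypothetical name that is not part of the original class, and the data frame is made up.

```python
import pandas as pd


class A(object):
    def __init__(self):
        self.a = None

    def add_parameter(self, row):
        self.a = row['a']

    @classmethod
    def from_row(cls, row):
        # Hypothetical helper: build an empty object, then fill it from the row.
        obj = cls()
        obj.add_parameter(row)
        return obj


# Illustrative data, not from the original question.
df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})

# axis=1 passes each row to from_row as a pandas Series.
objs = df.apply(A.from_row, axis=1)
```

This keeps the construction logic inside the class, so no separate module-level function like InitA is needed.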