Source: Stack Overflow question, http://stackoverflow.com/questions/25069733/. The content is licensed under CC BY-SA 4.0; any reuse must carry the same license and credit the original authors.

Return output of function that takes pandas dataframe as a parameter

Tags: python, pandas

Asked by DataSwede

I have a pandas dataframe that looks like:

import pandas as pd

d = {'some_col': ['A', 'B', 'C', 'D', 'E'],
     'alert_status': [1, 2, 0, 0, 5]}
df = pd.DataFrame(d)

Quite a few tasks at my job require the same operations in pandas. I've started writing standardized functions that take a dataframe as a parameter and return something. Here's a simple one:

def alert_read_text(df, alert_status=None):
    if alert_status is None:
        print('Warning: A column name with the alerts must be specified')
    alert_read_criteria = df[alert_status] >= 1
    # Writes to the caller's dataframe: every status >= 1 becomes 1
    df.loc[alert_read_criteria, alert_status] = 1
    alert_status_dict = {0: 'Not Read',
                         1: 'Read'}
    # Also overwrites the original column in the caller's dataframe
    df[alert_status] = df[alert_status].map(alert_status_dict)
    return df[alert_status]

I'm looking to have the function return a series. This way, one could add a column to an existing data frame:

df['alert_status_text'] = alert_read_text(df, alert_status='alert_status')

Currently, the function correctly returns a series, but it also modifies the existing column. How can I keep the original column that was passed in from being modified?

Accepted answer by EdChum

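As you've discovered, the passed-in dataframe is modified because Python passes arguments as references to objects: the function receives the very same dataframe object the caller holds, not a copy. This is how Python works in general and has nothing to do with pandas as such.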
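
To illustrate the point with plain Python (no pandas involved), here is a minimal sketch. The function names `mutate` and `rebind` are made up for this example. A function that changes its argument in place affects the caller's object, while one that builds a new object leaves it alone.

```python
def mutate(items):
    # In-place change: `items` is the same list object the caller holds,
    # so the caller sees the appended element.
    items.append(99)


def rebind(items):
    # `items + [99]` builds a new list; rebinding the local name
    # does not touch the caller's list.
    items = items + [99]
    return items


original = [1, 2, 3]
mutate(original)
print(original)   # [1, 2, 3, 99]

untouched = [1, 2, 3]
result = rebind(untouched)
print(untouched)  # [1, 2, 3]
print(result)     # [1, 2, 3, 99]
```

Assigning into a dataframe inside a function is the `mutate` case: the write lands in the caller's object.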

So if you don't want to modify the passed df then take a copy:

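def alert_read_text(df, alert_status=None):
    if alert_status is None:
        # Fail early: continuing without a column name would raise a KeyError below
        raise ValueError('A column name with the alerts must be specified')
    # Work on a copy so the caller's dataframe is left untouched
    copy = df.copy()
    alert_read_criteria = copy[alert_status] >= 1
    # Collapse every status >= 1 to 1; a single .loc call avoids chained assignment
    copy.loc[alert_read_criteria, alert_status] = 1
    alert_status_dict = {0: 'Not Read',
                         1: 'Read'}
    copy[alert_status] = copy[alert_status].map(alert_status_dict)
    return copy[alert_status]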
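
As a quick check, the following sketch uses the sample data from the question with a condensed version of the copy-based function. The local name `work` is an assumption made for this example. It shows the new text column being added while the original `alert_status` column keeps its numeric values.

```python
import pandas as pd


def alert_read_text(df, alert_status):
    # Independent copy, so writes below never reach the caller's dataframe
    work = df.copy()
    # Collapse every status >= 1 to 1, then map the codes to labels
    work.loc[work[alert_status] >= 1, alert_status] = 1
    return work[alert_status].map({0: 'Not Read', 1: 'Read'})


df = pd.DataFrame({'some_col': ['A', 'B', 'C', 'D', 'E'],
                   'alert_status': [1, 2, 0, 0, 5]})
df['alert_status_text'] = alert_read_text(df, alert_status='alert_status')

print(df['alert_status'].tolist())       # [1, 2, 0, 0, 5]
print(df['alert_status_text'].tolist())  # ['Read', 'Read', 'Not Read', 'Not Read', 'Read']
```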

Also see related: pandas dataframe, copy by value
