
Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me), citing the original source: StackOverflow, http://stackoverflow.com/questions/36029654/


Pandas map column in place

Tags: python, numpy, pandas

Asked by ars

I've spent some time googling and couldn't find an answer to this simple question: how can I map a column of a Pandas DataFrame in place? Say I have the following df:

In [67]: frame = pd.DataFrame(np.random.randn(4, 3), columns=list('bde'), index=['Utah', 'Ohio', 'Texas', 'Oregon'])

In [68]: frame
Out[68]: 
               b         d         e
Utah   -1.240032  1.586191 -1.272617
Ohio   -0.161516 -2.169133  0.223268
Texas  -1.921675  0.246167 -0.744242
Oregon  0.371843  2.346133  2.083234

I want to add 1 to each value of column b. I know that I can do it like this:

In [69]: frame['b'] = frame['b'].map(lambda x: x + 1)

Or like this. As far as I know, there is no difference between map and apply in the context of a Series (except that map can also accept a dict or a Series); correct me if I'm wrong:

In [71]: frame['b'] = frame['b'].apply(lambda x: x + 1)
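To illustrate the comparison above, here is a minimal sketch (the Series `s` and `codes` are made-up data, not from the question). It checks that map and apply agree when given a plain function, and shows the dict form that only map accepts:

```python
import pandas as pd

# Made-up sample data, for illustration only.
s = pd.Series([1.0, 2.0, 3.0], index=["Utah", "Ohio", "Texas"])

# With a plain function, map and apply both work element-wise.
via_map = s.map(lambda x: x + 1)
via_apply = s.apply(lambda x: x + 1)
print(via_map.equals(via_apply))

# map also accepts a dict (or a Series) used as a lookup table;
# values without a matching key become NaN.
codes = pd.Series(["a", "b", "c"])
lookup = codes.map({"a": 1, "b": 2})
print(lookup)
```

Neither call modifies `s`; both return a new Series, which is why the result has to be assigned back to the column.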

But I don't like specifying 'b' twice. Instead, I would like to do something like this:

frame['b'].map(lambda x: x + 1, inplace=True)

Is this possible?

Accepted answer by Chris

frame
Out[6]: 
               b         d         e
Utah   -0.764764  0.663018 -1.806592
Ohio    0.082226 -0.164653 -0.744252
Texas   0.763119  1.492637 -1.434447
Oregon -0.485245 -0.806335 -0.008397

frame['b'] += 1

frame
Out[8]: 
               b         d         e    
Utah    0.235236  0.663018 -1.806592
Ohio    1.082226 -0.164653 -0.744252
Texas   1.763119  1.492637 -1.434447
Oregon  0.514755 -0.806335 -0.008397
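As a quick check of the approach above, the following sketch uses deterministic made-up values instead of random ones (an assumption made only so the result is easy to verify). It compares the augmented assignment with the map version from the question:

```python
import numpy as np
import pandas as pd

# Deterministic stand-in for the random frame in the question.
frame = pd.DataFrame(
    np.arange(12, dtype=float).reshape(4, 3),
    columns=list("bde"),
    index=["Utah", "Ohio", "Texas", "Oregon"],
)

# Element-wise version from the question, kept for comparison.
expected = frame["b"].map(lambda x: x + 1)

# Vectorized version: the column name appears only once.
frame["b"] += 1

same = frame["b"].equals(expected)
print(same)
```

The same pattern works with the other augmented operators such as `-=`, `*=` and `/=`, but only for operations that can be expressed as arithmetic on the whole column.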

Edit to add:

If this is an arbitrary function and you really need to apply it in place, you can write a thin wrapper around pandas to handle it. Personally, I can't imagine a case where it would be so critical that you couldn't use the standard implementation (unless perhaps you write a tonne of code and can't be bothered to type the extra characters?).

from pandas import DataFrame
import numpy as np

class MyWrapper(DataFrame):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    def myapply(self, label, func):
        # Apply func element-wise and write the result back to the same column.
        self[label] = super().__getitem__(label).apply(func)


df = MyWrapper(np.random.randn(4, 3), columns=list('bde'),
               index=['Utah', 'Ohio', 'Texas', 'Oregon'])
print(df)
df.myapply('b', lambda x: x + 1)
print(df)

Gives:

               b         d         e
Utah   -0.260549 -0.981025  1.136154
Ohio    0.073732 -0.895937 -0.025134
Texas   0.555507 -1.173679  0.946342
Oregon  1.871728 -0.850992  1.135784
               b         d         e
Utah    0.739451 -0.981025  1.136154
Ohio    1.073732 -0.895937 -0.025134
Texas   1.555507 -1.173679  0.946342
Oregon  2.871728 -0.850992  1.135784

Obviously this is a very minimal example, but hopefully it exposes a few methods of interest to you.
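If subclassing DataFrame feels like too much, a plain function may be enough. The sketch below is an alternative to the wrapper class, not part of the original answer; the name `apply_to_column` is hypothetical, and the frame uses deterministic made-up values so the result is easy to check:

```python
import numpy as np
import pandas as pd


def apply_to_column(df, label, func):
    """Hypothetical helper: overwrite df[label] with func applied element-wise."""
    df[label] = df[label].apply(func)
    return df  # returned only so that calls can be chained


# Deterministic stand-in for the random frame used in the thread.
frame = pd.DataFrame(
    np.arange(12, dtype=float).reshape(4, 3),
    columns=list("bde"),
    index=["Utah", "Ohio", "Texas", "Oregon"],
)

apply_to_column(frame, "b", lambda x: x + 1)
print(frame)
```

This names the column once per call, works on any ordinary DataFrame, and avoids the extra care that DataFrame subclasses usually need.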

Answer by Moondra

You can use add:

In [2]: import pandas as pd

In [3]: import numpy as np

In [4]: frame = pd.DataFrame(np.random.randn(4, 3), columns=list('bde'), index=
   ...: ['Utah', 'Ohio', 'Texas', 'Oregon'])

In [5]: frame.head()
Out[5]:
               b         d         e
Utah   -1.165332 -0.999244 -0.541742
Ohio   -0.319887  0.199094 -0.438669
Texas  -1.242524 -0.385092 -0.389616
Oregon  0.331593  0.505496  1.688962

In [6]: frame.b.add(1)
Out[6]:
Utah     -0.165332
Ohio      0.680113
Texas    -0.242524
Oregon    1.331593
Name: b, dtype: float64

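Note that add returns a new Series rather than changing the frame, as the Out[6] display above suggests. A minimal sketch with made-up values, showing that the result still has to be assigned back to the column:

```python
import pandas as pd

# Made-up sample data, for illustration only.
frame = pd.DataFrame({"b": [1.0, 2.0], "d": [3.0, 4.0]}, index=["Utah", "Ohio"])

result = frame["b"].add(1)             # a new Series; frame is untouched
before_assignment = frame["b"].tolist()

frame["b"] = frame["b"].add(1)         # assign back to update the column
after_assignment = frame["b"].tolist()

print(before_assignment, after_assignment)
```

So add is equivalent to the `+` operator here, and the column name still appears twice when the frame itself needs to be updated.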