Note: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/36029654/
Pandas map column in place
Asked by ars
I've spent some time googling and didn't find an answer to this simple question: how can I map a column of a Pandas dataframe in place? Say I have the following df:
In [67]: frame = pd.DataFrame(np.random.randn(4, 3), columns=list('bde'), index=['Utah', 'Ohio', 'Texas', 'Oregon'])
In [68]: frame
Out[68]:
               b         d         e
Utah   -1.240032  1.586191 -1.272617
Ohio   -0.161516 -2.169133  0.223268
Texas  -1.921675  0.246167 -0.744242
Oregon  0.371843  2.346133  2.083234
And I want to add 1 to each value of column b. I know that I can do it like this:
In [69]: frame['b'] = frame['b'].map(lambda x: x + 1)
Or like this (AFAIK there is no difference between map and apply in the context of a Series, except that map can also accept a dict or a Series; correct me if I'm wrong):
In [71]: frame['b'] = frame['b'].apply(lambda x: x + 1)
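For what it's worth, the one difference mentioned above (map accepting a dict) can be sketched roughly like this. The Series and the lookup dict are made-up illustrations, not part of the question:

```python
import pandas as pd

s = pd.Series([1, 2, 3])  # illustrative data, not from the question

# map accepts a dict: each value is looked up as a key,
# and values without a matching key become NaN.
lookup = {1: "one", 2: "two"}  # hypothetical lookup table
mapped = s.map(lookup)

# With a plain function, map and apply produce the same values.
via_map = s.map(lambda x: x + 1)
via_apply = s.apply(lambda x: x + 1)
```

Here the last element of mapped is NaN because 3 is not a key in the dict, while via_map and via_apply hold the same numbers.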
But I don't like specifying 'b' twice. Instead, I would like to do something like this:
frame['b'].map(lambda x: x + 1, inplace=True)
Is it possible?
Accepted answer by Chris
frame
Out[6]:
               b         d         e
Utah   -0.764764  0.663018 -1.806592
Ohio    0.082226 -0.164653 -0.744252
Texas   0.763119  1.492637 -1.434447
Oregon -0.485245 -0.806335 -0.008397
frame['b'] += 1
frame
Out[8]:
               b         d         e
Utah    0.235236  0.663018 -1.806592
Ohio    1.082226 -0.164653 -0.744252
Texas   1.763119  1.492637 -1.434447
Oregon  0.514755 -0.806335 -0.008397
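As a quick sanity check, here is a minimal sketch showing that the augmented assignment gives the same values as the map call from the question. The seeded random generator is my own assumption, added only so the frame is reproducible:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seed chosen arbitrarily for reproducibility
frame = pd.DataFrame(rng.standard_normal((4, 3)), columns=list("bde"),
                     index=["Utah", "Ohio", "Texas", "Oregon"])

# Element-wise version from the question, computed before the update.
expected = frame["b"].map(lambda x: x + 1)

# Vectorized update; the column label appears only once.
frame["b"] += 1

same = np.allclose(frame["b"], expected)
```

This only covers arithmetic that pandas can vectorize; an arbitrary Python function still needs map or apply plus an assignment.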
Edit to add:
If this is an arbitrary function and you really need to apply it in place, you can write a thin wrapper around pandas to handle it. Personally, I can't imagine a case where it would be so critical that you couldn't use the standard implementation (unless perhaps you write a tonne of code and can't be bothered to write the extra characters?).
from pandas import DataFrame
import numpy as np

class MyWrapper(DataFrame):
    """Thin DataFrame subclass that adds an apply-and-assign helper."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    def myapply(self, label, func):
        # Apply func to every value in the column and write the result back.
        self[label] = super().__getitem__(label).apply(func)

df = frame = MyWrapper(np.random.randn(4, 3), columns=list('bde'), index=['Utah', 'Ohio', 'Texas', 'Oregon'])
print(df)
df.myapply('b', lambda x: x + 1)
print(df)
Gives:
               b         d         e
Utah   -0.260549 -0.981025  1.136154
Ohio    0.073732 -0.895937 -0.025134
Texas   0.555507 -1.173679  0.946342
Oregon  1.871728 -0.850992  1.135784
               b         d         e
Utah    0.739451 -0.981025  1.136154
Ohio    1.073732 -0.895937 -0.025134
Texas   1.555507 -1.173679  0.946342
Oregon  2.871728 -0.850992  1.135784
Obviously this is a very minimal example, but hopefully it exposes a few methods of interest to you.
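If subclassing DataFrame feels heavier than needed, the same idea can probably be sketched as a plain function. The name apply_to_column is hypothetical (it is not a pandas API), and the small frame is made up for illustration:

```python
import pandas as pd

def apply_to_column(df, label, func):
    """Hypothetical helper: overwrite df[label] with func applied to each value."""
    df[label] = df[label].apply(func)
    return df

# Illustrative data, not from the answer above.
df = pd.DataFrame({"b": [1.0, 2.0], "d": [3.0, 4.0]}, index=["Utah", "Ohio"])
apply_to_column(df, "b", lambda x: x + 1)
```

The column label is written once at the call site, and the frame passed in is modified, since the assignment happens on the same object.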
Answer by Moondra
You can use add:
In [2]: import pandas as pd
In [3]: import numpy as np
In [4]: frame = pd.DataFrame(np.random.randn(4, 3), columns=list('bde'), index=
   ...: ['Utah', 'Ohio', 'Texas', 'Oregon'])
In [5]: frame.head()
Out[5]:
               b         d         e
Utah   -1.165332 -0.999244 -0.541742
Ohio   -0.319887  0.199094 -0.438669
Texas  -1.242524 -0.385092 -0.389616
Oregon  0.331593  0.505496  1.688962
In [6]: frame.b.add(1)
Out[6]:
Utah     -0.165332
Ohio      0.680113
Texas    -0.242524
Oregon    1.331593
Name: b, dtype: float64
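Note that add returns a new Series and leaves the frame untouched, so the result still has to be assigned back. A minimal sketch, using a small made-up frame rather than the random one above:

```python
import pandas as pd

# Illustrative data, not from the answer above.
frame = pd.DataFrame({"b": [1.0, 2.0], "d": [3.0, 4.0]}, index=["Utah", "Ohio"])

result = frame.b.add(1)          # new Series; frame itself is unchanged
before = frame["b"].tolist()     # still the original values

frame["b"] = frame["b"].add(1)   # assign back to actually update the column
after = frame["b"].tolist()
```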