原文地址: http://stackoverflow.com/questions/13460889/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
How to redirect all methods of a contained class in Python?
Asked by Yariv
How do I implement the composition pattern? I have a class Container which has an attribute object Contained. I would like to redirect/allow access to all methods of the Contained class from Container by simply calling my_container.some_contained_method(). Am I doing the right thing in the right way?
I use something like:
class Container:
    def __init__(self):
        self.contained = Contained()
    def __getattr__(self, item):
        if item in self.__dict__:  # some overridden
            return self.__dict__[item]
        else:
            return self.contained.__getattr__(item)  # redirection
Background:
I am trying to build a class (Indicator) that adds to the functionality of an existing class (pandas.DataFrame). Indicator will have all the methods of DataFrame. I could use inheritance, but I am following the "favor composition over inheritance" advice (see, e.g., the answers in: python: inheriting or composition). One reason not to inherit is that the base class is not serializable and I need to serialize.
I have found this, but I am not sure if it fits my needs.
Answered by unutbu
Caveats:
- DataFrames have a lot of attributes. If a DataFrame attribute is a number, you probably just want to return that number. But if the DataFrame attribute is a DataFrame, you probably want to return a Container. What should we do if the DataFrame attribute is a Series or a descriptor? To implement Container.__getattr__ properly, you really have to write unit tests for each and every attribute.
- Unit testing is also needed for __getitem__.
- You'll also have to define and unit test __setattr__ and __setitem__, __iter__, __len__, etc.
- Pickling is a form of serialization, so if DataFrames are picklable, I'm not sure how Containers really help with serialization.
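The need to define __iter__, __len__ and similar methods explicitly follows from how Python finds special methods: implicit calls such as len(obj) look on the type and never reach __getattr__. A minimal sketch of that behaviour, using a hypothetical stand-in class instead of a real DataFrame (Contained, AttrOnly and Explicit are illustrative names, not part of the answer):

```python
class Contained:
    """Hypothetical stand-in for the wrapped object (e.g. a DataFrame)."""
    def __init__(self, rows):
        self.rows = list(rows)
    def __len__(self):
        return len(self.rows)
    def __iter__(self):
        return iter(self.rows)
    def total(self):
        return sum(self.rows)


class AttrOnly:
    """Delegates ordinary attribute access only."""
    def __init__(self, contained):
        self.contained = contained
    def __getattr__(self, item):
        return getattr(self.contained, item)


class Explicit(AttrOnly):
    """Adds the special methods that __getattr__ cannot supply."""
    def __len__(self):
        return len(self.contained)
    def __iter__(self):
        return iter(self.contained)


plain = AttrOnly(Contained([1, 2, 3]))
full = Explicit(Contained([1, 2, 3]))

# An ordinary method name falls through to __getattr__ and is delegated.
delegated_total = plain.total()

# len() looks up __len__ on type(plain), so __getattr__ is never consulted.
try:
    len(plain)
    len_failed = False
except TypeError:
    len_failed = True

# With the special methods defined on the class, both calls work.
full_len = len(full)
full_items = list(full)
```

Each special method added this way is one more piece of behaviour that needs its own test, which is the point of the caveat.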
Some comments:
- __getattr__ is only called if the attribute is not in self.__dict__. So you do not need if item in self.__dict__ in your __getattr__.
- self.contained.__getattr__(item) calls self.contained's __getattr__ method directly. That is usually not what you want to do, because it circumvents the whole Python attribute lookup mechanism. For example, it ignores the possibility that the attribute could be in self.contained.__dict__, or in the __dict__ of one of the bases of self.contained.__class__, or that item refers to a descriptor. Instead use getattr(self.contained, item).
import pandas
import numpy as np

def tocontainer(func):
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        return Container(result)
    return wrapper

class Container(object):
    def __init__(self, df):
        self.contained = df
    def __getitem__(self, item):
        result = self.contained[item]
        if isinstance(result, type(self.contained)):
            result = Container(result)
        return result
    def __getattr__(self, item):
        result = getattr(self.contained, item)
        if callable(result):
            result = tocontainer(result)
        return result
    def __repr__(self):
        return repr(self.contained)
Here is some random code to test whether, at least superficially, Container delegates to DataFrames properly and returns Containers:
df = pandas.DataFrame(
    [(1, 2), (1, 3), (1, 4), (2, 1), (2, 2)], columns=['col1', 'col2'])
df = Container(df)
df['col1'][3] = 0
print(df)
#    col1  col2
# 0     1     2
# 1     1     3
# 2     1     4
# 3     2     1
# 4     2     2
gp = df.groupby('col1').aggregate(np.count_nonzero)
print(gp)
#       col2
# col1
# 1        3
# 2        2
print(type(gp))
# <class '__main__.Container'>
print(type(gp[gp.col2 > 2]))
# <class '__main__.Container'>
tf = gp[gp.col2 > 2].reset_index()
print(type(tf))
# <class '__main__.Container'>
result = df[df.col1 == tf.col1]
print(type(result))
# <class '__main__.Container'>
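Note that tocontainer as written wraps the result of every method call in a Container, including plain numbers. One possible variant, sketched here under the assumption that only results of the contained type should be re-wrapped (the first caveat above), applies the same isinstance check that __getitem__ uses. Table is a hypothetical stand-in for pandas.DataFrame and _rewrap is an illustrative helper name; neither comes from the answer:

```python
import functools


class Table:
    """Hypothetical stand-in for pandas.DataFrame."""
    def __init__(self, rows):
        self.rows = list(rows)
    def head(self, n):
        return Table(self.rows[:n])   # returns the same type
    def count(self):
        return len(self.rows)         # returns a plain number


class Container:
    def __init__(self, contained):
        self.contained = contained

    def _rewrap(self, result):
        # Re-wrap only results of the contained type; pass the rest through.
        if isinstance(result, type(self.contained)):
            return Container(result)
        return result

    def __getattr__(self, item):
        result = getattr(self.contained, item)
        if callable(result):
            @functools.wraps(result)  # keep the method's name and docstring
            def wrapper(*args, **kwargs):
                return self._rewrap(result(*args, **kwargs))
            return wrapper
        return self._rewrap(result)


c = Container(Table([1, 2, 3, 4]))
top = c.head(2)     # Table result, so it comes back as a Container
n = c.count()       # plain int, returned unchanged
```

This does not settle what to do with a Series or a descriptor; those cases still need a decision and a test each.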
Answered by Waylon Walker
I found unutbu's answer very useful for my own application, but I ran into issues displaying it properly in a Jupyter notebook. I found that adding the following methods to the class solved the issue.
def _repr_html_(self):
    return self.contained._repr_html_()
def _repr_latex_(self):
    return self.contained._repr_latex_()

