How do I design a class in Python?
Note: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original address and author information, and attribute it to the original authors (not me): StackOverflow
Original: http://stackoverflow.com/questions/4203163/
Asked by Ivo Flipse
I've had some really awesome help on my previous questions for detecting paws and toes within a paw, but all these solutions only work for one measurement at a time.
Now I have data that consists of:
- about 30 dogs;
- each has 24 measurements (divided into several subgroups);
- each measurement has at least 4 contacts (one for each paw) and
- each contact is divided into 5 parts and
- has several parameters, like contact time, location, total force etc.


Obviously sticking everything into one big object isn't going to cut it, so I figured I needed to use classes instead of the current slew of functions. But even though I've read Learning Python's chapter about classes, I fail to apply it to my own code (GitHub link).
I also feel like it's rather strange to process all the data every time I want to get out some information. Once I know the locations of each paw, there's no reason for me to calculate this again. Furthermore, I want to compare all the paws of the same dog to determine which contact belongs to which paw (front/hind, left/right). This would become a mess if I continue using only functions.
So now I'm looking for advice on how to create classes that will let me process my data (link to the zipped data of one dog) in a sensible fashion.
Accepted answer by S.Lott
How to design a class.
Write down the words. You started to do this. Some people don't and wonder why they have problems.
Expand your set of words into simple statements about what these objects will be doing. That is to say, write down the various calculations you'll be doing on these things. Your short list of 30 dogs, 24 measurements, 4 contacts, and several "parameters" per contact is interesting, but only part of the story. Your "locations of each paw" and "compare all the paws of the same dog to determine which contact belongs to which paw" are the next step in object design.
Underline the nouns. Seriously. Some folks debate the value of this, but I find that for first-time OO developers it helps. Underline the nouns.
Review the nouns. Generic nouns like "parameter" and "measurement" need to be replaced with specific, concrete nouns that apply to your problem in your problem domain. Specifics help clarify the problem. Generics simply elide details.
For each noun ("contact", "paw", "dog", etc.) write down the attributes of that noun and the actions in which that object engages. Don't short-cut this. Every attribute. "Data Set contains 30 Dogs" for example is important.
For each attribute, identify if this is a relationship to a defined noun, or some other kind of "primitive" or "atomic" data like a string or a float or something irreducible.
For each action or operation, you have to identify which noun has the responsibility, and which nouns merely participate. It's a question of "mutability". Some objects get updated, others don't. Mutable objects must own total responsibility for their mutations.
At this point, you can start to transform nouns into class definitions. Some collective nouns are lists, dictionaries, tuples, sets or namedtuples, and you don't need to do very much work. Other classes are more complex, either because of complex derived data or because of some update/mutation which is performed.
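As a rough illustration of that last step, here is a minimal sketch that turns a few of the nouns from the question into Python definitions. Only the nouns (dog, measurement, contact) come from the question; the field names, units and sample values are assumptions invented for the example.

```python
from typing import List, NamedTuple, Tuple


class Contact(NamedTuple):
    """One paw contact; every attribute here is 'primitive' data."""
    contact_time: float            # hypothetical unit: seconds
    location: Tuple[float, float]  # hypothetical (x, y) position
    total_force: float             # hypothetical unit: newtons


class Measurement(NamedTuple):
    """One measurement: a relationship to several Contact objects."""
    contacts: List[Contact]


class Dog(NamedTuple):
    """One dog: a name plus a relationship to its measurements."""
    name: str
    measurements: List[Measurement]


# The collective noun "data set" needs no class of its own: a dict will do.
rex = Dog("Rex", [Measurement([Contact(0.25, (3.0, 4.0), 120.0)])])
data_set = {rex.name: rex}
```

Nothing here has behaviour yet. Methods would be added only once a calculation clearly belongs to one of these nouns.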
Don't forget to test each class in isolation using unittest.
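For instance, a test of a single class in isolation might look like the sketch below. The Contact class, its fields and the mean_force_rate calculation are hypothetical stand-ins, not taken from the question's code.

```python
import unittest
from typing import NamedTuple


class Contact(NamedTuple):
    # Hypothetical fields, for illustration only.
    contact_time: float
    total_force: float

    def mean_force_rate(self) -> float:
        """Total force divided by contact time (an invented derived value)."""
        return self.total_force / self.contact_time


class ContactTest(unittest.TestCase):
    def test_mean_force_rate(self):
        contact = Contact(contact_time=0.5, total_force=100.0)
        self.assertEqual(contact.mean_force_rate(), 200.0)

    def test_zero_contact_time_is_an_error(self):
        with self.assertRaises(ZeroDivisionError):
            Contact(contact_time=0.0, total_force=100.0).mean_force_rate()


def run_contact_tests() -> unittest.TestResult:
    """Run the tests programmatically and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ContactTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

The test case needs no files and no other classes, which is what makes it a test "in isolation".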
Also, there's no law that says classes must be mutable. In your case, for example, you have almost no mutable data. What you have is derived data, created by transformation functions from the source dataset.
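One way to express "immutable class plus transformation function" in Python is a frozen dataclass, sketched below. The Paw class, its fields and the force-weighted centroid are assumptions chosen for illustration; the real derived values would come from your own analysis.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Paw:
    """Immutable result derived from raw readings (hypothetical fields)."""
    center: Tuple[float, float]
    peak_force: float


def paw_from_readings(readings):
    """Transformation function: raw (x, y, force) readings -> Paw.

    The force-weighted centroid is only an example computation.
    """
    total = sum(f for _, _, f in readings)
    cx = sum(x * f for x, _, f in readings) / total
    cy = sum(y * f for _, y, f in readings) / total
    return Paw(center=(cx, cy), peak_force=max(f for _, _, f in readings))


paw = paw_from_readings([(0.0, 0.0, 1.0), (2.0, 0.0, 1.0), (1.0, 3.0, 2.0)])
```

Because the dataclass is frozen, assigning to paw.peak_force afterwards raises an error, so the derived data cannot drift away from its source.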
Answer by Spacedman
The whole idea of OO design is to make your code map to your problem, so when, for example, you want the first footstep of a dog, you do something like:
dog.footstep(0)
Now, it may be that for your case you need to read in your raw data file and compute the footstep locations. All this could be hidden in the footstep() function so that it only happens once. Something like:
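class Dog:
    def __init__(self):
        self._footsteps = None

    def footstep(self, n):
        # Read the footstep data only on the first call, then reuse it.
        # Comparing with None avoids re-reading when the data is an empty list.
        if self._footsteps is None:
            self.readInFootsteps(...)
        return self._footsteps[n]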
[This is now a sort of caching pattern. The first time, it goes and reads the footstep data; on subsequent calls it just gets it from self._footsteps.]
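A runnable variant of the same caching idea can be sketched with functools.cached_property (Python 3.8+). The loader argument and fake_loader are hypothetical stand-ins for whatever reads and processes the raw data file; they are not part of the answer's code.

```python
from functools import cached_property


class Dog:
    def __init__(self, loader):
        # `loader` is a hypothetical callable returning a list of footsteps.
        self._loader = loader

    @cached_property
    def _footsteps(self):
        # Runs on first access only; the result is then stored on the instance.
        return self._loader()

    def footstep(self, n):
        return self._footsteps[n]


calls = []


def fake_loader():
    """Stand-in for the expensive read; records each time it is called."""
    calls.append("read")
    return [(10, 12), (30, 15), (52, 11)]


dog = Dog(fake_loader)
first = dog.footstep(0)
second = dog.footstep(1)
```

After both calls, `calls` holds a single entry, so the expensive read happened once.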
But yes, getting OO design right can be tricky. Think more about the things you want to do to your data, and that will inform what methods you'll need to apply to what classes.
Answer by mitchelllc
The following advice (similar to @S.Lott's advice) is from the book Beginning Python: From Novice to Professional:
Write down a description of your problem (what should the program do?). Underline all the nouns, verbs, and adjectives.
Go through the nouns, looking for potential classes.
Go through the verbs, looking for potential methods.
Go through the adjectives, looking for potential attributes.
Allocate methods and attributes to your classes.
To refine the class, the book also advises we can do the following:
Write down (or dream up) a set of use cases: scenarios of how your program may be used. Try to cover all the functionality.
Think through every use case step by step, making sure that everything we need is covered.
Answer by Les Nightingill
I like the TDD approach... So start by writing tests for what you want the behaviour to be. And write code that passes. At this point, don't worry too much about design, just get a test suite and software that passes. Don't worry if you end up with a single big ugly class, with complex methods.
Sometimes, during this initial process, you'll find a behaviour that is hard to test and needs to be decomposed, just for testability. This may be a hint that a separate class is warranted.
Then the fun part... refactoring. After you have working software you can see the complex pieces. Often little pockets of behaviour will become apparent, suggesting a new class, but if not, just look for ways to simplify the code. Extract service objects and value objects. Simplify your methods.
If you're using git properly (you are using git, aren't you?), you can very quickly experiment with some particular decomposition during refactoring, and then abandon it and revert back if it doesn't simplify things.
By writing tested working code first you should gain an intimate insight into the problem domain that you couldn't easily get with the design-first approach. Writing tests and code push you past that "where do I begin" paralysis.
Answer by Evan Moran
Writing out your nouns, verbs, and adjectives is a great approach, but I prefer to think of class design as asking the question: what data should be hidden?
Imagine you had a Query object and a Database object:
The Query object will help you create and store a query. "Store" is the key here, as a function could help you create one just as easily. Maybe you could say: Query().select('Country').from_table('User').where('Country == "Brazil"'). The exact syntax doesn't matter (that is your job!); the key is that the object is helping you hide something, in this case the data necessary to store and output a query. The power of the object comes from the syntax of using it (in this case some clever chaining) and from not needing to know what it stores to make it work. If done right, the Query object could output queries for more than one database. Internally it would store a specific format, but it could easily convert to other formats when outputting (Postgres, MySQL, MongoDB).
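A minimal sketch of such a chained Query object could look like this. The method names select, from_table and where come from the example above; the internal attributes and the to_sql method are assumptions, and the condition strings are pasted in verbatim, so this is not safe for untrusted input.

```python
class Query:
    """Hypothetical query builder: stores the parts, hides the format."""

    def __init__(self):
        self._columns = []
        self._table = None
        self._conditions = []

    def select(self, *columns):
        self._columns.extend(columns)
        return self  # returning self is what makes chaining work

    def from_table(self, table):
        self._table = table
        return self

    def where(self, condition):
        self._conditions.append(condition)
        return self

    def to_sql(self):
        """Render the stored parts as a SQL-like string (one possible output)."""
        sql = f"SELECT {', '.join(self._columns)} FROM {self._table}"
        if self._conditions:
            sql += " WHERE " + " AND ".join(self._conditions)
        return sql


query = Query().select("Country").from_table("User").where('Country == "Brazil"')
sql_text = query.to_sql()
```

The caller never touches _columns or _conditions. A second output method for another database could be added without changing any calling code.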
Now let's think through the Database object. What does this hide and store? Well, clearly it can't store the full contents of the database, since that is why we have a database! So what is the point? The goal is to hide how the database works from people who use the Database object. Good classes will simplify reasoning when manipulating internal state. For this Database object you could hide how the networking calls work, or batch queries or updates, or provide a caching layer.
The problem is that this Database object is HUGE. It represents how to access a database, so under the covers it could do anything and everything. Clearly networking, caching, and batching are quite hard to deal with depending on your system, so hiding them away would be very helpful. But, as many people will note, a database is insanely complex, and the further from the raw DB calls you get, the harder it is to tune for performance and understand how things work.
This is the fundamental tradeoff of OOP. If you pick the right abstraction, it makes coding simpler (String, Array, Dictionary); if you pick an abstraction that is too big (Database, EmailManager, NetworkingManager), it may become too complex to really understand how it works, or what to expect. The goal is to hide complexity, but some complexity is necessary. A good rule of thumb is to start out avoiding Manager objects, and instead create classes that are like structs: all they do is hold data, with some helper methods to create/manipulate the data to make your life easier. For example, in the case of EmailManager, start with a function called sendEmail that takes an Email object. This is a simple starting point and the code is very easy to understand.
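As a sketch of that starting point, assuming a plain dataclass for Email and a PEP 8 spelling (send_email) of the sendEmail function: the fields, the as_text helper and the transport parameter are all hypothetical, with transport standing in for whatever actually delivers the mail.

```python
from dataclasses import dataclass


@dataclass
class Email:
    """Struct-like class: it only holds data."""
    sender: str
    recipient: str
    subject: str
    body: str

    def as_text(self) -> str:
        """Helper that formats the message as plain text."""
        return (f"From: {self.sender}\nTo: {self.recipient}\n"
                f"Subject: {self.subject}\n\n{self.body}")


def send_email(email: Email, transport) -> None:
    """Hand the formatted message to `transport`, a hypothetical callable."""
    transport(email.as_text())


# A list's append method serves as a fake transport for the example.
outbox = []
send_email(Email("a@example.com", "b@example.com", "Hi", "Hello!"), outbox.append)
```

There is no manager object and no hidden state: the Email holds the data and the function does one thing with it.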
As for your example, think about what data needs to be together to calculate what you are looking for. If you wanted to know how far an animal was walking, for example, you could have AnimalStep and AnimalTrip (a collection of AnimalSteps) classes. Since each Trip has all the Step data, it should be able to figure things out about it; perhaps AnimalTrip.calculateDistance() makes sense.
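A minimal sketch of those two classes follows, assuming each step is just an (x, y) position and that distance means the sum of straight-line segments between consecutive steps. The method is spelled calculate_distance here to follow Python naming conventions; math.dist requires Python 3.8+.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class AnimalStep:
    x: float  # hypothetical coordinates
    y: float


@dataclass
class AnimalTrip:
    steps: List[AnimalStep] = field(default_factory=list)

    def calculate_distance(self) -> float:
        """Sum of straight-line distances between consecutive steps."""
        return sum(
            math.dist((a.x, a.y), (b.x, b.y))
            for a, b in zip(self.steps, self.steps[1:])
        )


trip = AnimalTrip([AnimalStep(0, 0), AnimalStep(3, 4), AnimalStep(3, 10)])
distance = trip.calculate_distance()
```

The trip owns all the step data, so the distance calculation needs nothing from outside the object.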
Answer by cyborg
After skimming your linked code, it seems to me that you are better off not designing a Dog class at this point. Rather, you should use Pandas and dataframes. A dataframe is a table with columns. Your dataframe would have columns such as: dog_id, contact_part, contact_time, contact_location, etc.
Pandas uses NumPy arrays behind the scenes, and it has many convenience methods for you:
- Select a dog by e.g.: my_measurements[my_measurements['dog_id'] == 'Charly']
- Save the data: my_measurements.to_pickle('filename.pickle') (older pandas versions spelled this my_measurements.save(...))
- Consider using pandas.read_csv() instead of manually reading the text files.

