Python: What are data classes and how are they different from common classes?

Note: this page is a bilingual (Chinese/English) rendering of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it under the same license, but you must cite the original URL and attribute it to the original authors (not me): StackOverFlow, original URL: http://stackoverflow.com/questions/47955263/

Time: 2020-08-19 18:27:08  Source: igfitidea

What are data classes and how are they different from common classes?

Tags: python, class, python-3.7, python-dataclasses

Asked by kingJulian

With PEP 557, data classes were introduced into the Python standard library.

They make use of the @dataclass decorator and they are supposed to be "mutable namedtuples with default", but I'm not really sure I understand what this actually means and how they are different from common classes.

What exactly are python data classes and when is it best to use them?

Answered by Martijn Pieters

Data classes are just regular classes that are geared towards storing state, rather than containing a lot of logic. Every time you create a class that mostly consists of attributes, you have made a data class.

What the dataclasses module does is make it easier to create data classes. It takes care of a lot of boilerplate for you.

This is especially important when your data class must be hashable; this requires a __hash__ method as well as an __eq__ method. If you add a custom __repr__ method for ease of debugging, that can become quite verbose:

class InventoryItem:
    '''Class for keeping track of an item in inventory.'''
    name: str
    unit_price: float
    quantity_on_hand: int = 0

    def __init__(
            self, 
            name: str, 
            unit_price: float,
            quantity_on_hand: int = 0
        ) -> None:
        self.name = name
        self.unit_price = unit_price
        self.quantity_on_hand = quantity_on_hand

    def total_cost(self) -> float:
        return self.unit_price * self.quantity_on_hand

    def __repr__(self) -> str:
        return (
            'InventoryItem('
            f'name={self.name!r}, unit_price={self.unit_price!r}, '
            f'quantity_on_hand={self.quantity_on_hand!r})'
        )

    def __hash__(self) -> int:
        return hash((self.name, self.unit_price, self.quantity_on_hand))

    def __eq__(self, other) -> bool:
        if not isinstance(other, InventoryItem):
            return NotImplemented
        return (
            (self.name, self.unit_price, self.quantity_on_hand) == 
            (other.name, other.unit_price, other.quantity_on_hand))

With dataclasses you can reduce it to:

from dataclasses import dataclass

@dataclass(unsafe_hash=True)
class InventoryItem:
    '''Class for keeping track of an item in inventory.'''
    name: str
    unit_price: float
    quantity_on_hand: int = 0

    def total_cost(self) -> float:
        return self.unit_price * self.quantity_on_hand

The same class decorator can also generate comparison methods (__lt__, __gt__, etc.) and handle immutability.
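
A minimal sketch of those two options, reusing the InventoryItem fields from above. The class name FrozenItem and the sample values are made up for illustration only.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(order=True, frozen=True)
class FrozenItem:
    """Hypothetical variant of InventoryItem, used only for this sketch."""
    name: str
    unit_price: float
    quantity_on_hand: int = 0


cheap = FrozenItem("bolt", 0.25, 100)
pricey = FrozenItem("bolt", 3.50, 10)

# order=True compares instances as tuples of their fields, in definition order.
cheap_sorts_first = cheap < pricey

# frozen=True blocks attribute assignment after construction.
blocked = None
try:
    cheap.quantity_on_hand = 5
except FrozenInstanceError as exc:
    blocked = type(exc).__name__

print(cheap_sorts_first, blocked)
```

Because eq=True is the default, combining it with frozen=True also gives the class a generated __hash__, so no unsafe_hash=True is needed here.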

namedtuple classes are also data classes, but are immutable by default (as well as being sequences). dataclasses are much more flexible in this regard, and can easily be structured such that they can fill the same role as a namedtuple class.
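
To make the contrast concrete, here is a small sketch. PointNT and PointDC are hypothetical names chosen for this example, not anything from the answer.

```python
from collections import namedtuple
from dataclasses import astuple, dataclass

PointNT = namedtuple("PointNT", ["x", "y"])


@dataclass
class PointDC:
    x: int
    y: int


nt = PointNT(1, 2)
dc = PointDC(1, 2)

# A namedtuple is a tuple: it can be indexed, unpacked and compared to plain tuples.
x, y = nt
is_sequence = (nt[0] == 1 and nt == (1, 2))

# A dataclass is not a sequence, and it does not compare equal to a plain tuple.
dc_equals_tuple = (dc == (1, 2))

# namedtuple fields cannot be reassigned; dataclass fields can (by default).
nt_is_immutable = False
try:
    nt.x = 10
except AttributeError:
    nt_is_immutable = True
dc.x = 10

print(is_sequence, dc_equals_tuple, nt_is_immutable, astuple(dc))
```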

The PEP was inspired by the attrs project, which can do even more (including slots, validators, converters, metadata, etc.).

If you want to see some examples, I recently used dataclasses for several of my Advent of Code solutions; see the solutions for day 7, day 8, day 11 and day 20.

If you want to use the dataclasses module in Python versions < 3.7, then you could install the backported module (requires 3.6) or use the attrs project mentioned above.

Answered by pylang

Overview

The question has been addressed. However, this answer adds some practical examples to aid in the basic understanding of dataclasses.

What exactly are python data classes and when is it best to use them?

  1. code generators: generate boilerplate code; you can choose to implement special methods in a regular class or have a dataclass implement them automatically.
  2. data containers: structures that hold data (e.g. tuples and dicts), often with dotted attribute access, such as classes, namedtuple and others.

"mutable namedtuples with default[s]"

Here is what the latter phrase means:

  • mutable: by default, dataclass attributes can be reassigned. You can optionally make them immutable (see Examples below).
  • namedtuple: you have dotted attribute access, like a namedtuple or a regular class.
  • default: you can assign default values to attributes.
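
The three properties in that phrase can be seen together in a short sketch. The Palette class and its fields are invented for illustration; note that a mutable default such as a list has to go through field(default_factory=...), since dataclasses reject a bare list as a default value.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Palette:
    name: str = "untitled"                             # simple default value
    colors: List[str] = field(default_factory=list)   # fresh list per instance


p = Palette()
p.name = "sunset"            # mutable: attributes can be reassigned
p.colors.append("orange")    # dotted attribute access, as with a namedtuple

other = Palette()            # defaults apply again; the list is not shared
print(p, other)
```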

Compared to common classes, you primarily save on typing boilerplate code.



Features

This is an overview of dataclass features (TL;DR? See the Summary Table in the next section).

What you get

Here are features you get by default from dataclasses.

Attributes + Representation + Comparison

import dataclasses


@dataclasses.dataclass
#@dataclasses.dataclass()                                       # alternative
class Color:
    r : int = 0
    g : int = 0
    b : int = 0

These defaults are provided by automatically setting the following keywords to True:

@dataclasses.dataclass(init=True, repr=True, eq=True)
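
As a quick illustration of what those three defaults buy you, the sketch below exercises each generated method. PlainColor is a hypothetical hand-written counterpart added only for contrast.

```python
import dataclasses


@dataclasses.dataclass
class Color:
    r: int = 0
    g: int = 0
    b: int = 0


c = Color(128, 0, 255)                  # generated __init__
text = repr(c)                          # generated __repr__
same = (c == Color(128, 0, 255))        # generated __eq__ compares field by field


class PlainColor:
    """Same attributes, but without the decorator (hypothetical contrast)."""
    def __init__(self, r=0, g=0, b=0):
        self.r, self.g, self.b = r, g, b


# Without a custom __eq__, two distinct objects never compare equal.
plain_same = (PlainColor(1, 2, 3) == PlainColor(1, 2, 3))
print(text, same, plain_same)
```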

What you can turn on

Additional features are available if the appropriate keywords are set to True.

Order

@dataclasses.dataclass(order=True)
class Color:
    r : int = 0
    g : int = 0
    b : int = 0

The ordering methods are now implemented (overloading operators: < > <= >=), similarly to functools.total_ordering, but with stronger equality tests.
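
A short sketch of how the generated ordering behaves, using the Color class from this section (the sample values are arbitrary): instances are compared as tuples of their fields, and comparing against another type is rejected.

```python
import dataclasses


@dataclasses.dataclass(order=True)
class Color:
    r: int = 0
    g: int = 0
    b: int = 0


colors = [Color(0, 50, 0), Color(), Color(0, 0, 255)]
ordered = sorted(colors)     # compares (r, g, b) tuples, field by field

# Comparing against a different type is rejected rather than guessed at.
rejects_other_types = False
try:
    Color() < (0, 0, 0)
except TypeError:
    rejects_other_types = True

print(ordered, rejects_other_types)
```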

Hashable, Mutable

@dataclasses.dataclass(unsafe_hash=True)                        # override base `__hash__`
class Color:
    ...

Although the object is potentially mutable (possibly undesired), a hash is implemented.
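
Why this combination is called "unsafe" can be shown with a small sketch (sample values are arbitrary): once a hashed object is mutated, containers that stored it under the old hash can no longer find it.

```python
import dataclasses


@dataclasses.dataclass(unsafe_hash=True)
class Color:
    r: int = 0
    g: int = 0
    b: int = 0


c = Color(1, 2, 3)
seen = {c}                   # stored under the hash of its fields (1, 2, 3)

c.r = 9                      # mutating a field changes the hash
still_found = c in seen      # the lookup now uses the hash of (9, 2, 3)
print(still_found, len(seen))
```

The object is still physically in the set, but membership tests with the mutated value miss it, which is the kind of bug the default settings are designed to prevent.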

Hashable, Immutable

@dataclasses.dataclass(frozen=True)                             # immutable; with `eq=True` (default), also hashable
class Color:
    ...

A hash is now implemented and changing the object or assigning to attributes is disallowed.

Overall, the object is hashable if either unsafe_hash=True or frozen=True (the latter together with the default eq=True).
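
The hashing rules can be checked directly. The sketch below uses three hypothetical one-field classes (MutableColor, FrozenColor, IdentityColor) to show the common cases; it is not a complete rendering of the hashing logic table.

```python
import dataclasses


@dataclasses.dataclass                  # eq=True, frozen=False: __hash__ is set to None
class MutableColor:
    r: int = 0


@dataclasses.dataclass(frozen=True)     # eq=True and frozen=True: __hash__ is generated
class FrozenColor:
    r: int = 0


@dataclasses.dataclass(eq=False)        # no __eq__ generated: identity hash is kept
class IdentityColor:
    r: int = 0


mutable_is_hashable = True
try:
    hash(MutableColor())
except TypeError:
    mutable_is_hashable = False

frozen_set = {FrozenColor(), FrozenColor()}          # equal values collapse to one entry
identity_set = {IdentityColor(), IdentityColor()}    # two distinct objects
print(mutable_is_hashable, len(frozen_set), len(identity_set))
```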

See also the original hashing logic table with more details.

What you don't get

To get the following features, special methods must be manually implemented:

Unpacking

@dataclasses.dataclass
class Color:
    r : int = 0
    g : int = 0
    b : int = 0

    def __iter__(self):
        yield from dataclasses.astuple(self)
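
With that __iter__ in place, instances can be unpacked like a tuple. A brief usage sketch (the values are arbitrary):

```python
import dataclasses


@dataclasses.dataclass
class Color:
    r: int = 0
    g: int = 0
    b: int = 0

    def __iter__(self):
        yield from dataclasses.astuple(self)


r, g, b = Color(128, 0, 255)    # tuple-style unpacking
as_list = list(Color())         # anything that iterates works too
print(r, g, b, as_list)
```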

Optimization

@dataclasses.dataclass
class SlottedColor:
    __slots__ = ["r", "b", "g"]
    r : int
    g : int
    b : int

The object size is now reduced:

>>> import sys
>>> sys.getsizeof(Color)
1056
>>> sys.getsizeof(SlottedColor)
888

In some circumstances, __slots__ also improves the speed of creating instances and accessing attributes. Also, slots do not allow default assignments; otherwise, a ValueError is raised.
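
The default-value restriction can be reproduced with a short sketch. BadSlottedColor is a hypothetical class written only to trigger the error.

```python
import dataclasses


@dataclasses.dataclass
class SlottedColor:
    __slots__ = ("r", "g", "b")
    r: int
    g: int
    b: int


c = SlottedColor(1, 2, 3)
has_instance_dict = hasattr(c, "__dict__")    # slots replace the per-instance dict

# A default value becomes a class attribute, which collides with the slot name.
slot_error = None
try:
    @dataclasses.dataclass
    class BadSlottedColor:
        __slots__ = ("r",)
        r: int = 0
except ValueError as exc:
    slot_error = str(exc)

print(has_instance_dict, slot_error)
```

On Python 3.10 and later, passing slots=True to the decorator generates __slots__ for you and also works with default values, so the manual approach is mainly relevant for older versions.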

See more on slots in this blog post.



Summary Table

+----------------------+----------------------+----------------------------------------------------+-----------------------------------------+
|       Feature        |       Keyword        |                      Example                       |           Implement in a Class          |
+----------------------+----------------------+----------------------------------------------------+-----------------------------------------+
| Attributes           |  init                |  Color().r -> 0                                    |  __init__                               |
| Representation       |  repr                |  Color() -> Color(r=0, g=0, b=0)                   |  __repr__                               |
| Comparison*          |  eq                  |  Color() == Color(0, 0, 0) -> True                 |  __eq__                                 |
|                      |                      |                                                    |                                         |
| Order                |  order               |  sorted([Color(0, 50, 0), Color()]) -> ...         |  __lt__, __le__, __gt__, __ge__         |
| Hashable             |  unsafe_hash/frozen  |  {Color(), Color()} -> {Color(r=0, g=0, b=0)}      |  __hash__                               |
| Immutable            |  frozen + eq         |  Color().r = 10 -> FrozenInstanceError             |  __setattr__, __delattr__               |
|                      |                      |                                                    |                                         |
| Unpacking+           |  -                   |  r, g, b = Color()                                 |   __iter__                              |
| Optimization+        |  -                   |  sys.getsizeof(SlottedColor) -> 888                |  __slots__                              |
+----------------------+----------------------+----------------------------------------------------+-----------------------------------------+

+These methods are not automatically generated and require manual implementation in a dataclass.

*__ne__ is not needed and thus not implemented.



Additional features

Post-initialization

@dataclasses.dataclass
class RGBA:
    r : int = 0
    g : int = 0
    b : int = 0
    a : float = 1.0

    def __post_init__(self):
        self.a : int =  int(self.a * 255)


RGBA(127, 0, 255, 0.5)
# RGBA(r=127, g=0, b=255, a=127)

Inheritance

@dataclasses.dataclass
class RGBA(Color):
    a : int = 0
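
A small sketch of what the subclass ends up with, assuming the three-field Color class used throughout this answer: base-class fields come first, followed by the fields the subclass adds.

```python
import dataclasses


@dataclasses.dataclass
class Color:
    r: int = 0
    g: int = 0
    b: int = 0


@dataclasses.dataclass
class RGBA(Color):
    a: int = 0


field_names = [f.name for f in dataclasses.fields(RGBA)]
pixel = RGBA(127, 0, 255, 64)
print(field_names, pixel)
```

Because every base field here has a default, the added field needs one as well; a subclass field without a default would fail at class creation with a TypeError about a non-default argument following a default one.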

Conversions

Convert a dataclass to a tuple or a dict, recursively:

>>> dataclasses.astuple(Color(128, 0, 255))
(128, 0, 255)
>>> dataclasses.asdict(Color(128, 0, 255))
{'r': 128, 'g': 0, 'b': 255}
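
The "recursively" part only shows up with nested dataclasses. In the sketch below, Pixel is a hypothetical class that holds a Color, so both conversions descend into it.

```python
import dataclasses


@dataclasses.dataclass
class Color:
    r: int = 0
    g: int = 0
    b: int = 0


@dataclasses.dataclass
class Pixel:
    x: int
    y: int
    color: Color


p = Pixel(3, 4, Color(128, 0, 255))
as_tuple = dataclasses.astuple(p)    # the nested Color becomes a nested tuple
as_dict = dataclasses.asdict(p)      # the nested Color becomes a nested dict
print(as_tuple)
print(as_dict)
```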

Limitations



References

  • R. Hettinger's talk on Dataclasses: The code generator to end all code generators
  • T. Hunner's talk on Easier Classes: Python Classes Without All the Cruft
  • Python's documentation on hashing details
  • Real Python's guide on The Ultimate Guide to Data Classes in Python 3.7
  • A. Shaw's blog post on A brief tour of Python 3.7 data classes
  • E. Smith's github repository on dataclasses

Answered by prosti

Consider this simple class Foo:

from dataclasses import dataclass


@dataclass
class Foo:
    def bar(self):
        pass

Here is the dir() built-in comparison. On the left-hand side is Foo without the @dataclass decorator, and on the right is Foo with the @dataclass decorator.
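
The same comparison can be reproduced in code. This is a rough sketch with two hypothetical classes, PlainFoo and DataFoo; the exact list of added names varies between Python versions, so it is printed rather than hard-coded.

```python
from dataclasses import dataclass


class PlainFoo:
    def bar(self):
        pass


@dataclass
class DataFoo:
    def bar(self):
        pass


# Names visible on the decorated class but not on the plain one.
added_names = sorted(set(dir(DataFoo)) - set(dir(PlainFoo)))

# Methods defined on the class itself rather than inherited from object.
own_methods = sorted(
    name for name in ("__init__", "__repr__", "__eq__") if name in vars(DataFoo)
)
print(added_names)
print(own_methods)
```

dir() alone hides part of the picture, because names such as __init__, __repr__ and __eq__ already exist on every class through object; checking vars() shows that the decorator defines its own versions on the class.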

[Image: dir() output for Foo without (left) and with (right) the @dataclass decorator]

Here is another diff, after using the inspect module for comparison.

[Image: diff of the inspect module output for Foo without and with the @dataclass decorator]

Answered by Mahmoud Hanafy

From the PEP specification:

A class decorator is provided which inspects a class definition for variables with type annotations as defined in PEP 526, "Syntax for Variable Annotations". In this document, such variables are called fields. Using these fields, the decorator adds generated method definitions to the class to support instance initialization, a repr, comparison methods, and optionally other methods as described in the Specification section. Such a class is called a Data Class, but there's really nothing special about the class: the decorator adds generated methods to the class and returns the same class it was given.

The @dataclass generator adds methods to the class that you'd otherwise define yourself, like __repr__, __init__, __lt__, and __gt__ (the ordering methods only when order=True is passed).