字典是否在 Python 3.6+ 中排序?

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/39980323/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-19 23:01:38  来源:igfitidea点击:

Are dictionaries ordered in Python 3.6+?

pythonpython-3.xdictionarypython-internalspython-3.6

提问by Chris_Rands

Dictionaries are ordered in Python 3.6 (under the CPython implementation at least) unlike in previous incarnations. This seems like a substantial change, but it's only a short paragraph in the documentation. It is described as a CPython implementation detail rather than a language feature, but also implies this may become standard in the future.

字典在 Python 3.6 中排序(至少在 CPython 实现下)与以前的化身不同。这看起来是一个实质性的变化,但它只是文档中的一小段。它被描述为 CPython 实现细节而不是语言特性,但也暗示这可能在未来成为标准。

How does the new dictionary implementation perform better than the older one while preserving element order?

在保留元素顺序的同时,新的字典实现如何比旧的实现更好?

Here is the text from the documentation:

以下是文档中的文本:

dict()now uses a “compact” representation pioneered by PyPy. The memory usage of the new dict() is between 20% and 25% smaller compared to Python 3.5. PEP 468(Preserving the order of **kwargs in a function.) is implemented by this. The order-preserving aspect of this new implementation is considered an implementation detail and should not be relied upon (this may change in the future, but it is desired to have this new dict implementation in the language for a few releases before changing the language spec to mandate order-preserving semantics for all current and future Python implementations; this also helps preserve backwards-compatibility with older versions of the language where random iteration order is still in effect, e.g. Python 3.5). (Contributed by INADA Naoki in issue 27350. Idea originally suggested by Raymond Hettinger.)

dict()现在使用PyPy 首创的“紧凑”表示。与 Python 3.5 相比,新 dict() 的内存使用量减少了 20% 到 25%。PEP 468(在函数中保留 **kwargs 的顺序。)由此实现。这个新实现的顺序保留方面被认为是一个实现细节,不应依赖(这可能会在未来发生变化,但在更改语言规范之前,希望在几个版本的语言中使用这个新的 dict 实现为所有当前和未来的 Python 实现强制要求保留顺序的语义;这也有助于保持与旧版本的语言的向后兼容性,其中随机迭代顺序仍然有效,例如 Python 3.5)。(由稻田直树提供)问题 27350最初由 Raymond Hettinger 提出的想法。)

Update December 2017: dicts retaining insertion order is guaranteedfor Python 3.7

2017 年 12 月更新:保证Python 3.7dict保留插入顺序

回答by Dimitris Fasarakis Hilliard

Are dictionaries ordered in Python 3.6+?

字典是否在 Python 3.6+ 中排序?

They are insertion ordered[1]. As of Python 3.6, for the CPython implementation of Python, dictionaries remember the order of items inserted. This is considered an implementation detail in Python 3.6; you need to use OrderedDictif you want insertion ordering that's guaranteedacross other implementations of Python (and other ordered behavior[1]).

它们是插入顺序[1]。从 Python 3.6 开始,对于 Python 的 CPython 实现,字典会记住插入项的顺序这被认为是 Python 3.6 中的一个实现细节OrderedDict如果您想要在 Python 的其他实现(以及其他有序行为[1])中保证插入排序,则需要使用。

As of Python 3.7, this is no longer an implementation detail and instead becomes a language feature. From a python-dev message by GvR:

从 Python 3.7 开始,这不再是一个实现细节,而是成为一种语言特性。来自 GvR 的 python-dev 消息

Make it so. "Dict keeps insertion order" is the ruling. Thanks!

让它如此。“字典保持插入顺序”是裁决。谢谢!

This simply means that you can depend on it. Other implementations of Python must also offer an insertion ordered dictionary if they wish to be a conforming implementation of Python 3.7.

这只是意味着您可以依赖它。如果 Python 的其他实现希望成为 Python 3.7 的一致实现,则它们还必须提供插入顺序字典。



How does the Python 3.6dictionary implementation perform better[2]than the older one while preserving element order?

Python3.6字典实现如何在保留元素顺序的同时比旧的实现更好[2]

Essentially, by keeping two arrays.

本质上,通过保留两个数组

  • The first array, dk_entries, holds the entries (of type PyDictKeyEntry) for the dictionary in the order that they were inserted. Preserving order is achieved by this being an append only array where new items are always inserted at the end (insertion order).

  • The second, dk_indices, holds the indices for the dk_entriesarray (that is, values that indicate the position of the corresponding entry in dk_entries). This array acts as the hash table. When a key is hashed it leads to one of the indices stored in dk_indicesand the corresponding entry is fetched by indexing dk_entries. Since only indices are kept, the type of this array depends on the overall size of the dictionary (ranging from type int8_t(1byte) to int32_t/int64_t(4/8bytes) on 32/64bit builds)

  • 第一个数组 ,按插入顺序dk_entries保存字典的条目(类型PyDictKeyEntry)。保留顺序是通过这是一个仅附加数组来实现的,其中新项目总是插入在末尾(插入顺序)。

  • 第二个 ,dk_indices保存dk_entries数组的索引(即指示 中相应条目位置的值dk_entries)。该数组充当哈希表。当一个键被散列时,它会导致存储在其中的一个索引,dk_indices并通过 indexing 获取相应的条目dk_entries。由于仅保留索引,因此该数组的类型取决于字典的整体大小(范围从类型int8_t( 1byte) 到int32_t/ int64_t( 4/ 8bytes) on 32/ 64bit builds)

In the previous implementation, a sparse array of type PyDictKeyEntryand size dk_sizehad to be allocated; unfortunately, it also resulted in a lot of empty space since that array was not allowed to be more than 2/3 * dk_sizefull for performance reasons. (and the empty space stillhad PyDictKeyEntrysize!).

在之前的实现中,必须分配一个类型PyDictKeyEntry和大小的稀疏数组dk_size;不幸的是,它也导致了大量的空白空间,因为出于性能原因,该数组不允许超过2/3 * dk_size满。(而且空的空间仍然有大小!)。PyDictKeyEntry

This is not the case now since only the requiredentries are stored (those that have been inserted) and a sparse array of type intX_t(Xdepending on dict size) 2/3 * dk_sizes full is kept. The empty space changed from type PyDictKeyEntryto intX_t.

现在情况并非如此,因为只存储了所需的条目(那些已插入的条目)并且保留了类型intX_tX取决于 dict 大小)2/3 * dk_sizes full的稀疏数组。空白区域从类型更改PyDictKeyEntryintX_t

So, obviously, creating a sparse array of type PyDictKeyEntryis much more memory demanding than a sparse array for storing ints.

因此,显然,创建一个稀疏类型数组PyDictKeyEntry比用于存储ints的稀疏数组需要更多内存。

You can see the full conversation on Python-Devregarding this feature if interested, it is a good read.

如果有兴趣,您可以在 Python-Dev 上查看有关此功能的完整对话,这是一本很好的读物。



In the original proposal made by Raymond Hettinger, a visualization of the data structures used can be seen which captures the gist of the idea.

在 Raymond Hettinger 提出的最初提案中,可以看到所用数据结构的可视化,它抓住了这个想法的要点。

For example, the dictionary:

d = {'timmy': 'red', 'barry': 'green', 'guido': 'blue'}

is currently stored as [keyhash, key, value]:

entries = [['--', '--', '--'],
           [-8522787127447073495, 'barry', 'green'],
           ['--', '--', '--'],
           ['--', '--', '--'],
           ['--', '--', '--'],
           [-9092791511155847987, 'timmy', 'red'],
           ['--', '--', '--'],
           [-6480567542315338377, 'guido', 'blue']]

Instead, the data should be organized as follows:

indices =  [None, 1, None, None, None, 0, None, 2]
entries =  [[-9092791511155847987, 'timmy', 'red'],
            [-8522787127447073495, 'barry', 'green'],
            [-6480567542315338377, 'guido', 'blue']]

例如字典:

d = {'timmy': 'red', 'barry': 'green', 'guido': 'blue'}

当前存储为 [keyhash, key, value]:

entries = [['--', '--', '--'],
           [-8522787127447073495, 'barry', 'green'],
           ['--', '--', '--'],
           ['--', '--', '--'],
           ['--', '--', '--'],
           [-9092791511155847987, 'timmy', 'red'],
           ['--', '--', '--'],
           [-6480567542315338377, 'guido', 'blue']]

相反,数据应按如下方式组织:

indices =  [None, 1, None, None, None, 0, None, 2]
entries =  [[-9092791511155847987, 'timmy', 'red'],
            [-8522787127447073495, 'barry', 'green'],
            [-6480567542315338377, 'guido', 'blue']]

As you can visually now see, in the original proposal, a lot of space is essentially empty to reduce collisions and make look-ups faster. With the new approach, you reduce the memory required by moving the sparseness where it's really required, in the indices.

正如您现在可以直观地看到的那样,在最初的提案中,很多空间基本上是空的,以减少冲突并加快查找速度。使用新方法,您可以通过在索引中将稀疏移动到真正需要的地方来减少所需的内存。



[1]: I say "insertion ordered" and not "ordered" since, with the existence of OrderedDict, "ordered" suggests further behavior that the dictobject doesn't provide. OrderedDicts are reversible, provide order sensitive methods and, mainly, provide an order-sensive equality tests (==, !=). dicts currently don't offer any of those behaviors/methods.

[1]:我说“插入有序”而不是“有序”,因为随着 OrderedDict 的存在,“有序”暗示了dict对象不提供的进一步行为。OrderedDicts 是可逆的,提供对顺序敏感的方法,并且主要提供对顺序敏感的相等性测试 ( ==, !=)。dict目前不提供任何这些行为/方法。



[2]: The new dictionary implementations performs better memory wiseby being designed more compactly; that's the main benefit here. Speed wise, the difference isn't so drastic, there's places where the new dict might introduce slight regressions (key-lookups, for example) while in others (iteration and resizing come to mind) a performance boost should be present.

[2]:新的字典实现通过更紧凑的设计来执行更好的内存;这是这里的主要好处。在速度方面,差异并不是那么大,有些地方新的 dict 可能会引入轻微的回归(例如键查找),而在其他地方(迭代和调整大小)应该存在性能提升。

Overall, the performance of the dictionary, especially in real-life situations, improves due to the compactness introduced.

总体而言,由于引入了紧凑性,字典的性能,尤其是在现实生活中的性能有所提高。

回答by Maresh

Below is answering the original first question:

以下是对最初的第一个问题的回答:

Should I use dictor OrderedDictin Python 3.6?

我应该在 Python 3.6 中使用dictOrderedDict吗?

I think this sentence from the documentation is actually enough to answer your question

我认为文档中的这句话实际上足以回答您的问题

The order-preserving aspect of this new implementation is considered an implementation detail and should not be relied upon

这个新实现的顺序保留方面被认为是一个实现细节,不应依赖

dictis not explicitly meant to be an ordered collection, so if you want to stay consistent and not rely on a side effect of the new implementation you should stick with OrderedDict.

dict并不明确表示是有序集合,因此如果您想保持一致并且不依赖新实现的副作用,您应该坚持使用OrderedDict.

Make your code future proof :)

让您的代码面向未来:)

There's a debate about that here.

有关于辩论在这里

EDIT: Python 3.7 will keep this as a featuresee

编辑:Python 3.7 将把它作为一个特性查看

回答by fjsj

Update: Guido van Rossum announced on the mailing listthat as of Python 3.7 dicts in all Python implementations must preserve insertion order.

更新:Guido van Rossum在邮件列表宣布,从Python 3.7 开始dict,所有 Python 实现都必须保留插入顺序。

回答by rkengler

I wanted to add to the discussion above but don't have the reputation to comment.

我想添加到上面的讨论中,但没有评论的声誉。

Python 3.8 is not quite released yet, but it will even include the reversed()function on dictionaries (removing another difference from OrderedDict.

Python 3.8 还没有完全发布,但它甚至会包含reversed()字典中的函数(从OrderedDict.

Dict and dictviews are now iterable in reversed insertion order using reversed(). (Contributed by Rémi Lapeyre in bpo-33462.) See what's new in python 3.8

dict 和 dictviews 现在可以使用 reversed() 以相反的插入顺序迭代。(由 Rémi Lapeyre 在 bpo-33462 中贡献。) 查看 python 3.8 中的新功能

I don't see any mention of the equality operator or other features of OrderedDictso they are still not entirely the same.

我没有看到任何提及相等运算符或其他功能的内容,OrderedDict因此它们仍然不完全相同。