
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use or share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/20388130/

Time: 2020-08-18 20:19:19  Source: igfitidea

Will a Python dict with integers as keys be naturally sorted?

Tags: python, dictionary, integer, key, sorted

Asked by user3067927

If I create a Python dict which uses integers as keys, can I safely assume that iterating over the dict will retrieve items in order according to key value?


i.e. will


my_dict = {}
for x in range(0, 100):
    my_dict[x] = str(x)

# print() with parentheses runs on both Python 2 and Python 3
for item in my_dict.items():
    print(item)

always result in printing the list in order of key value?


Accepted answer by Silas Ray

In short, no. I'm betting you noted that dictionaries use the hashes of keys as indexes into an array, and since ints hash to their own values, you inferred that inserted values would end up in order by key if their keys are integers. While the first 2 parts of that statement are true, the inference is not, even as an undocumented side effect. The array indexes are derived from the hashes of the keys, but are not the complete hashes. This means that even with integer keys you can still get out-of-order inserts, since 2 keys could collide at the same location (or even have "out of order" hash-derived indexes) and thus end up stored out of order in the dict.


Basically, think of the index in the internal storage array of the dict as being some number of low-order bits from the key's hash. Just because one number is larger than another doesn't mean that a value built from its truncated low-order bits is going to be larger, or even different.
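
To make the truncation concrete, here is a minimal sketch that assumes a made-up table of 8 slots and takes the 3 low-order bits of each key's hash as its slot. The names TABLE_SIZE, MASK and slots are illustrative only; a real CPython dict also resizes itself and probes for a free slot on collision, which this sketch ignores.

```python
# Hypothetical 8-slot table, for illustration only.
TABLE_SIZE = 8
MASK = TABLE_SIZE - 1  # binary 111: keeps only the 3 low-order bits

keys = [1, 9, 18, 4]

# Small non-negative ints hash to themselves, so the slot is just key & MASK.
slots = {key: hash(key) & MASK for key in keys}

for key, slot in slots.items():
    print("key", key, "-> slot", slot)
```

Keys 1 and 9 land in the same slot, and 18 lands in a lower slot than 4, so slot order says nothing reliable about key order.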


Answer by Ignacio Vazquez-Abrams

No, you cannot. Always sort if you want to iterate in an ordered fashion.


Answer by fuesika

I don't think so. You have to make use of collections.OrderedDict in order to ensure ordering. However, this keeps the entries in the order they were added; it does not sort them by key.
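
As a small illustration of that distinction, the sketch below inserts three made-up integer keys out of order and compares the iteration order of an OrderedDict with the sorted key order.

```python
from collections import OrderedDict

od = OrderedDict()
for key in (3, 1, 2):  # deliberately inserted out of key order
    od[key] = str(key)

insertion_order = list(od)  # the order the keys were added
key_order = sorted(od)      # the order by key value

print(insertion_order)  # [3, 1, 2]
print(key_order)        # [1, 2, 3]
```

Since Python 3.7 the built-in dict also preserves insertion order, but like OrderedDict it does not sort by key.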


Answer by qmorgan

No, Python dictionaries do not have inherent ordering, regardless of the key values. If you need ordering, stick to arrays or lists, or better yet, check out pandas, which offers dictionary-like lookup by key value as well as many other powerful features (http://pandas.pydata.org/pandas-docs/stable/10min.html).


Answer by dstromberg

Python dictionaries are not ordered in any meaningful way; they are hash tables.


Python comes with collections.OrderedDict, but this keeps entries in order of insertion, not order of key.


Here are two dictionary-like modules that sort by keys:


https://pypi.python.org/pypi/treap/


https://pypi.python.org/pypi/red-black-tree-mod/


Some say that treaps are faster on average than red-black trees but red-black trees have a lower standard deviation in operation times. Others question this, though in my tests the former proved true.


Both treaps and red-black trees do almost everything in O(log n) time, but keep their keys in order constantly. Python dictionaries are O(1) for most operations. However, getting all keys in order is O(n) for treaps and red-black trees, while it's O(n log n) for dictionaries.


When should you use which?


  1. If you're sorting in a loop, you're probably better off with a treap or red-black tree.
  2. If you're sorting once at the end of your program or something, you're probably better off with list_ = list(dict_); list_.sort()
  3. If you're preserving the order of your inputs, like from a config file or something, you're probably best off with OrderedDict.
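
The sort-once case from item 2 could look like the sketch below. The contents of dict_ are made-up sample data; only the list(dict_) and list_.sort() idiom comes from the answer.

```python
dict_ = {42: "x", 7: "y", 19: "z"}  # made-up sample data

list_ = list(dict_)  # iterating a dict yields its keys
list_.sort()         # one O(n log n) sort, done in place

# Walk the dict in key order by looking each key up again.
for key in list_:
    print(key, dict_[key])
```

This pays the sorting cost once, which is why it suits a single ordered pass at the end of a program better than repeated sorting inside a loop.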

HTH


Answer by benas

I'm quite late to the party, but if, like me, you've stumbled upon this page via [your favorite search engine], I'd like to be the one to give you the good news:


While Python dicts will never be naturally sorted, it's trivial to use them as if they are. Assuming that your keys are, in fact, integers, simply pass your favorite dict, D, to the sorted built-in like so:


for index, item in sorted(D.items()):
    print("index:", index, "item:", item)

D.items() returns an object of class dict_items with an __iter__ method which, when called, as by the for or in statements, returns an iterator yielding key-value pairs, which can be iterated over like any other iterable.


sorted takes the iterable and returns a list, so if D = {1: "alpha", 3: "charlie", 2: "bravo"}, then what is returned by sorted is the sorted list [(1, "alpha"), (2, "bravo"), (3, "charlie")].


It's also possible to sort by a specific element:


sorted(D.items(), key=lambda x: x[1])
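
For a quick check of how the key argument changes the result, here is a small sketch using a made-up dict whose key order and value order differ. The names by_key and by_value are illustrative only.

```python
D = {1: "charlie", 2: "alpha", 3: "bravo"}  # made-up sample data

# Default: tuples compare element by element, so this sorts by key.
by_key = sorted(D.items())

# key= picks the second element of each (key, value) pair, so this sorts by value.
by_value = sorted(D.items(), key=lambda pair: pair[1])

print(by_key)    # [(1, 'charlie'), (2, 'alpha'), (3, 'bravo')]
print(by_value)  # [(2, 'alpha'), (3, 'bravo'), (1, 'charlie')]
```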

Or by some other, arbitrary, even nondeterministic, sorting criterion:


import random
sorted(D.items(), key=lambda _: random.randint(0, 100))

The construction of the list from the dict is an O(n) operation in time and space, and Python's sorting algorithm, Timsort, is very efficient (O(n log n) in the average case), so in the vast majority of real-world use cases, runtime performance isn't something worth worrying about.
