Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/863935/

Date: 2020-11-03 21:00:23  Source: igfitidea

A data-structure for 1:1 mappings in python?

python data-structures

Asked by Salim Fadhley

I have a problem which requires a reversible 1:1 mapping of keys to values.

That means sometimes I want to find the value given a key, but at other times I want to find the key given the value. Both keys and values are guaranteed unique.

x = D[y]
y == D.inverse[x]

The obvious solution is to invert the dictionary every time I want a reverse lookup. Inverting a dictionary is very easy (there's a recipe here), but for a large dictionary it can be very slow.
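For reference, a minimal sketch of the kind of inversion described here (the names `D` and `inverse` are illustrative and not taken from the linked recipe). It rebuilds the whole reverse mapping, so every call costs time and memory proportional to the size of the dictionary:

```python
# Build a reverse mapping by swapping every key/value pair.
# Assumes the values are unique and hashable, as the question guarantees.
D = {1: "one", 2: "two", 3: "three"}

inverse = {value: key for key, value in D.items()}

print(inverse["two"])  # 2
```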

The other alternative is to make a new class which unites two dictionaries, one for each kind of lookup. That would most likely be fast but would use up twice as much memory as a single dict.

So is there a better structure I can use?

  • My application requires that this be very fast and use as little memory as possible.
  • The structure must be mutable, and it's strongly desirable that mutating the object should not make it slower (e.g. by forcing a complete re-index).
  • We can guarantee that either the key or the value (or both) will be an integer.
  • It's likely that the structure will need to store thousands or possibly millions of items.
  • Keys and values are guaranteed to be unique, i.e. len(set(x)) == len(x) for x in [D.keys(), D.values()].

Accepted answer by user17918

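class TwoWay:
    def __init__(self):
        # A single dict holds both directions: key -> value and value -> key
        self.d = {}
    def add(self, k, v):
        self.d[k] = v
        self.d[v] = k
    def remove(self, k):
        # Pop the entry for k, then pop the reverse entry it pointed to
        self.d.pop(self.d.pop(k))
    def get(self, k):
        return self.d[k]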
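
A short usage sketch of the class above, repeated here so the snippet runs on its own. Because keys and values live in the same dictionary, it assumes that no key is ever equal to the value of a different pair. The second half is a hypothetical case showing what goes wrong when that assumption fails, which matters if both sides can be integers:

```python
class TwoWay:
    def __init__(self):
        self.d = {}
    def add(self, k, v):
        self.d[k] = v
        self.d[v] = k
    def remove(self, k):
        self.d.pop(self.d.pop(k))
    def get(self, k):
        return self.d[k]

names = TwoWay()
names.add(1, "one")
names.add(2, "two")
print(names.get(1))      # one
print(names.get("two"))  # 2

names.remove("one")      # removes both directions of the pair
print(len(names.d))      # 2

# Hypothetical collision: integer keys and integer values share one namespace.
ints = TwoWay()
ints.add(1, 2)   # d == {1: 2, 2: 1}
ints.add(2, 3)   # overwrites d[2]; d == {1: 2, 2: 3, 3: 2}
print(ints.get(2))  # 3, so the reverse entry for the pair (1, 2) is lost
```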

Answer by nosklo

"The other alternative is to make a new class which unites two dictionaries, one for each kind of lookup. That would most likely be fast but would use up twice as much memory as a single dict."

Not really. Have you measured that? Since both dictionaries would use references to the same objects as keys and values, the extra memory spent would be just the second dictionary's own structure. That's a lot less than twice, and it depends only on the number of entries, not on the size of your data.

What I mean is that the actual data wouldn't be copied. So you'd spend little extra memory.

Example:

a = "some really really big text spending a lot of memory"

number_to_text = {1: a}
text_to_number = {a: 1}

Only a single copy of the "really big" string exists, so you end up spending just a little more memory. That's generally affordable.
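
One way to check this, as a rough sketch: both dictionaries hold a reference to the same string object, and `sys.getsizeof` (which reports a container's own size, not the size of the objects it references) shows that the per-dictionary overhead does not depend on the string's length. The string is repeated 1000 times here only to make it large; exact byte counts vary by Python version, so none are shown.

```python
import sys

a = "some really really big text spending a lot of memory" * 1000

number_to_text = {1: a}
text_to_number = {a: 1}

# The key of one dict and the value of the other are the very same object.
key_in_reverse = next(iter(text_to_number))
print(key_in_reverse is number_to_text[1])  # True

# Size of the string itself, then the own size of each dict (string not included).
print(sys.getsizeof(a), sys.getsizeof(number_to_text), sys.getsizeof(text_to_number))
```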

I can't imagine a solution where you'd get key-lookup speed when looking up by value if you don't spend at least enough memory to store a reverse lookup hash table (which is exactly what's being done in your "unite two dicts" solution).
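
A minimal sketch of the "unite two dicts" idea, assuming a hypothetical class name `BiMap` and an `inverse` attribute like the one used in the question. Both dictionaries reference the same key and value objects, and each mutation touches both tables with ordinary dict operations, so nothing is ever re-indexed:

```python
class BiMap:
    """Hypothetical 1:1 mapping backed by a forward and a reverse dict."""

    def __init__(self):
        self._forward = {}
        self.inverse = {}   # value -> key; read it, but mutate only via BiMap

    def __setitem__(self, key, value):
        # Drop any pair that already uses this key or this value,
        # so the mapping stays 1:1.
        if key in self._forward:
            del self.inverse[self._forward[key]]
        if value in self.inverse:
            del self._forward[self.inverse[value]]
        self._forward[key] = value
        self.inverse[value] = key

    def __getitem__(self, key):
        return self._forward[key]

    def __delitem__(self, key):
        value = self._forward.pop(key)
        del self.inverse[value]

    def __len__(self):
        return len(self._forward)


D = BiMap()
D[1] = "one"
D[2] = "two"
print(D[1])                # one
print(D.inverse["two"])    # 2

D[1] = "uno"               # replaces the pair (1, "one")
print("one" in D.inverse)  # False
del D[2]
print(len(D))              # 1
```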

Answer by Shane C. Mason

"The other alternative is to make a new class which unites two dictionaries, one for each kind of lookup. That would most likely use up twice as much memory as a single dict."

Not really, since they would just be holding two references to the same data. In my mind, this is not a bad solution.

Have you considered an in-memory database lookup? I am not sure how it will compare in speed, but lookups in relational databases can be very fast.

Answer by spenthil

Here is my own solution to this problem: http://github.com/spenthil/pymathmap/blob/master/pymathmap.py

The goal is to make it as transparent to the user as possible. The only significant attribute it introduces is partner.

OneToOneDict subclasses dict. I know that isn't generally recommended, but I think I have the common use cases covered. The backend is pretty simple: it (dict1) keeps a weakref to a 'partner' OneToOneDict (dict2) which is its inverse. When dict1 is modified, dict2 is updated accordingly, and vice versa.

From the docstring:

>>> dict1 = OneToOneDict()
>>> dict2 = OneToOneDict()
>>> dict1.partner = dict2
>>> assert(dict1 is dict2.partner)
>>> assert(dict2 is dict1.partner)
>>> dict1['one'] = '1'
>>> dict2['2'] = '1'
>>> dict1['one'] = 'wow'
>>> assert(dict1 == dict((v,k) for k,v in dict2.items()))
>>> dict1['one'] = '1'
>>> assert(dict1 == dict((v,k) for k,v in dict2.items()))
>>> dict1.update({'three': '3', 'four': '4'})
>>> assert(dict1 == dict((v,k) for k,v in dict2.items()))
>>> dict3 = OneToOneDict({'4':'four'})
>>> assert(dict3.partner is None)
>>> assert(dict3 == {'4':'four'})
>>> dict1.partner = dict3
>>> assert(dict1.partner is not dict2)
>>> assert(dict2.partner is None)
>>> assert(dict1.partner is dict3)
>>> assert(dict3.partner is dict1)
>>> dict1.setdefault('five', '5')
>>> dict1['five']
'5'
>>> dict1.setdefault('five', '0')
>>> dict1['five']
'5'

When I get some free time, I intend to make a version that doesn't store things twice. No clue when that'll be though :)

Answer by Pete Kirkham

Assuming that you have a key with which you look up a more complex mutable object, just make the key a property of that object. It does seem you might be better off thinking about the data model a bit.

Answer by S.Lott

"We can guarantee that either the key or the value (or both) will be an integer"

That's weirdly written -- "key or the value (or both)" doesn't feel right. Either they're all integers, or they're not all integers.

It sounds like they're all integers.

Or, it sounds like you're thinking of replacing the target object with an integer value so you only have one copy referenced by an integer. This is a false economy. Just keep the target object. All Python objects are -- in effect -- references. Very little actual copying gets done.

Let's pretend that you simply have two integers and can do a lookup on either one of the pair. One way to do this is to use heap queues or the bisect module to maintain ordered lists of integer key-value tuples.

See http://docs.python.org/library/heapq.html#module-heapq

See http://docs.python.org/library/bisect.html#module-bisect

You have one heapq of (key, value) tuples. Or, if your underlying object is more complex, of (key, object) tuples.

You have another heapq of (value, key) tuples. Or, if your underlying object is more complex, of (otherkey, object) tuples.

An "insert" becomes two inserts, one to each heapq-structured list.

A key lookup is in one queue; a value lookup is in the other queue. Do the lookups using bisect(list,item).
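
A rough sketch of this approach using `bisect` only. Note the swap: a heapq-ordered list is not fully sorted, so a binary search over it would not be reliable, and this sketch instead keeps two plain lists sorted with `bisect.insort`. The class name `SortedPairs` and its methods are hypothetical. Inserts are O(n) because list elements have to shift, while lookups are O(log n).

```python
import bisect

class SortedPairs:
    """Hypothetical 1:1 integer mapping kept in two sorted lists of tuples."""

    def __init__(self):
        self._by_key = []     # sorted (key, value) tuples
        self._by_value = []   # sorted (value, key) tuples

    def insert(self, key, value):
        # One logical insert becomes two sorted inserts.
        bisect.insort(self._by_key, (key, value))
        bisect.insort(self._by_value, (value, key))

    @staticmethod
    def _lookup(pairs, first):
        # (first,) sorts before every (first, x), so bisect_left lands on the match.
        i = bisect.bisect_left(pairs, (first,))
        if i < len(pairs) and pairs[i][0] == first:
            return pairs[i][1]
        raise KeyError(first)

    def value_for(self, key):
        return self._lookup(self._by_key, key)

    def key_for(self, value):
        return self._lookup(self._by_value, value)


m = SortedPairs()
for k, v in [(5, 500), (1, 100), (8, 800)]:
    m.insert(k, v)

print(m.value_for(5))   # 500
print(m.key_for(800))   # 8
```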

Answer by David Berger

It so happens that I find myself asking this question all the time (yesterday in particular). I agree with the approach of making two dictionaries. Do some benchmarking to see how much memory it's taking. I've never needed to make it mutable, but here's how I abstract it, if it's of any use:

class BiDict(list):
    def __init__(self, *pairs):
        # Initialise the underlying list with the pairs themselves
        super(BiDict, self).__init__(pairs)
        self._first_access = {}
        self._second_access = {}
        for pair in pairs:
            self._first_access[pair[0]] = pair[1]
            self._second_access[pair[1]] = pair[0]

    def _get_by_first(self,key):
        return self._first_access[key]

    def _get_by_second(self,key):
        return self._second_access[key]

    # You'll have to do some overrides to make it mutable
    # Methods such as append, __add__, __del__, __iadd__
    # to name a few will have to maintain ._*_access

class Constants(BiDict):
    # An implementation expecting an integer and a string
    get_by_name = BiDict._get_by_second
    get_by_number = BiDict._get_by_first

t = Constants(
        ( 1, 'foo'),
        ( 5, 'bar'),
        ( 8, 'baz'),
    )

>>> print(t.get_by_number(5))
bar
>>> print(t.get_by_name('baz'))
8
>>> print(t)
[(1, 'foo'), (5, 'bar'), (8, 'baz')]

Answer by ShawnMilo

How about using sqlite? Just create a :memory: database with a two-column table. You can even add indexes, then query by either one. Wrap it in a class if it's something you're going to use a lot.
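
A minimal sketch of this idea with the standard-library `sqlite3` module. The table and column names (`pairs`, `k`, `v`) and the two helper functions are made up for illustration. Declaring `k` as the primary key and `v` as `UNIQUE` lets SQLite index both columns and enforce the 1:1 constraint. No claim is made here about how its speed compares with two dicts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pairs (k INTEGER PRIMARY KEY, v TEXT UNIQUE NOT NULL)")
conn.executemany("INSERT INTO pairs (k, v) VALUES (?, ?)",
                 [(1, "foo"), (5, "bar"), (8, "baz")])

def value_for(key):
    """Look up the value stored for a key, or None if absent."""
    row = conn.execute("SELECT v FROM pairs WHERE k = ?", (key,)).fetchone()
    return row[0] if row else None

def key_for(value):
    """Look up the key stored for a value, or None if absent."""
    row = conn.execute("SELECT k FROM pairs WHERE v = ?", (value,)).fetchone()
    return row[0] if row else None

print(value_for(5))      # bar
print(key_for("baz"))    # 8
```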
