Python: How do I count all the elements in a nested dictionary?

Question

Asked by jamieb

How do I count the number of subelements in a nested dictionary in the most efficient manner possible? The len() function doesn't work as I initially expected it to:

>>> food_colors = {'fruit': {'orange': 'orange', 'apple': 'red', 'banana': 'yellow'}, 'vegetables': {'lettuce': 'green', 'beet': 'red', 'pumpkin': 'orange'}}
>>> len(food_colors)
2
>>>

What if I actually want to count the number of subelements? (e.g., expected result to be "6") Is there a better way to do this rather than looping through each element and summing the qty of subelements? In this particular application, I have about five million subelements to count and every clock cycle counts.

Answer 1

Accepted answer by zwol

Is it guaranteed that each top-level key has a dictionary as its value, and that no second-level key has a dictionary? If so, this will go as fast as you can hope for:

sum(len(v) for v in food_colors.itervalues())  # Python 2; on Python 3 use food_colors.values()

If the data structure is more complicated, it will need more code, of course. I'm not aware of any intrinsics to do deep data structure walks.
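
As a rough illustration of the "more code" a more complicated structure might need, here is a minimal sketch that walks nested dicts, lists, tuples, and sets and counts every non-container value as one element. The name count_leaves is hypothetical, and treating strings as single leaves is an assumption; this is not something the answer itself proposes.

```python
def count_leaves(obj):
    """Count non-container values reachable from obj (hypothetical helper)."""
    if isinstance(obj, dict):
        # Only the values are walked; keys are not counted as elements.
        return sum(count_leaves(v) for v in obj.values())
    if isinstance(obj, (list, tuple, set)):
        return sum(count_leaves(v) for v in obj)
    # Anything else (numbers, strings, None, ...) counts as one leaf.
    return 1


food_colors = {
    'fruit': {'orange': 'orange', 'apple': 'red', 'banana': 'yellow'},
    'vegetables': {'lettuce': 'green', 'beet': 'red', 'pumpkin': 'orange'},
}
print(count_leaves(food_colors))
```

For the flat two-level case in the question, the plain sum(len(v) ...) expression does less work per element and is likely the better choice.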

Answer 2

Answered by threenplusone

sum(len(x) for x in food_colors.values())

Answer 3

Answered by carl

Do you only want the immediate children? If so, this is probably the best:

sum(len(x) for x in food_colors.values())

Answer 4

Answered by eichin

The subelements are distinct objects, so there is no other relationship to exploit that would be fundamentally faster than iterating over them. There are many ways to do that (using map or .values(), for example), and they vary enough in performance that you'll probably want to use timeit to compare them.
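
A comparison along those lines might look like the sketch below. The test data (1,000 groups of 100 subelements) and the function names are made up for illustration, and the timings will differ by machine and Python version, so no result is claimed here.

```python
import timeit

# Hypothetical test data: 1,000 top-level keys, each holding 100 subelements.
data = {i: {j: j for j in range(100)} for i in range(1000)}


def count_generator(d):
    # Generator expression over the inner dictionaries.
    return sum(len(v) for v in d.values())


def count_map(d):
    # map() applies len to each inner dictionary.
    return sum(map(len, d.values()))


for fn in (count_generator, count_map):
    seconds = timeit.timeit(lambda: fn(data), number=20)
    print(fn.__name__, seconds)
```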

If counting them is important to your application, consider doing some things to make that easier:

Count them as you build the data structure.
Instead of nested dicts, consider an in-memory sqlite table, using connect(":memory:") (this might slow down other operations, or make them more complex, but the trade-off is worth considering).
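
Both suggestions can be sketched roughly as follows. The CountedStore class and the food table layout (category, name, color) are assumptions made for this example, not part of the original answer; only the sqlite3 standard-library calls are real.

```python
import sqlite3

food_colors = {
    'fruit': {'orange': 'orange', 'apple': 'red', 'banana': 'yellow'},
    'vegetables': {'lettuce': 'green', 'beet': 'red', 'pumpkin': 'orange'},
}


class CountedStore:
    """Hypothetical wrapper that keeps a running total as items are added."""

    def __init__(self):
        self.groups = {}
        self.total = 0

    def add(self, group, key, value):
        bucket = self.groups.setdefault(group, {})
        if key not in bucket:  # only new keys change the count
            self.total += 1
        bucket[key] = value


# Option 1: count while building, so the total is available without a scan.
store = CountedStore()
for group, items in food_colors.items():
    for name, color in items.items():
        store.add(group, name, color)
print(store.total)

# Option 2: an in-memory sqlite table (assumed schema) queried with COUNT(*).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE food (category TEXT, name TEXT, color TEXT)")
conn.executemany(
    "INSERT INTO food VALUES (?, ?, ?)",
    [(group, name, color)
     for group, items in food_colors.items()
     for name, color in items.items()],
)
sql_total = conn.execute("SELECT COUNT(*) FROM food").fetchone()[0]
per_category = dict(
    conn.execute("SELECT category, COUNT(*) FROM food GROUP BY category")
)
print(sql_total, per_category)
```

The running counter in this sketch does not handle deletions; a real version would need to decrement the total when a key is removed.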

Answer 5

Answered by bobzsj87

c = sum(len(i) for i in food_colors.values())

Answer 6

Answered by dawg

For your specific question, you can just use this:

>>> d={'fruit': 
         {'orange': 'orange', 'apple': 'red', 'banana': 'yellow'}, 
       'vegetables': 
         {'lettuce': 'green', 'beet': 'red', 'pumpkin': 'orange'}}
>>> len(d)
2            # that is 1 reference for 'fruit' and 1 for 'vegetables'
>>> len(d['fruit'])
3            # 3 fruits listed...
>>> len(d['vegetables'])
3            # you thought of three of those...
>>> len(d['fruit'])+len(d['vegetables'])
6

While you can use the various tools that Python has to count the elements in this trivial dictionary, the more interesting and productive thing is to think about the structure of the data in the first place.

The basic data structures of Python are lists, sets, tuples, and dictionaries. Any of these data structures can 'hold', by reference, any nested version of itself or the other data structures.

This list is a nested list:

>>> l = [1, [2, 3, [4]], [5, 6]]
>>> len(l)
3
>>> l[0]
1
>>> l[1]
[2, 3, [4]]
>>> l[2]
[5, 6]

The first element is the integer 1. Elements 1 and 2 are lists themselves. The same can be true of any of the other basic Python data structures. These are recursive data structures. You can print them with pprint.
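
As a small illustration of pprint on the nested list above, the sketch below pretty-prints it and also uses the depth argument of pprint.pformat to collapse everything below the top level. The variable names are chosen for this example.

```python
from pprint import pformat, pprint

nested = [1, [2, 3, [4]], [5, 6]]

# Pretty-print the whole structure; a narrow width forces wrapping.
pprint(nested, width=12)

# depth=1 replaces every nested container with a placeholder.
summary = pformat(nested, depth=1)
print(summary)
```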

If you organize your dictionary a bit better, it is easier to extract information from it with Python's simplest tools:

>>> color='color'
>>> family='family'
>>> sensation='sensation'
>>> good_things={   
            'fruit': 
            {
                'orange': 
                    {
                    color: 'orange', 
                    family: 'citrus',
                    sensation: 'juicy'
                    }, 
                'apple': 
                    {
                    color: ['red','green','yellow'], 
                    family:'Rosaceae',
                    sensation: 'woody'
                    },
                'banana': 
                    {
                    color: ['yellow', 'green'],
                    family: 'musa',
                    sensation: 'sweet'
                    }
            },
            'vegetables':
            {
                'beets': 
                    {
                    color: ['red', 'yellow'],
                    family: 'Chenopodiaceae',
                    sensation: 'sweet'
                    },
                'broccoli':
                    {
                    color: 'green',
                    family: 'kale',
                    sensation: 'The butter you put on it',
                    }
            }
        }

Now the queries against that data make more sense:

>>> len(good_things)
2                        # 2 groups: fruits and vegetables
>>> len(good_things['fruit'])
3                        # three fruits cataloged
>>> len(good_things['vegetables'])
2                        # I can only think of two vegetables...
>>> print good_things['fruit']['apple']
{'color': ['red', 'green', 'yellow'], 'sensation': 'woody', 'family': 'Rosaceae'}
>>> len(good_things['fruit']['apple']['color'])
3                        # apples have 3 colors
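
Beyond single len() calls, the same structure lends itself to short aggregate queries. The sketch below uses a trimmed copy of the dictionary (the family field is omitted for brevity) and written for Python 3; the variable names are illustrative.

```python
good_things = {
    'fruit': {
        'orange': {'color': 'orange', 'sensation': 'juicy'},
        'apple': {'color': ['red', 'green', 'yellow'], 'sensation': 'woody'},
        'banana': {'color': ['yellow', 'green'], 'sensation': 'sweet'},
    },
    'vegetables': {
        'beets': {'color': ['red', 'yellow'], 'sensation': 'sweet'},
        'broccoli': {'color': 'green', 'sensation': 'The butter you put on it'},
    },
}

# Number of cataloged items in each group, and overall.
items_per_group = {group: len(items) for group, items in good_things.items()}
total_items = sum(items_per_group.values())

# Names of every item, in any group, whose sensation is 'sweet'.
sweet = sorted(
    name
    for items in good_things.values()
    for name, attrs in items.items()
    if attrs['sensation'] == 'sweet'
)

print(items_per_group, total_items, sweet)
```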

Answer 7

Answered by Tim McDonald

You could do this with a recursive function.

>>> x
{'a': 1, 'b': 2, 'c': 3, 'd': {'I': 1, 'II': 2, 'III': 3}, 'e': 5}
>>> def test(d):
...   cnt = 0
...   for e in d:
...     if type(d[e]) is dict:
...       cnt += test(d[e])
...     else:
...       cnt += 1
...   return cnt
...
>>> test(x)
7
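
One detail worth checking with a function like this: type(value) is dict matches only plain dicts, so dict subclasses such as collections.OrderedDict are counted as a single element. The sketch below contrasts that with an isinstance check; the function names count_exact and count_isinstance are made up for the comparison.

```python
from collections import OrderedDict


def count_exact(d):
    """Recurse only into values whose type is exactly dict."""
    cnt = 0
    for key in d:
        if type(d[key]) is dict:
            cnt += count_exact(d[key])
        else:
            cnt += 1
    return cnt


def count_isinstance(d):
    """Recurse into any value that is a dict or a dict subclass."""
    cnt = 0
    for value in d.values():
        if isinstance(value, dict):
            cnt += count_isinstance(value)
        else:
            cnt += 1
    return cnt


x = {'a': 1, 'd': OrderedDict([('I', 1), ('II', 2)])}
print(count_exact(x))       # the OrderedDict is treated as one leaf
print(count_isinstance(x))  # the OrderedDict's entries are counted
```

Which behavior is wanted depends on the data; if only plain dicts ever appear, the two functions agree.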

Answer 8

Answered by Dodgie

For arbitrary depth nested dictionaries:

def num_elements(x):
  if isinstance(x, dict):
    return sum(num_elements(_x) for _x in x.values())
  else:
    return 1

Answer 9

Answered by manecosta

Arbitrary depth, one liner:

def count(d):
    return sum([count(v) if isinstance(v, dict) else 1 for v in d.values()])
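
If the nesting could ever be deeper than Python's recursion limit (about 1,000 frames by default in CPython), a recursive counter would raise RecursionError. A possible alternative, sketched here with an explicit stack, avoids recursion entirely; count_iterative is a hypothetical name, not part of the answer above.

```python
def count_iterative(d):
    """Count non-dict values in a nested dict using an explicit stack."""
    total = 0
    stack = [d]
    while stack:
        current = stack.pop()
        for value in current.values():
            if isinstance(value, dict):
                stack.append(value)  # visit this sub-dictionary later
            else:
                total += 1
    return total


x = {'a': 1, 'b': 2, 'c': 3, 'd': {'I': 1, 'II': 2, 'III': 3}, 'e': 5}
print(count_iterative(x))

# A chain nested 5,000 levels deep, with one leaf per level.
deep = {'leaf': 0}
for _ in range(5000):
    deep = {'child': deep, 'leaf': 0}
print(count_iterative(deep))
```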
