在 Python 中表示图（数据结构）

Question

提问by shad0w_wa1k3r

How can one neatly represent a graphin Python? (Starting from scratch i.e. no libraries!)
What data structure (e.g. dicts/tuples/dict(tuples)) will be fast but also memory efficient?
One must be able to do various graph operationson it.

As pointed out, the various graph representationsmight help. How does one go about implementing them in Python?

As for the libraries, this questionhas quite good answers.

如何在Python 中巧妙地表示图形？（从头开始，即没有库！）什么数据结构（例如 dicts/tuples/dict(tuples)）既快速又内存高效？必须能够对其进行各种图形操作。正如所指出的，各种图形表示可能会有所帮助。如何在 Python 中实现它们？至于图书馆，这个问题有很好的答案。

Answer 1

采纳答案by mVChr

Even though this is a somewhat old question, I thought I'd give a practical answer for anyone stumbling across this.

尽管这是一个有点老的问题，但我想我会为任何遇到这个问题的人提供一个实用的答案。

Let's say you get your input data for your connections as a list of tuples like so:

假设您将连接的输入数据作为元组列表获取，如下所示：

[('A', 'B'), ('B', 'C'), ('B', 'D'), ('C', 'D'), ('E', 'F'), ('F', 'C')]

The data structure I've found to be most useful and efficient for graphs in Python is a dict of sets. This will be the underlying structure for our Graphclass. You also have to know if these connections are arcs (directed, connect one way) or edges (undirected, connect both ways). We'll handle that by adding a directedparameter to the Graph.__init__method. We'll also add some other helpful methods.

我发现对 Python 中的图形最有用和最有效的数据结构是集合的字典。这将是我们Graph班级的基础结构。您还必须知道这些连接是弧线（有向，单向连接）还是边（无向，双向连接）。我们将通过向方法添加directed参数来处理它Graph.__init__。我们还将添加一些其他有用的方法。

import pprint
from collections import defaultdict


class Graph(object):
    """ Graph data structure, undirected by default. """

    def __init__(self, connections, directed=False):
        self._graph = defaultdict(set)
        self._directed = directed
        self.add_connections(connections)

    def add_connections(self, connections):
        """ Add connections (list of tuple pairs) to graph """

        for node1, node2 in connections:
            self.add(node1, node2)

    def add(self, node1, node2):
        """ Add connection between node1 and node2 """

        self._graph[node1].add(node2)
        if not self._directed:
            self._graph[node2].add(node1)

    def remove(self, node):
        """ Remove all references to node """

        for n, cxns in self._graph.items():  # python3: items(); python2: iteritems()
            try:
                cxns.remove(node)
            except KeyError:
                pass
        try:
            del self._graph[node]
        except KeyError:
            pass

    def is_connected(self, node1, node2):
        """ Is node1 directly connected to node2 """

        return node1 in self._graph and node2 in self._graph[node1]

    def find_path(self, node1, node2, path=[]):
        """ Find any path between node1 and node2 (may not be shortest) """

        path = path + [node1]
        if node1 == node2:
            return path
        if node1 not in self._graph:
            return None
        for node in self._graph[node1]:
            if node not in path:
                new_path = self.find_path(node, node2, path)
                if new_path:
                    return new_path
        return None

    def __str__(self):
        return '{}({})'.format(self.__class__.__name__, dict(self._graph))

I'll leave it as an "exercise for the reader" to create a find_shortest_pathand other methods.

我将把它作为“读者练习”来创建 afind_shortest_path和其他方法。

Let's see this in action though...

不过，让我们看看它的实际效果......

>>> connections = [('A', 'B'), ('B', 'C'), ('B', 'D'),
                   ('C', 'D'), ('E', 'F'), ('F', 'C')]
>>> g = Graph(connections, directed=True)
>>> pretty_print = pprint.PrettyPrinter()
>>> pretty_print.pprint(g._graph)
{'A': {'B'},
 'B': {'D', 'C'},
 'C': {'D'},
 'E': {'F'},
 'F': {'C'}}

>>> g = Graph(connections)  # undirected
>>> pretty_print = pprint.PrettyPrinter()
>>> pretty_print.pprint(g._graph)
{'A': {'B'},
 'B': {'D', 'A', 'C'},
 'C': {'D', 'F', 'B'},
 'D': {'C', 'B'},
 'E': {'F'},
 'F': {'E', 'C'}}

>>> g.add('E', 'D')
>>> pretty_print.pprint(g._graph)
{'A': {'B'},
 'B': {'D', 'A', 'C'},
 'C': {'D', 'F', 'B'},
 'D': {'C', 'E', 'B'},
 'E': {'D', 'F'},
 'F': {'E', 'C'}}

>>> g.remove('A')
>>> pretty_print.pprint(g._graph)
{'B': {'D', 'C'},
 'C': {'D', 'F', 'B'},
 'D': {'C', 'E', 'B'},
 'E': {'D', 'F'},
 'F': {'E', 'C'}}

>>> g.add('G', 'B')
>>> pretty_print.pprint(g._graph)
{'B': {'D', 'G', 'C'},
 'C': {'D', 'F', 'B'},
 'D': {'C', 'E', 'B'},
 'E': {'D', 'F'},
 'F': {'E', 'C'},
 'G': {'B'}}

>>> g.find_path('G', 'E')
['G', 'B', 'D', 'C', 'F', 'E']

Answer 2

回答by jterrace

NetworkXis an awesome Python graph library. You'll be hard pressed to find something you need that it doesn't already do.

NetworkX是一个很棒的 Python 图形库。你会很难找到你需要的东西，但它没有做。

And it's open source so you can see how they implemented their algorithms. You can also add additional algorithms.

而且它是开源的，所以你可以看到他们是如何实现他们的算法的。您还可以添加其他算法。

https://github.com/networkx/networkx/tree/master/networkx/algorithms

Answer 3

回答by pepr

First, the choice of classical listvs. matrixrepresentations depends on the purpose (on what do you want to do with the representation). The well-known problems and algorithms are related to the choice. The choice of the abstract representation kind of dictates how it should be implemented.

首先，经典列表与矩阵表示的选择取决于目的（取决于您想对表示做什么）。众所周知的问题和算法都与选择有关。抽象表示类型的选择决定了它应该如何实现。

Second, the question is whether the vertices and edges should be expressed only in terms of existence, or whether they carry some extra information.

其次，问题是顶点和边是否应该只用存在来表达，或者它们是否携带一些额外的信息。

From Python built-in data types point-of-view, any value contained elsewhere is expressed as a (hidden) reference to the target object. If it is a variable (i.e. named reference), then the name and the reference is always stored in (an internal) dictionary. If you do not need names, then the reference can be stored in your own container -- here probably Python listwill always be used for the listas abstraction.

从 Python 内置数据类型的角度来看，其他地方包含的任何值都表示为对目标对象的（隐藏）引用。如果它是一个变量（即命名引用），则名称和引用始终存储在（内部）字典中。如果您不需要名称，则可以将引用存储在您自己的容器中——这里可能Python 列表将始终用于列表作为抽象。

Python list is implemented as a dynamic array of references, Python tuple is implemented as static array of references with constant content (the value of references cannot be changed). Because of that they can be easily indexed. This way, the list can be used also for implementation of matrices.

Python list 实现为动态引用数组，Python tuple 实现为具有常量内容的静态引用数组（引用的值不能更改）。因此，它们可以很容易地被索引。这样，列表也可以用于矩阵的实现。

Another way to represent matrices are the arrays implemented by the standard module array-- more constrained with respect to the stored type, homogeneous value. The elements store the value directly. (The list stores the references to the value objects instead). This way, it is more memory efficient and also the access to the value is faster.

表示矩阵的另一种方法是由标准模块实现的数组array——在存储类型、同构值方面受到更多限制。元素直接存储值。（该列表存储对值对象的引用）。这样，它的内存效率更高，并且对值的访问也更快。

Sometimes, you may find useful even more restricted representation like bytearray.

有时，您可能会发现有用的更受限制的表示，例如bytearray。

Answer 4

回答by Vineet Jain

There are two excellent graph libraries NetworkXand igraph. You can find both library source codes on GitHub. You can always see how the functions are written. But I prefer NetworkX because its easy to understand.
See their codes to know how they make the functions. You will get multiple ideas and then can choose how you want to make a graph using data structures.

有两个优秀的图形库 NetworkX和igraph。您可以在 GitHub 上找到这两个库源代码。您始终可以看到函数是如何编写的。但我更喜欢 NetworkX，因为它易于理解。
查看他们的代码以了解他们如何制作这些功能。您将获得多个想法，然后可以选择如何使用数据结构制作图形。

在 Python 中表示图（数据结构）

提问by shad0w_wa1k3r

采纳答案by mVChr

回答by jterrace

回答by pepr

回答by Vineet Jain

相关推荐

最近更新

标签

在 Python 中表示图（数据结构）

提问by shad0w_wa1k3r

采纳答案by mVChr

回答by jterrace

回答by pepr

回答by Vineet Jain

相关推荐

如何使用python打开受密码保护的excel文件？

Python 通过在两个现有列上使用 lambda 函数在 Panda 中创建一个新列

Python Matplotlib 散点标记大小

Python 如何将新列添加到 Spark DataFrame（使用 PySpark）？

相关推荐

最近更新

标签