C++ 中图形问题的邻接表或邻接矩阵哪个更好?
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/2218322/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
What is better, adjacency lists or adjacency matrices for graph problems in C++?
提问by magiix
What is better, adjacency lists or adjacency matrix, for graph problems in C++? What are the advantages and disadvantages of each?
对于 C++ 中的图形问题,邻接表或邻接矩阵哪个更好?各自的优缺点是什么?
回答by Mark Byers
It depends on the problem.
这取决于问题。
- Uses O(n^2) memory
- It is fast to lookup and check for presence or absence of a specific edge
between any two nodes O(1) - It is slow to iterate over all edges
- It is slow to add/delete a node; a complex operation O(n^2)
- It is fast to add a new edge O(1)
- 使用 O(n^2) 内存
- 快速查找和检查
任意两个节点之间是否存在特定边O(1) - 迭代所有边很慢
- 添加/删除节点很慢;复杂的操作 O(n^2)
- 添加新边 O(1) 很快
- Memory usage depends on the number of edges (not number of nodes),
which might save a lot of memory if the adjacency matrix is sparse - Finding the presence or absence of specific edge between any two nodes
is slightly slower than with the matrix O(k); where k is the number of neighbors nodes - It is fast to iterate over all edges because you can access any node neighbors directly
- It is fast to add/delete a node; easier than the matrix representation
- It fast to add a new edge O(1)
- 内存使用取决于边数(而不是节点数),
如果邻接矩阵稀疏,这可能会节省大量内存 - 查找任意两个节点之间特定边的存在或不存在
比使用矩阵 O(k) 稍慢;其中 k 是邻居节点的数量 - 迭代所有边很快,因为您可以直接访问任何节点邻居
- 添加/删除节点很快;比矩阵表示更容易
- 添加新边 O(1) 的速度很快
回答by keyser
This answer is not just for C++ since everything mentioned is about the data structures themselves, regardless of language. And, my answer is assuming that you know the basic structure of adjacency lists and matrices.
这个答案不仅适用于 C++,因为所提到的一切都与数据结构本身有关,而与语言无关。而且,我的回答是假设您知道邻接表和矩阵的基本结构。
Memory
记忆
If memory is your primary concern you can follow this formula for a simple graph that allows loops:
如果内存是您最关心的问题,您可以按照以下公式绘制一个允许循环的简单图形:
An adjacency matrix occupies n2/8 byte space (one bit per entry).
邻接矩阵占用 n 2/8 字节空间(每个条目一位)。
An adjacency list occupies 8e space, where e is the number of edges (32bit computer).
一个邻接表占用8e空间,其中e是边数(32位计算机)。
If we define the density of the graph as d = e/n2(number of edges divided by the maximum number of edges), we can find the "breakpoint" where a list takes up more memory than a matrix:
如果我们将图的密度定义为 d = e/n 2(边数除以最大边数),我们可以找到列表比矩阵占用更多内存的“断点”:
8e > n2/8when d > 1/64
8e > n 2/8当 d > 1/64
So with these numbers (still 32-bit specific) the breakpoint lands at 1/64. If the density (e/n2) is bigger than 1/64, then a matrixis preferable if you want to save memory.
因此,对于这些数字(仍然是 32 位特定的),断点位于1/64。如果密度 (e/n 2) 大于 1/64,那么如果您想节省内存,最好使用矩阵。
You can read about this at wikipedia(article on adjacency matrices) and a lot of other sites.
您可以在维基百科(关于邻接矩阵的文章)和许多其他网站上阅读相关内容。
Side note: One can improve the space-efficiency of the adjacency matrix by using a hash table where the keys are pairs of vertices (undirected only).
旁注:可以通过使用哈希表来提高邻接矩阵的空间效率,其中键是顶点对(仅限无向)。
Iteration and lookup
迭代和查找
Adjacency lists are a compact way of representing only existing edges. However, this comes at the cost of possibly slow lookup of specific edges. Since each list is as long as the degree of a vertex the worst case lookup time of checking for a specific edge can become O(n), if the list is unordered. However, looking up the neighbours of a vertex becomes trivial, and for a sparse or small graph the cost of iterating through the adjacency lists might be negligible.
邻接表是一种仅表示现有边的紧凑方式。然而,这是以可能缓慢查找特定边为代价的。由于每个列表与顶点的度数一样长,因此如果列表是无序的,则检查特定边的最坏情况查找时间可能变为 O(n)。然而,查找顶点的邻居变得微不足道,对于稀疏或小图,遍历邻接列表的成本可能可以忽略不计。
Adjacency matrices on the other hand use more space in order to provide constant lookup time. Since every possible entry exists you can check for the existence of an edge in constant time using indexes. However, neighbour lookup takes O(n) since you need to check all possible neighbours. The obvious space drawback is that for sparse graphs a lot of padding is added. See the memory discussion above for more information on this.
另一方面,邻接矩阵使用更多空间以提供恒定的查找时间。由于每个可能的条目都存在,您可以使用索引在恒定时间内检查边缘的存在。但是,邻居查找需要 O(n),因为您需要检查所有可能的邻居。明显的空间缺点是对于稀疏图,添加了大量填充。有关更多信息,请参阅上面的内存讨论。
If you're still unsure what to use: Most real-world problems produce sparse and/or large graphs, which are better suited for adjacency list representations. They might seem harder to implement but I assure you they aren't, and when you write a BFS or DFS and want to fetch all neighbours of a node they're just one line of code away. However, note that I'm not promoting adjacency lists in general.
如果您仍然不确定使用什么:大多数现实世界的问题都会产生稀疏和/或大图,它们更适合于邻接列表表示。它们可能看起来更难实现,但我向你保证它们不是,当你编写 BFS 或 DFS 并想要获取节点的所有邻居时,它们只是一行代码。但是,请注意,我一般不会推广邻接列表。
回答by John Red
Okay, I've compiled the Time and Space complexities of basic operations on graphs.
The image below should be self-explanatory.
Notice how Adjacency Matrix is preferable when we expect the graph to be dense, and how Adjacency List is preferable when we expect the graph to be sparse.
I've made some assumptions. Ask me if a complexity (Time or Space) needs clarification. (For example, For a sparse graph, I've taken En to be a small constant, as I've assumed that addition of a new vertex will add only a few edges, because we expect the graph to remain sparse even after adding that vertex.)
好的,我已经编译了图上基本操作的时间和空间复杂性。
下面的图片应该是不言自明的。
请注意,当我们期望图形是密集的时,邻接矩阵是如何更可取的,而当我们期望图形是稀疏的时,邻接列表是如何更可取的。
我做了一些假设。问我是否需要澄清复杂性(时间或空间)。(例如,对于稀疏图,我将 En 作为一个小常数,因为我假设添加一个新顶点只会添加几条边,因为我们希望即使在添加之后,图仍会保持稀疏顶点。)
Please tell me if there are any mistakes.
请告诉我是否有任何错误。
回答by Alex Ntousias
It depends on what you're looking for.
这取决于你在寻找什么。
With adjacency matricesyou can answer fast to questions regarding if a specific edge between two vertices belongs to the graph, and you can also have quick insertions and deletions of edges. The downsideis that you have to use excessive space, especially for graphs with many vertices, which is very inefficient especially if your graph is sparse.
使用邻接矩阵,您可以快速回答有关两个顶点之间的特定边是否属于该图的问题,还可以快速插入和删除边。的缺点是,你必须使用过多的空间,尤其是对于有许多顶点,这是非常低效的,特别是如果你的图是稀疏图。
On the other hand, with adjacency listsit is harder to check whether a given edge is in a graph, because you have to search through the appropriate list to find the edge, but they are more space efficient.
另一方面,使用邻接列表更难检查给定的边是否在图中,因为您必须搜索适当的列表才能找到边,但它们的空间效率更高。
Generally though, adjacency lists are the right data structure for most applications of graphs.
不过一般来说,邻接表是大多数图应用的正确数据结构。
回答by Muhammed Kadir
Lets assume we have a graph which has nnumber of nodes and mnumber of edges,
假设我们有一个图,它有n个节点和m个边,
Adjacency Matrix:We are creating a matrix that has nnumber of rows and columns so in memory it will take space that is proportional to n2. Checking if two nodes named as uand vhas an edge between them will take Θ(1) time. For example checking for (1, 2) is an edge will look like as follows in the code:
邻接矩阵:我们正在创建一个具有n行和列数的矩阵,因此在内存中它将占用与 n 2成比例的空间。检查名为u和v 的两个节点之间是否有边将花费 Θ(1) 时间。例如,在代码中检查 (1, 2) 是一条边将如下所示:
if(matrix[1][2] == 1)
If you want to identify all edges, you have to iterate over matrix at this will require two nested loops and it will take Θ(n2). (You may just use the upper triangular part of the matrix to determine all edges but it will be again Θ(n2))
如果要识别所有边,则必须迭代矩阵,这将需要两个嵌套循环,并且需要 Θ(n 2)。(您可以只使用矩阵的上三角部分来确定所有边,但它又是 Θ(n 2))
Adjacency List:We are creating a list that each node also points to another list. Your list will have nelements and each element will point to a list that has number of items that is equal to number of neighbors of this node (look image for better visualization). So it will take space in memory that is proportional to n+m. Checking if (u, v) is an edge will take O(deg(u)) time in which deg(u) equals number of neighbors of u. Because at most, you have to iterate over the list that is pointed by the u. Identifying all edges will take Θ(n+m).
邻接列表:我们正在创建一个列表,每个节点也指向另一个列表。您的列表将有n 个元素,每个元素将指向一个列表,该列表的项目数等于该节点的邻居数(查看图像以获得更好的可视化)。因此,它将占用与n+m成正比的内存空间。检查 (u, v) 是否是一条边将花费 O(deg(u)) 时间,其中 deg(u) 等于 u 的邻居数。因为至多,您必须遍历 u 指向的列表。识别所有边将花费 Θ(n+m)。
Adjacency list of example graph
示例图的邻接表
You should make your choice according to your needs.
Because of my reputation I couldn't put image of matrix, sorry for that
回答by Binary Nerd
If you are looking at graph analysis in C++ probably the first place to start would be the boost graph library, which implements a number of algorithms including BFS.
如果您正在查看 C++ 中的图分析,那么首先要开始的可能是boost 图库,它实现了包括 BFS 在内的许多算法。
EDIT
编辑
This previous question on SO will probably help:
上一个关于 SO 的问题可能会有所帮助:
how-to-create-a-c-boost-undirected-graph-and-traverse-it-in-depth-first-search
如何创建 ac-boost-undirected-graph-and-traverse-it-in-depth-first-searchh
回答by Evgeni Sergeev
This is best answered with examples.
这最好通过示例来回答。
Think of Floyd-Warshallfor example. We have to use an adjacency matrix, or the algorithm will be asymptotically slower.
以Floyd-Warshall为例。我们必须使用邻接矩阵,否则算法会逐渐变慢。
Or what if it's a dense graph on 30,000 vertices? Then an adjacency matrix might make sense, as you'll be storing 1 bit per pair of vertices, rather than the 16 bits per edge (the minimum that you would need for an adjacency list): that's 107 MB, rather than 1.7 GB.
或者如果它是 30,000 个顶点上的密集图呢?那么邻接矩阵可能有意义,因为您将存储每对顶点 1 位,而不是每条边 16 位(邻接列表所需的最小值):即 107 MB,而不是 1.7 GB。
But for algorithms like DFS, BFS (and those that use it, like Edmonds-Karp), Priority-first search (Dijkstra, Prim, A*), etc., an adjacency list is as good as a matrix. Well, a matrix might have a slight edge when the graph is dense, but only by an unremarkable constant factor. (How much? It's a matter of experimenting.)
但对于 DFS、BFS(以及使用它的算法,如 Edmonds-Karp)、优先级优先搜索(Dijkstra、Prim、A*)等算法,邻接表与矩阵一样好。好吧,当图形密集时,矩阵可能有轻微的边缘,但只有一个不起眼的常数因子。(多少钱?这是一个实验的问题。)
回答by deceleratedcaviar
To add to keyser5053's answer about memory usage.
添加到keyser5053关于内存使用的答案。
For any directed graph, an adjacency matrix (at 1 bit per edge) consumes n^2 * (1)
bits of memory.
对于任何有向图,邻接矩阵(每条边 1 位)消耗n^2 * (1)
内存位。
For a complete graph, an adjacency list (with 64 bit pointers) consumes n * (n * 64)
bits of memory, excluding list overhead.
对于完整图,邻接列表(具有 64 位指针)消耗n * (n * 64)
内存位,不包括列表开销。
For an incomplete graph, an adjacency list consumes 0
bits of memory, excluding list overhead.
对于不完整的图,邻接列表消耗0
内存位,不包括列表开销。
For an adjacency list, you can use the follow formula to determine the maximum number of edges (e
) before an adjacency matrix is optimal for memory.
对于邻接表,您可以使用以下公式来确定e
邻接矩阵对内存优化之前的最大边数 ( )。
edges = n^2 / s
to determine the maximum number of edges, where s
is the pointer size of the platform.
edges = n^2 / s
确定最大边数,其中s
是平台的指针大小。
If you're graph is dynamically updating, you can maintain this efficiency with an average edge count (per node) of n / s
.
如果您的图形是动态更新的,您可以通过平均边数(每个节点)保持这种效率n / s
。
Some examples (with 64 bit pointers).
一些示例(使用 64 位指针)。
For a directed graph, where n
is 300, the optimal number of edges per node using an adjacency list is:
对于有向图,其中n
是 300,使用邻接表的每个节点的最佳边数是:
= 300 / 64
= 4
If we plug this into keyser5053's formula, d = e / n^2
(where e
is the total edge count), we can see we are below the break point (1 / s
):
如果我们将其代入 keyser5053 的公式d = e / n^2
(其中e
是总边数),我们可以看到我们低于断点 ( 1 / s
):
d = (4 * 300) / (300 * 300)
d < 1/64
aka 0.0133 < 0.0156
However, 64 bits for a pointer can be overkill. If you instead use 16bit integers as pointer offsets, we can fit up to 18 edges before breaking point.
但是,指针的 64 位可能有点过头了。如果您改为使用 16 位整数作为指针偏移量,我们可以在断点前容纳多达 18 条边。
= 300 / 16
= 18
d = ((18 * 300) / (300^2))
d < 1/16
aka 0.06 < 0.0625
Each of these examples ignore the overhead of the adjacency lists themselves (64*2
for a vector and 64 bit pointers).
这些示例中的每一个都忽略了邻接列表本身的开销(64*2
对于向量和 64 位指针)。
回答by ChrisOdney
Depending on the Adjacency Matrix implementation the 'n' of the graph should be known earlier for an efficient implementation. If the graph is too dynamic and requires expansion of the matrix every now and then that can also be counted as a downside?
根据邻接矩阵的实现,为了有效的实现,应该更早地知道图的“n”。如果图形过于动态并且需要时不时地扩展矩阵,那也可以算作缺点吗?
回答by max
If you use a hash table instead of either adjacency matrix or list, you'll get better or same big-O run-time and space for all operations (checking for an edge is O(1)
, getting all adjacent edges is O(degree)
, etc.).
如果您使用哈希表而不是邻接矩阵或列表,您将获得更好的或相同的大 O 运行时间和所有操作的空间(检查边 is O(1)
,获取所有相邻边 isO(degree)
等)。
There's some constant factor overhead though for both run-time and space (hash table isn't as fast as linked list or array lookup, and takes a decent amount extra space to reduce collisions).
但是对于运行时间和空间都有一些恒定的因素开销(哈希表不如链表或数组查找快,并且需要相当多的额外空间来减少冲突)。