Note: this page is based on a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/1705824/

Time: 2020-11-03 22:54:31  Source: igfitidea

Finding cycles of 3 nodes (or triangles) in a graph

Tags: python, graph, geometry, cycle

Asked by zapa

I am working with complex networks. I want to find the groups of nodes that form a cycle of 3 nodes (a triangle) in a given graph. As my graph contains about a million edges, a simple iterative solution (nested "for" loops) is not very efficient.

I am using Python; if there are any built-in modules for handling this kind of problem, please let me know.

If anyone knows of an algorithm for finding triangles in graphs, please reply.

Answered by Ajay JM

Assuming it's an undirected graph, the answer lies in Python's networkx library. If you just need to count triangles, use:

import networkx as nx
tri=nx.triangles(g)

But if you need the list of the triangles themselves (the triadic relationships), use:

all_cliques= nx.enumerate_all_cliques(g)

This will give you all cliques of every size (k = 1, 2, 3, ... up to the largest clique in the graph, which has at most max degree + 1 nodes).

So, to keep just the triangles, i.e. k=3:

triad_cliques=[x for x in all_cliques if len(x)==3 ]

triad_cliques is then a list containing only the triangles, each given as a list of its three nodes.

Answered by wisty

A million edges is quite small. Unless you are doing it thousands of times, just use a naive implementation.

I'll assume that you have a dictionary keyed by node id, where each value is a sequence of that node's neighbors, and that the graph is directed.
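
If your graph starts out as a list of directed edges rather than as such a dictionary, a minimal sketch of the conversion might look like this. The name build_successors and the sample edges are made up for illustration; for a very large graph you would probably collect the targets in sets instead of scanning lists.

```python
from collections import defaultdict

def build_successors(edges):
    """Map each node id to a tuple of the nodes it points to (directed graph)."""
    successors = defaultdict(list)
    for source, target in edges:
        if target not in successors[source]:
            successors[source].append(target)
        successors[target]  # touching the key registers nodes without outgoing edges
    return {node: tuple(targets) for node, targets in successors.items()}

edges = [(0, 1), (0, 2), (2, 1)]  # hypothetical example edges
nodes = build_successors(edges)
print(nodes)  # {0: (1, 2), 1: (), 2: (1,)}
```

Every node ends up as a key, including nodes that only ever appear as a target, so looking up the neighbors of any node never raises a KeyError.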

For example:

nodes = {}
nodes[0] = (1, 2)
nodes[1] = tuple() # empty tuple
nodes[2] = (1,) # one-element tuple; a bare 1 would be an int and could not be iterated

My solution:

def generate_triangles(nodes):
    """Generate triangles. Weed out duplicates."""
    visited_ids = set() # remember the nodes that we have tested already
    for node_a_id in nodes:
        for node_b_id in nodes[node_a_id]:
            if node_b_id == node_a_id:
                raise ValueError # nodes shouldn't point to themselves
            if node_b_id in visited_ids:
                continue # we should have already found b->a->??->b
            for node_c_id in nodes[node_b_id]:
                if node_c_id in visited_ids:
                    continue # we should have already found c->a->b->c
                if node_a_id in nodes[node_c_id]:
                    yield(node_a_id, node_b_id, node_c_id)
        visited_ids.add(node_a_id) # don't search a - we already have all those cycles

Checking performance:

from random import randint
n = 1000000
node_list = range(n)
nodes = {}
for node_id in node_list:
    node = tuple()
    for i in range(randint(0, 10)): # add up to 10 neighbors
        try:
            # pick a nearby node (negative indices wrap around to the end of the list)
            neighbor_id = node_list[node_id + randint(-5, 5)]
        except IndexError: # ran past the end of the node list
            continue
        if neighbor_id == node_id:
            continue # skip self-loops, which generate_triangles rejects
        if neighbor_id not in node:
            node = node + (neighbor_id,)
    nodes[node_id] = node

cycles = list(generate_triangles(nodes))
print(len(cycles))

When I tried it, it took longer to build the random graph than to count the cycles.

You might want to test it though ;) I won't guarantee that it's correct.

You could also look into networkx, which is the major Python graph library.

Answered by Ash

A pretty easy and clear way to do this is to use NetworkX:

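With NetworkX you can get the cycles of an undirected graph with nx.cycle_basis(G) and then select the ones with 3 nodes. Keep in mind that a cycle basis is a minimal set of cycles from which all others can be built, not a list of every cycle, so this selection is not guaranteed to contain every triangle: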

cycls_3 = [c for c in nx.cycle_basis(G) if len(c)==3]

Or you can find all the cliques with find_cliques(G) and then select the ones you want (with 3 nodes). Cliques are parts of the graph where all the nodes are connected to each other, which is exactly the case for a cycle of 3 nodes. Note that find_cliques returns maximal cliques only, so a triangle that is part of a larger clique will not appear as its own 3-node entry.

Answered by J S

I don't want to sound harsh, but have you tried to Google it? The first link is a pretty quick algorithm to do that: http://www.mail-archive.com/[email protected]/msg05642.html

And then there is this article on ACM (which you may have access to): http://portal.acm.org/citation.cfm?id=244866 (and if you don't have access, I am sure that if you kindly ask the lady who wrote it, you will get a copy.)

Also, I can imagine a triangle enumeration method based on clique-decomposition, but I don't know if it was described somewhere.

Answered by Miss Palmer

I'm surprised to see no mention of the NetworkX triangles function. I know it doesn't return the groups of nodes that form a triangle, but it should be pretty relevant to many who find themselves on this page.

nx.triangles(G) # dict mapping each node to the number of triangles it is part of
sum(nx.triangles(G).values()) // 3 # total number of triangles (each one is counted once per corner)

An alternative way to return the groups of nodes would be something like:

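# assumes numpy is imported as np and adj_m is the graph's SciPy sparse adjacency
# matrix (for example from nx.adjacency_matrix(G)), with nodes labelled 0..n-1
for u, v, d in G.edges(data=True):
    u_array = adj_m.getrow(u).nonzero()[1] # column indices = nodes adjacent to u
    v_array = adj_m.getrow(v).nonzero()[1] # nodes adjacent to v
    # the intersection of the two arrays holds the third node of each triangle on edge (u, v)
    third_nodes = np.intersect1d(v_array, u_array)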
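
The division by 3 in the triangle total comes from each triangle being counted once at each of its three corners. A dependency-free sketch of that counting, assuming an undirected graph without self-loops stored as a dictionary of neighbor sets (the name triangles_per_node and the sample graph are hypothetical, and this is not how NetworkX implements it):

```python
from itertools import combinations

def triangles_per_node(adj):
    """For every node, count the pairs of its neighbors that are themselves adjacent."""
    return {
        node: sum(1 for b, c in combinations(neighbors, 2) if c in adj[b])
        for node, neighbors in adj.items()
    }

# hypothetical graph with the triangles {0, 1, 2} and {0, 1, 3}
adj = {0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1}, 3: {0, 1}}
per_node = triangles_per_node(adj)
total = sum(per_node.values()) // 3  # every triangle was counted at 3 corners
print(per_node)  # {0: 2, 1: 2, 2: 1, 3: 1}
print(total)     # 2
```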

Answered by Alex Huong Tran

I am working on the same problem of counting the number of triangles in an undirected graph, and wisty's solution works really well in my case. I have modified it a bit so that only undirected triangles are counted.

    #### function for counting undirected cycles
    def generate_triangles(nodes):
        visited_ids = set() # mark visited node
        for node_a_id in nodes:
            temp_visited = set() # to get undirected triangles
            for node_b_id in nodes[node_a_id]:
                if node_b_id == node_a_id:
                    raise ValueError # to prevent self-loops, if your graph allows self-loops then you don't need this condition
                if node_b_id in visited_ids:
                    continue
                for node_c_id in nodes[node_b_id]:
                    if node_c_id in visited_ids:
                        continue    
                    if node_c_id in temp_visited:
                        continue
                    if node_a_id in nodes[node_c_id]:
                        yield(node_a_id, node_b_id, node_c_id)
                    else:
                        continue
                temp_visited.add(node_b_id)
            visited_ids.add(node_a_id)

Of course, you need to pass the graph in as a dictionary, for example:

    #### Test cycles ####

    nodes = {}

    nodes[0] = [1, 2, 3]
    nodes[1] = [0, 2]
    nodes[2] = [0, 1, 3]
    nodes[3] = [1]

    cycles = list(generate_triangles(nodes))
    print(cycles)

Using wisty's code, the triangles found will be [(0, 1, 2), (0, 2, 1), (0, 3, 1), (1, 2, 3)], which counts (0, 1, 2) and (0, 2, 1) as two different triangles. With the code I modified, these are counted as only one triangle.

I used this with a relatively small dictionary of under 100 keys and each key has on average 50 values.

Answered by James Black

Even though it isn't efficient, you may want to start by implementing a working solution, so use the loops. Write a test so you can get an idea of how long it takes.

Then, as you try new approaches you can do two things: 1) Make certain that the answer remains the same. 2) See what the improvement is.

Having a faster algorithm that misses something is probably going to be worse than having a slower one that is correct.
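
A slow reference implementation for that kind of check could be sketched as follows, assuming an undirected graph without self-loops stored as a dictionary of neighbor sets. The names brute_force_triangles and same_triangles are made up for illustration, and the cubic running time means this is only meant for small test graphs.

```python
from itertools import combinations

def brute_force_triangles(adj):
    """Check every 3-node subset; O(n^3), only suitable for small test graphs."""
    return {
        frozenset((a, b, c))
        for a, b, c in combinations(adj, 3)
        if b in adj[a] and c in adj[a] and c in adj[b]
    }

def same_triangles(candidate, adj):
    """Compare a faster method's output (any iterable of 3-node groups) with the baseline."""
    return {frozenset(t) for t in candidate} == brute_force_triangles(adj)

# hypothetical graph with the triangles {0, 1, 2} and {0, 1, 3}
adj = {0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1}, 3: {0, 1}}
print(same_triangles([(0, 1, 2), (3, 1, 0)], adj))  # True
print(same_triangles([(0, 1, 2)], adj))             # False: one triangle is missing
```

Comparing sets of frozensets ignores both the order of the triangles and the order of the nodes inside each one, so different algorithms can be checked against the same baseline.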

Once you have the slow test, you can see if you can do this in parallel and see what the performance increase is.

Then, you can see if you can mark all nodes that have fewer than 2 neighbors, since they cannot be part of a triangle.

Ideally, you may want to shrink the graph down to just 100 or so nodes first, so you can draw it and see what is happening graphically.

Sometimes your brain will see a pattern that isn't as obvious when looking at algorithms.

Answered by Kirk Broadhurst

Do you need to find 'all' of the 'triangles', or just 'some'/'any'? Or perhaps you just need to test whether a particular node is part of a triangle?

The test is simple: given a node A, are there any two nodes B & C connected to A that are also directly connected to each other?
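
As a rough sketch of that single-node test, assuming an undirected graph without self-loops stored as a dictionary of neighbor sets (the function name node_in_triangle and the sample graph are hypothetical):

```python
from itertools import combinations

def node_in_triangle(adj, a):
    """Return True if node a has two neighbors that are adjacent to each other."""
    return any(c in adj[b] for b, c in combinations(adj[a], 2))

adj = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}
print(node_in_triangle(adj, "A"))  # True: B and C are both linked to A and to each other
print(node_in_triangle(adj, "D"))  # False: D has only one neighbor
```

The cost is proportional to the square of the node's degree, so it stays cheap unless the node is a very large hub.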

If you need to find all of the triangles - specifically, all groups of 3 nodes in which each node is joined to the other two - then you need to check every possible group in a very long running 'for each' loop.

The only optimisation is ensuring that you don't check the same 'group' twice, e.g. if you have already tested that B & C aren't in a group with A, then don't check whether A & C are in a group with B.
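
One common way to avoid re-checking a group is to impose an order on the nodes and only ever look "upwards", so every triangle is reported exactly once from its smallest node. The following is a sketch of that idea rather than the answer's own code; it assumes an undirected graph stored as a dictionary of neighbor sets with sortable node ids, and the name iter_triangles is made up.

```python
def iter_triangles(adj):
    """Yield each triangle once as a tuple (a, b, c) with a < b < c."""
    for a in sorted(adj):
        higher_a = {n for n in adj[a] if n > a}  # only neighbors that come after a
        for b in sorted(higher_a):
            # common neighbors of a and b that come after b complete a triangle
            for c in sorted(higher_a & adj[b]):
                if c > b:
                    yield (a, b, c)

# hypothetical graph with the triangles {0, 1, 2} and {0, 1, 3}
adj = {0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1}, 3: {0, 1}}
print(list(iter_triangles(adj)))  # [(0, 1, 2), (0, 1, 3)]
```

Because only neighbor pairs are examined instead of every possible group of three nodes, the work depends on the node degrees rather than on the cube of the node count.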

Answered by dschult

If you don't care about multiple copies of the same triangle in different order then a list of 3-tuples works:

from itertools import combinations as combos
[(n,nbr,nbr2) for n in G for nbr, nbr2 in combos(G[n],2) if nbr in G[nbr2]]

The logic here is to check each pair of neighbors of every node to see if they are connected. G[n] is a fast way to iterate over or look up neighbors.
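
The same neighbor-pair logic can be tried without NetworkX on a plain dictionary of neighbor sets, since only iteration over G, G[n], and membership tests are needed. In this sketch the graph G is a made-up example with a single triangle:

```python
from itertools import combinations

# hypothetical undirected graph: one triangle {0, 1, 2} plus a pendant node 3
G = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}

triples = [
    (n, nbr, nbr2)
    for n in G
    for nbr, nbr2 in combinations(G[n], 2)
    if nbr in G[nbr2]
]
print(len(triples))                       # 3: the triangle is found once from each corner
print({frozenset(t) for t in triples})    # {frozenset({0, 1, 2})}
```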

If you want to get rid of reorderings, turn each triple into a frozenset and make a set of the frozensets:

set(frozenset([n,nbr,nbr2]) for n in G for nbr, nbr2 in combos(G[n],2) if nbr in G[nbr2])

If you don't like frozenset and want a list of sets then:

triple_iter = ((n, nbr, nbr2) for n in G for nbr, nbr2 in combos(G[n],2) if nbr in G[nbr2])
triangles = set(frozenset(tri) for tri in triple_iter)
nice_triangles = [set(tri) for tri in triangles]

Answered by gibbone

This is a more efficient version of Ajay JM's answer (I would have commented on it, but I don't have enough reputation).

Indeed, the enumerate_all_cliques method of networkx will return all cliques in the graph, irrespective of their size; hence looping over all of them may take a lot of time (especially with very dense graphs).

Moreover, once defined for triangles, it's just a matter of parametrization to generalize the method to every clique size, so here's a function:

import networkx as nx

def get_cliques_by_length(G, length_clique):
    """ Return the list of all cliques in an undirected graph G with length 
    equal to length_clique. """
    cliques = []
    for c in nx.enumerate_all_cliques(G) :
        if len(c) <= length_clique:
            if len(c) == length_clique:
                cliques.append(c)            
        else:
            return cliques
    # return empty list if nothing is found
    return cliques

To get triangles just use get_cliques_by_length(G, 3).

Caveat: this method works only for undirected graphs. Algorithms for cliques in directed graphs are not provided in networkx.