python 获取字符串的每个组合
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/1457814/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Get every combination of strings
提问by sth
I had a combinatorics assignment that involved getting every word with length less than or equal to 6 from a specific combination of strings.
我有一个组合作业,涉及从特定的字符串组合中获取长度小于或等于 6 的每个单词。
In this case, it was S = { 'a', 'ab', 'ba' }. The professor just started listing them off, but I thought it would be easier solved with a program. The only problem is that I can't get a good algorithm that would actually compute every possible option.
在这种情况下,它是 S = { 'a', 'ab', 'ba' }。教授刚刚开始列出它们,但我认为用程序会更容易解决。唯一的问题是我无法得到一个真正计算所有可能选项的好算法。
If anyone could help, I'd appreciate it. I usually program in Python but really I just need help with the algorithm.
如果有人可以提供帮助,我将不胜感激。我通常用 Python 编程,但实际上我只需要算法方面的帮助。
回答by Alex Martelli
Assuming you DO mean combinations (no repetitions, order does not matter):
假设您确实是组合(没有重复,顺序无关紧要):
import itertools
S = [ 'a', 'ab', 'ba' ]
for i in range(len(S)+1):
for c in itertools.combinations(S, i):
cc = ''.join(c)
if len(cc) <= 6:
print c
emits all the possibilities:
发出所有的可能性:
()
('a',)
('ab',)
('ba',)
('a', 'ab')
('a', 'ba')
('ab', 'ba')
('a', 'ab', 'ba')
If you mean something different than "combinations", it's just an issue of using the right iterator or generator in the for
(e.g., itertools.permutations
, or something else of your own devising).
如果您的意思与“组合”不同,那么这只是在for
(例如itertools.permutations
,或您自己设计的其他东西)中使用正确的迭代器或生成器的问题。
Edit: if for example you mean "repetitions and order ARE important",
编辑:例如,如果您的意思是“重复和顺序很重要”,
def reps(seq, n):
return itertools.product(*[seq]*n)
for i in range(7):
for c in reps(S, i):
cc = ''.join(c)
if len(cc) <= 6:
print c
will give you the required 85 lines of output.
将为您提供所需的 85 行输出。
Edit again: I had the wrong loop limit (and therefore wrong output length) -- tx to the commenter who pointed that out. Also, this approach can produce a string > 1 times, if the ''.join's of different tuples are considered equivalent; e.g., it produces ('a', 'ba') as distinct from ('ab', 'a') although their ''.join is the same (same "word" from different so-called "combinations", I guess -- terminology in use not being entirely clear).
再次编辑:我有错误的循环限制(因此错误的输出长度)——发送给指出这一点的评论者。此外,如果不同元组的 ''.join's 被认为是等效的,则这种方法可以产生大于 1 次的字符串;例如,它产生的 ('a', 'ba') 与 ('ab', 'a') 不同,尽管它们的 ''.join 是相同的(来自不同的所谓“组合”的相同“词”,我猜-- 使用的术语不完全清楚)。
回答by sth
You can iteratively generate all the strings made from one part, two parts, three parts and so on, until all the strings generated in a step are longer than six characters. Further steps would only generate even longer strings, so all possible short strings have already been generated. If you collect these short strings in each step you end up with a set of all possible generated short strings.
您可以迭代生成由一部分、两部分、三部分等组成的所有字符串,直到在一个步骤中生成的所有字符串都超过六个字符。进一步的步骤只会生成更长的字符串,因此所有可能的短字符串都已经生成。如果您在每个步骤中收集这些短字符串,您最终会得到一组所有可能生成的短字符串。
In Python:
在 Python 中:
S = set(['a', 'ab', 'ba'])
collect = set()
step = set([''])
while step:
step = set(a+b for a in step for b in S if len(a+b) <= 6)
collect |= step
print sorted(collect)
回答by I. J. Kennedy
def combos(S,n):
if (n <= 0): return
for s in S:
if len(s) <= n: yield s
for t in combos(S,n-len(s)): yield s+t
for x in combos(["a","ab","ba"],6): print x
Prints output:
打印输出:
a
aa
aaa
aaaa
aaaaa
aaaaaa
aaaaab
aaaaba
aaaab
aaaaba
aaaba
aaabaa
aaab
aaaba
aaabaa
aaabab
aaabba
aaba
aabaa
aabaaa
aabaab
aababa
aab
aaba
aabaa
aabaaa
aabaab
aababa
aabab
aababa
aabba
aabbaa
aba
abaa
abaaa
abaaaa
abaaab
abaaba
abaab
abaaba
ababa
ababaa
ab
aba
abaa
abaaa
abaaaa
abaaab
abaaba
abaab
abaaba
ababa
ababaa
abab
ababa
ababaa
ababab
ababba
abba
abbaa
abbaaa
abbaab
abbaba
ba
baa
baaa
baaaa
baaaaa
baaaab
baaaba
baaab
baaaba
baaba
baabaa
baab
baaba
baabaa
baabab
baabba
baba
babaa
babaaa
babaab
bababa
回答by Anthony Towns
Doing it recursively is one way:
递归执行是一种方法:
cache = {}
def words_of_length(s, n=6):
# memoise results
if n in cache: return cache[n]
# base cases
if n < 0: return []
if n == 0: return [""]
# inductive case
res = set()
for x in s:
res |= set( (x+y for y in words_of_length(s, n-len(x))) )
# sort and memoise result
cache[n] = sorted(res)
# sort results
return cache[n]
def words_no_longer_than(s, n=6):
return sum( [words_of_length(s, i) for i in range(n+1)], [] )