C++ 有没有更好的方法来做字符串的排列?
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/1995328/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Are there any better methods to do permutation of string?
提问by Jichao
void permute(string elems, int mid, int end)
{
static int count;
if (mid == end) {
cout << ++count << " : " << elems << endl;
return ;
}
else {
for (int i = mid; i <= end; i++) {
swap(elems, mid, i);
permute(elems, mid + 1, end);
swap(elems, mid, i);
}
}
}
The above function shows the permutations of str
(with str[0..mid-1]
as a steady prefix, and str[mid..end]
as a permutable suffix). So we can use permute(str, 0, str.size() - 1)
to show all the permutations of one string.
上面的函数显示了str
(str[0..mid-1]
作为一个稳定的前缀,str[mid..end]
作为一个可置换的后缀)的排列。所以我们可以用permute(str, 0, str.size() - 1)
来显示一个字符串的所有排列。
But the function uses a recursive algorithm; maybe its performance could be improved?
但该函数使用递归算法;也许它的性能可以提高?
Are there any better methods to permute a string?
有没有更好的方法来置换字符串?
回答by Permaquid
Here is a non-recursive algorithm in C++ from the Wikipedia entry for unordered generation of permutations. For the string s
of length n
, for any k
from 0
to n! - 1
inclusive, the following modifies s
to provide a unique permutation (that is, different from those generated for any other k value on that range). To generate all permutations, run it for all n! k
values on the original value of s.
这是来自维基百科条目的 C++ 非递归算法,用于无序生成排列。对于s
length的字符串n
,对于任何k
from0
到n! - 1
inclusive,以下修改s
以提供唯一的排列(即,不同于为该范围内的任何其他 k 值生成的排列)。要生成所有排列,请为所有 n 运行它!k
s 的原始值上的值。
#include <algorithm>
void permutation(int k, string &s)
{
for(int j = 1; j < s.size(); ++j)
{
std::swap(s[k % (j + 1)], s[j]);
k = k / (j + 1);
}
}
Here swap(s, i, j)
swaps position i and j of the string s.
这里swap(s, i, j)
交换字符串 s 的位置 i 和 j。
回答by Prasoon Saurav
Why dont you try std::next_permutation()
or std::prev_permutation()
?
你为什么不尝试std::next_permutation()
或std::prev_permutation()
?
Links:
链接:
std::next_permutation()
std::prev_permutation()
std::next_permutation()
std::prev_permutation()
A simple example:
一个简单的例子:
#include<string>
#include<iostream>
#include<algorithm>
int main()
{
std::string s="123";
do
{
std::cout<<s<<std::endl;
}while(std::next_permutation(s.begin(),s.end()));
}
Output:
输出:
123
132
213
231
312
321
回答by Jeff Dege
I'd like to second Permaquid's answer. The algorithm he cites works in a fundamentally different way from the various permutation enumeration algorithms that have been offered. It doesn't generate all of the permutations of n objects, it generates a distinct specific permutation, given an integer between 0 and n!-1
. If you need only a specific permutation, it's much faster than enumerating them all and then selecting one.
我想支持Permaquid 的回答。他引用的算法与已提供的各种排列枚举算法的工作方式完全不同。它不会生成 n 个对象的所有排列,而是生成一个不同的特定排列,给定 之间的整数0 and n!-1
。如果您只需要一个特定的排列,它比枚举它们然后选择一个要快得多。
Even if you do need all permutations, it provides options that a single permutation enumeration algorithm does not. I once wrote a brute-force cryptarithm cracker, that tried every possible assignment of letters to digits. For base-10
problems, it was adequate, since there are only 10!
permutations to try. But for base-11
problems took a couple of minutes and base-12
problems took nearly an hour.
即使您确实需要所有排列,它也提供了单个排列枚举算法所没有的选项。我曾经写过一个蛮力密码破解器,它尝试了所有可能的字母到数字的分配。对于base-10
问题,这已经足够了,因为只有10!
排列可以尝试。但是base-11
问题需要几分钟,base-12
问题需要近一个小时。
I replaced the permutation enumeration algorithm that I had been using with a simple i=0--to--N-1
for-loop, using the algorithm Permaquid cited. The result was only slightly slower. But then I split the integer range in quarters, and ran four for-loops simultaneously, each in a separate thread. On my quad-core processor, the resulting program ran nearly four times as fast.
我i=0--to--N-1
使用 Permaquid 引用的算法,用一个简单的for 循环替换了我一直在使用的排列枚举算法。结果只是稍微慢了一点。但后来我将整数范围分成四等分,并同时运行四个 for 循环,每个循环在一个单独的线程中。在我的四核处理器上,生成的程序运行速度几乎是其四倍。
Just as finding an individual permutation using the permutation enumeration algorithms is difficult, generating delineated subsets of the set of all permutations is also difficult. The algorithm that Permaquid cited makes both of these very easy
正如使用排列枚举算法找到单个排列很困难一样,生成所有排列的集合的描述子集也很困难。Permaquid 引用的算法使这两种方法都变得非常简单
回答by Kornel Kisielewicz
In particular, you want std::next_permutation.
特别是,您需要std::next_permutation。
void permute(string elems, int mid, int end)
{
int count = 0;
while(next_permutation(elems.begin()+mid, elems.end()))
cout << << ++count << " : " << elems << endl;
}
... or something like that...
... 或类似的东西...
回答by JohnE
Any algorithm for generating permutations is going to run in polynomial time, because the number of permutations for characters within an n-length string is (n!)
. That said, there are some pretty simple in-place algorithms for generating permutations. Check out the Johnson-Trotter algorithm.
任何生成排列的算法都将在多项式时间内运行,因为 n 长度字符串中字符的排列数是(n!)
。也就是说,有一些非常简单的就地算法来生成排列。查看Johnson-Trotter 算法。
回答by David R Tribble
The Knuth random shuffle algorithmis worth looking into.
该克努特随机洗牌算法是值得研究的。
// In-place shuffle of char array
void shuffle(char array[], int n)
{
for ( ; n > 1; n--)
{
// Pick a random element to move to the end
int k = rand() % n; // 0 <= k <= n-1
// Simple swap of variables
char tmp = array[k];
array[k] = array[n-1];
array[n-1] = tmp;
}
}
回答by JPvdMerwe
Any algorithm that makes use of or generates all permutations will take O(N!*N) time, O(N!) at the least to generate all permutations and O(N) to use the result, and that's really slow. Note that printing the string is also O(N) afaik.
任何使用或生成所有排列的算法都需要 O(N!*N) 时间,至少需要 O(N!) 来生成所有排列和 O(N) 来使用结果,这真的很慢。请注意,打印字符串也是 O(N) afaik。
In a second you can realistically only handle strings up to a maximum of 10 or 11 characters, no matter what method you use. Since 11!*11 = 439084800 iterations (doing this many in a second on most machines is pushing it) and 12!*12 = 5748019200 iterations. So even the fastest implementation would take about 30 to 60 seconds on 12 characters.
无论您使用何种方法,您实际上在一秒钟内只能处理最多 10 或 11 个字符的字符串。由于 11!*11 = 439084800 次迭代(在大多数机器上在一秒钟内执行这么多次迭代)和 12!*12 = 5748019200 次迭代。所以即使是最快的实现也需要大约 30 到 60 秒的 12 个字符。
Factorial just grows too fast for you to hope to gain anything by writing a faster implementation, you'd at most gain one character. So I'd suggest Prasoon's recommendation. It's easy to code and it's quite fast. Though sticking with your code is completely fine as well.
Factorial 增长得太快了,你不希望通过编写一个更快的实现来获得任何东西,你最多只能获得一个字符。所以我建议Prasoon的推荐。编码很容易,而且速度非常快。尽管坚持使用您的代码也完全没问题。
I'd just recommend that you take care that you don't inadvertantly have extra characters in your string such as the null character. Since that will make your code a factor of N slower.
我只是建议您注意不要在字符串中无意中包含额外的字符,例如空字符。因为这会使您的代码慢 N 倍。
回答by Potatoswatter
Do you want to run through all the permutations, or count the number of permutations?
你是想遍历所有的排列,还是计算排列的数量?
For the former, use std::next_permutation
as suggested by others. Each permutation takes O(N) time (but less amortized time) and no memory except its callframe, vs O(N) time and O(N) memory for your recursive function. The whole process is O(N!) and you can't do better than this, as others said, because you can't get more than O(X) results from a program in less than O(X) time! Without a quantum computer, anyway.
对于前者,请std::next_permutation
按照其他人的建议使用。每个排列需要 O(N) 时间(但摊销时间更少)并且除了它的调用帧之外没有任何内存,而您的递归函数则需要 O(N) 时间和 O(N) 内存。整个过程是 O(N!) 并且你不能做得比这更好,正如其他人所说,因为你不能在少于 O(X) 的时间内从程序中获得超过 O(X) 的结果!无论如何,没有量子计算机。
For the latter, you just need to know how many unique elements are in the string.
对于后者,您只需要知道字符串中有多少唯一元素。
big_int count_permutations( string s ) {
big_int divisor = 1;
sort( s.begin(), s.end() );
for ( string::iterator pen = s.begin(); pen != s.end(); ) {
size_t cnt = 0;
char value = * pen;
while ( pen != s.end() && * pen == value ) ++ cnt, ++ pen;
divisor *= big_int::factorial( cnt );
}
return big_int::factorial( s.size() ) / divisor;
}
Speed is bounded by the operation of finding duplicate elements, which for char
s can be done in O(N) time with a lookup table.
速度受到查找重复元素的操作的限制,对于char
s 可以使用查找表在 O(N) 时间内完成。
回答by Omnifarious
I don't think this is better, but it does work and does not use recursion:
我不认为这更好,但它确实有效并且不使用递归:
#include <iostream>
#include <stdexcept>
#include <tr1/cstdint>
::std::uint64_t fact(unsigned int v)
{
::std::uint64_t output = 1;
for (unsigned int i = 2; i <= v; ++i) {
output *= i;
}
return output;
}
void permute(const ::std::string &s)
{
using ::std::cout;
using ::std::uint64_t;
typedef ::std::string::size_type size_t;
static unsigned int max_size = 20; // 21! > 2^64
const size_t strsize = s.size();
if (strsize > max_size) {
throw ::std::overflow_error("This function can only permute strings of size 20 or less.");
} else if (strsize < 1) {
return;
} else if (strsize == 1) {
cout << "0 : " << s << '\n';
} else {
const uint64_t num_perms = fact(s.size());
// Go through each permutation one-by-one
for (uint64_t perm = 0; perm < num_perms; ++perm) {
// The indexes of the original characters in the new permutation
size_t idxs[max_size];
// The indexes of the original characters in the new permutation in
// terms of the list remaining after the first n characters are pulled
// out.
size_t residuals[max_size];
// We use div to pull our permutation number apart into a set of
// indexes. This holds what's left of the permutation number.
uint64_t permleft = perm;
// For a given permutation figure out which character from the original
// goes in each slot in the new permutation. We start assuming that
// any character could go in any slot, then narrow it down to the
// remaining characters with each step.
for (unsigned int i = strsize; i > 0; permleft /= i, --i) {
uint64_t taken_char = permleft % i;
residuals[strsize - i] = taken_char;
// Translate indexes in terms of the list of remaining characters
// into indexes in terms of the original string.
for (unsigned int o = (strsize - i); o > 0; --o) {
if (taken_char >= residuals[o - 1]) {
++taken_char;
}
}
idxs[strsize - i] = taken_char;
}
cout << perm << " : ";
for (unsigned int i = 0; i < strsize; ++i) {
cout << s[idxs[i]];
}
cout << '\n';
}
}
}
The fun thing about this is that the only state it uses from permutation to permutation is the number of the permutation, the total number of permutations, and the original string. That means it can be easily encapsulated in an iterator or something like that without having to carefully preserve the exact correct state. It can even be a random access iterator.
有趣的是,它从一个排列到另一个排列使用的唯一状态是排列数、排列总数和原始字符串。这意味着它可以很容易地封装在迭代器或类似的东西中,而无需小心地保留准确的正确状态。它甚至可以是一个随机访问迭代器。
Of course ::std::next_permutation stores the state in the relationships between elements, but that means it can't work on unordered things, and I would really wonder what it does if you have two equal things in the sequence. You can solve that by permuting indexes of course, but that adds slightly more complication.
当然 ::std::next_permutation 将状态存储在元素之间的关系中,但这意味着它不能处理无序的东西,我真的很想知道如果序列中有两个相等的东西它会做什么。当然,您可以通过排列索引来解决这个问题,但这会增加一些复杂性。
Mine will work with any random access iterator range provided it's short enough. And if it isn't, you'll never get through all the permutations anyway.
只要它足够短,我的将适用于任何随机访问迭代器范围。如果不是,则无论如何您都无法完成所有排列。
The basic idea of this algorithm is that every permutation of N items can be enumerated. The total number is N! or fact(N)
. And any given permutation can be thought of as a mapping of source indices from the original sequence into a set of destination indices in the new sequence. Once you have an enumeration of all permutations the only thing left to do is map each permutation number into an actual permutation.
该算法的基本思想是可以枚举 N 项的每一个排列。总数是N!或fact(N)
。并且任何给定的排列都可以被认为是将源索引从原始序列映射到新序列中的一组目标索引。一旦您枚举了所有排列,剩下要做的就是将每个排列数映射到实际排列中。
The first element in the permuted list can be any of the N elements from the original list. The second element can be any of the N - 1 remaining elements, and so on. The algorithm uses the %
operator to pull apart the permutation number into a set of selections of this nature. First it modulo's the permutation number by N to get a number from [0,N). It discards the remainder by dividing by N, then it modulo's it by the size of the list - 1 to get a number from [0,N-1) and so on. That is what the for (i =
loop is doing.
置换列表中的第一个元素可以是原始列表中的 N 个元素中的任何一个。第二个元素可以是剩余的 N - 1 个元素中的任何一个,依此类推。该算法使用%
运算符将排列数分解为一组这种性质的选择。首先,它对置换数取模 N 以从 [0,N) 中得到一个数。它通过除以 N 来丢弃余数,然后将其除以列表的大小 - 1 以从 [0,N-1) 中获得一个数字,依此类推。这就是for (i =
循环正在做的事情。
The second step is translating each number into an index into the original list. The first number is easy because it's just a straight index. The second number is an index into a list that contains every element but the one removed at the first index, and so on. That is what the for (o =
loop is doing.
第二步是将每个数字转换为原始列表的索引。第一个数字很简单,因为它只是一个直索引。第二个数字是一个列表的索引,该列表包含除第一个索引处删除的元素之外的所有元素,依此类推。这就是for (o =
循环正在做的事情。
residuals
is a list of indices into the successively smaller lists. idxs
is a list of indices into the original list. There is a one-one mapping between values in residuals
and idxs
. They each represent the same value in different 'coordinate spaces'.
residuals
是连续较小列表的索引列表。 idxs
是原始列表中的索引列表。residuals
和 中的值之间存在一对一的映射关系idxs
。它们各自在不同的“坐标空间”中代表相同的值。
The answer pointed to by the answer you picked has the same basic idea, but has a much more elegant way of accomplishing the mapping than my rather literal and brute force method. That way will be slightly faster than my method, but they are both about the same speed and they both have the same advantage of random access into permutation space which makes a whole number of things easier, including (as the answer you picked pointed out) parallel algorithms.
您选择的答案所指向的答案具有相同的基本思想,但比我的文字和蛮力方法更优雅地完成映射。这种方式会比我的方法稍快,但它们的速度大致相同,并且它们都具有随机访问排列空间的相同优势,这使许多事情变得更容易,包括(如您选择的答案所指出的)并行算法。
回答by StackedCrooked
I've written a permutation algorithm recently. It uses a vector of type T (template) instead of a string, and it's not super-fast because it uses recursion and there's a lot of copying. But perhaps you can draw some inspiration for the code. You can find the code here.
我最近写了一个置换算法。它使用 T 类型的向量(模板)而不是字符串,并且它不是超快,因为它使用递归并且有很多复制。但也许您可以从代码中汲取一些灵感。您可以在此处找到代码。