How to obtain all subsequence combinations of a String (in Java, or C++ etc)

Source: Stack Overflow, http://stackoverflow.com/questions/1617699/ (licensed under CC BY-SA 4.0; attribute to the original authors)

Tags: java, c++, algorithm

Asked by AhmetB - Google

Let's say I have the string "12345". I need to obtain all subsequence combinations of this string, such as:

  1. --> 1 2 3 4 5
  2. --> 12 13 14 15 23 24 25 34 35 45
  3. --> 123 124 125 234 235 345
  4. --> 1234 1235 1245 1345 2345
  5. --> 12345

Please note that I grouped them by number of characters but did not change their order. I need a method/function that does that.

Accepted answer by hughdbrown

You want a powerset. Here are all the questions on StackOverflow that mention powersets or power sets.

Here is a basic implementation in Python (Python 2 syntax):

def powerset(s):
    n = len(s)
    masks = [1<<j for j in xrange(n)]
    for i in xrange(2**n):
        yield [s[j] for j in range(n) if (masks[j] & i)]


if __name__ == '__main__':
    for elem in powerset([1,2,3,4,5]):
        print elem

And here is its output:

[]
[1]
[2]
[1, 2]
[3]
[1, 3]
[2, 3]
[1, 2, 3]
[4]
[1, 4]
[2, 4]
[1, 2, 4]
[3, 4]
[1, 3, 4]
[2, 3, 4]
[1, 2, 3, 4]
[5]
[1, 5]
[2, 5]
[1, 2, 5]
[3, 5]
[1, 3, 5]
[2, 3, 5]
[1, 2, 3, 5]
[4, 5]
[1, 4, 5]
[2, 4, 5]
[1, 2, 4, 5]
[3, 4, 5]
[1, 3, 4, 5]
[2, 3, 4, 5]
[1, 2, 3, 4, 5]

Notice that its first result is the empty set. If you want to skip the empty set, change the iteration from `for i in xrange(2**n):` to `for i in xrange(1, 2**n):`.

Here is the code adapted to produce string output:

def powerset(s):
    n = len(s)
    masks = [1<<j for j in xrange(n)]
    for i in xrange(2**n):
        yield "".join([str(s[j]) for j in range(n) if (masks[j] & i)])
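
The snippets above use Python 2 syntax (xrange and the print statement). For readers on Python 3, here is a minimal sketch of the same bitmask idea. It is a port made for illustration, not the original author's code; the name powerset is kept from above, and the small demo input "123" is an assumption.

```python
def powerset(s):
    """Yield every subsequence of s as a string, empty one first (Python 3)."""
    n = len(s)
    for i in range(2 ** n):  # each i is a bitmask over the n positions
        # keep s[j] when bit j of i is set; ascending j preserves the order
        yield "".join(str(s[j]) for j in range(n) if i & (1 << j))


if __name__ == "__main__":
    for elem in powerset("123"):
        print(repr(elem))
```

Starting the loop at 1 instead of 0 skips the empty result, exactly as described for the Python 2 version.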


Edit 2009-10-24

Okay, I see you are partial to an implementation in Java. I don't know Java, so I'll meet you halfway and give you code in C#:

    static public IEnumerable<IList<T>> powerset<T>(IList<T> s)
    {
        int n = s.Count;
        int[] masks = new int[n];
        for (int i = 0; i < n; i++)
            masks[i] = (1 << i);
        for (int i = 0; i < (1 << n); i++)
        {
            List<T> newList = new List<T>(n);
            for (int j = 0; j < n; j++)
                if ((masks[j] & i) != 0)
                    newList.Add(s[j]);
            yield return newList;
        }
    }

Answer by outis

The simplest algorithm for generating subsets of a set of size N is to consider all binary numbers using N bits. Each position in the number represents an element from the set. If a bit in the number is 1, the corresponding set element is in the subset, otherwise the element isn't in the subset. Since the bits in a number are ordered, this preserves the ordering of the original set.
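
To make the bit-to-element mapping concrete, here is a small Python 3 sketch. The function name subsets_with_masks and the sample input "abc" are hypothetical, chosen only for illustration. It yields each N-bit pattern next to the subset it selects, with position j of the pattern standing for element j.

```python
def subsets_with_masks(items):
    """Yield (bits, subset) pairs; bits[j] == '1' means items[j] is included."""
    n = len(items)
    for mask in range(2 ** n):
        # format mask as n binary digits, then reverse so bits[j] is bit j
        bits = f"{mask:0{n}b}"[::-1]
        subset = [items[j] for j in range(n) if bits[j] == "1"]
        yield bits, subset


if __name__ == "__main__":
    for bits, subset in subsets_with_masks("abc"):
        print(bits, subset)
```

Because j only ever increases inside the list comprehension, every subset keeps the elements in their original order.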

References:

  1. "Efficiently Enumerating the Subsets of a Set"; Loughry, Hemert and Schoofs
  2. "Generating Subsets"; Stony Brook Algorithm Repository

Answer by outis

In C++ given the following routine:

template <typename Iterator>
bool next_combination(const Iterator first, Iterator k, const Iterator last)
{
   /* Credits: Mark Nelson http://marknelson.us */
   if ((first == last) || (first == k) || (last == k))
      return false;
   Iterator i1 = first;
   Iterator i2 = last;
   ++i1;
   if (last == i1)
      return false;
   i1 = last;
   --i1;
   i1 = k;
   --i2;
   while (first != i1)
   {
      if (*--i1 < *i2)
      {
         Iterator j = k;
         while (!(*i1 < *j)) ++j;
         std::iter_swap(i1,j);
         ++i1;
         ++j;
         i2 = k;
         std::rotate(i1,j,last);
         while (last != j)
         {
            ++j;
            ++i2;
         }
         std::rotate(k,i2,last);
         return true;
      }
   }
   std::rotate(first,k,last);
   return false;
}

You can then proceed to do the following:

std::string s = "12345";
for(std::size_t i = 1; i <= s.size(); ++i)
{
   do
   {
      std::cout << std::string(s.begin(),s.begin() + i) << std::endl;
   }
   while(next_combination(s.begin(),s.begin() + i,s.end()));
}
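
The C++ routine above rearranges the string in place. As a rough Python 3 illustration of the same "advance to the next combination" idea, here is a sketch that uses a different and simpler technique: it keeps a sorted list of index positions and steps it forward in lexicographic order. The names next_combination (reused here for a different signature) and combinations_by_length are assumptions made for this example.

```python
def next_combination(indices, n):
    """Advance a sorted list of k positions in range(n) in place.

    Returns False once the last combination has been reached.
    """
    k = len(indices)
    i = k - 1
    # find the rightmost index that has not yet reached its maximum value
    while i >= 0 and indices[i] == n - k + i:
        i -= 1
    if i < 0:
        return False
    indices[i] += 1
    # reset everything to the right to the smallest valid values
    for j in range(i + 1, k):
        indices[j] = indices[j - 1] + 1
    return True


def combinations_by_length(s):
    """Yield subsequences of s grouped by length, shortest first."""
    n = len(s)
    for k in range(1, n + 1):
        indices = list(range(k))
        while True:
            yield "".join(s[i] for i in indices)
            if not next_combination(indices, n):
                break


if __name__ == "__main__":
    print(list(combinations_by_length("12345")))
```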

Answer by Adrien Plisson

Using Python, the itertools module defines a combinations() function that does just what you need.

from itertools import *
list(combinations( '12345', 2 ))

will give you:

[('1', '2'), ('1', '3'), ('1', '4'), ('1', '5'), ('2', '3'), ('2', '4'), ('2', '5'), ('3', '4'), ('3', '5'), ('4', '5')]

Answer by Stephan202

Adrien Plisson's answer shows how one retrieves all subsequences of a specified length in Python (for arbitrary sequence data types). The OP specifies that he works with strings, and that he wants all subsequences. Thus, using itertools.combinations we define:

>>> from itertools import combinations
>>> def subseq_combos(inp):
...     return (''.join(s) for r in range(len(inp) + 1) for s in combinations(inp, r))
... 
>>> list(subseq_combos('12345'))
['', '1', '2', '3', '4', '5', '12', '13', '14', '15', '23', '24', '25', '34', '35', '45', '123', '124', '125', '134', '135', '145', '234', '235', '245', '345', '1234', '1235', '1245', '1345', '2345', '12345']

(If the empty subsequence should be omitted, use range(1, len(inp) + 1) instead.)
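
If the output should also be grouped by length, as in the question, one possible sketch in Python 3 follows. The helper name grouped_subsequences and the print layout are assumptions made for illustration; only itertools.combinations comes from the answers above.

```python
from itertools import combinations


def grouped_subsequences(inp):
    """Map each length r (1..len(inp)) to the subsequences of that length."""
    return {
        r: ["".join(c) for c in combinations(inp, r)]
        for r in range(1, len(inp) + 1)
    }


if __name__ == "__main__":
    # prints one line per length, e.g. "2. --> 12 13 14 ..."
    for r, group in grouped_subsequences("12345").items():
        print("{}. --> {}".format(r, " ".join(group)))
```

This relies on dictionaries keeping insertion order, which Python guarantees from version 3.7 on.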

Answer by sergtk

You can use the following class for this (in Java):

class Combinations {

  String input;
  StringBuilder cur;

  private void next(int pos, int reminder) {
    cur.append(input.charAt(pos));

    if (reminder == 1) {
      System.out.println(cur);
    } else {
      for (int i = pos + 1; i + reminder - 1 <= input.length(); i++)
        next(i, reminder - 1);
    }
    cur.deleteCharAt(cur.length() - 1);
  }

  public void generate(String input) {
    cur = new StringBuilder();
    this.input = input;
    for (int length = 1; length <= input.length(); length++)
      for (int pos = 0; pos + length <= input.length(); pos++)
        next(pos, length);
  }
}

To run your example use the following code:

new Combinations().generate("12345");

The order of the output is the same as in the example. The code does not need to store all subsets and then sort them to obtain the order you described.
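
For comparison, the same recursive idea can be sketched as a lazy Python 3 generator, which makes the "no storing, no sorting" property easy to see. This is an illustrative rewrite rather than a translation of the Java class, and the names by_length and extend are hypothetical.

```python
def by_length(s):
    """Lazily yield subsequences of s, shortest first, original order kept."""

    def extend(prefix, start, remaining):
        if remaining == 0:
            yield prefix
            return
        # stop early enough to leave room for the characters still needed
        for i in range(start, len(s) - remaining + 1):
            yield from extend(prefix + s[i], i + 1, remaining - 1)

    for length in range(1, len(s) + 1):
        yield from extend("", 0, length)


if __name__ == "__main__":
    for sub in by_length("12345"):
        print(sub)
```

Each result is produced only when the caller asks for it, so memory use stays proportional to the recursion depth rather than to the number of subsets.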

Answer by pillmuncher

Oops, wrong answer:

Subsequences of a certain length in Python:

def subseqs(seq, length):
    for i in xrange(len(seq) - length + 1):
        yield seq[i:i+length]

Used like this:

for each in subseqs("hello", 3):
    print each

prints:

hel
ell
llo

To generate all subsequences do this:

for i in xrange(len("hello")):
    for each in subseqs("hello", i + 1):
        print each

prints:

h
e
l
l
o
he
el
ll
lo
hel
ell
llo
hell
ello
hello

Mick.

Now I see, you wanted subsets, not sublists.
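
The difference between the two is easy to see side by side. Here is a small Python 3 sketch; the function names substrings and subsequences and the sample input "1234" are assumptions made for this comparison.

```python
from itertools import combinations


def substrings(s):
    """Contiguous slices of s, which is what the code above produces."""
    return [s[i:i + k]
            for k in range(1, len(s) + 1)
            for i in range(len(s) - k + 1)]


def subsequences(s):
    """Order-preserving selections of characters, gaps allowed."""
    return ["".join(c)
            for k in range(1, len(s) + 1)
            for c in combinations(s, k)]


if __name__ == "__main__":
    s = "1234"
    # subsequences that skip characters and so are not substrings
    print(sorted(set(subsequences(s)) - set(substrings(s))))
```

For a string of n distinct characters there are n(n+1)/2 non-empty substrings but 2^n - 1 non-empty subsequences, so the gap grows quickly.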

Answer by Senthil kumar M.S

The code below, in Java, generates all possible combinations of a string's characters. A string of length 4 has 2 ^ 4 (2 raised to the power 4) possible combinations. In general, a string of length n has 2 ^ n (2 raised to the power n) possible combinations, counting the empty one. Hence the code:

    class Perms {
        public void permsOfString(String a) {
            // Compute 2^(string length)
            int x = 1;
            for (int i = 0; i < a.length(); i++) {
                x = x * 2;
            }

            // Iterate through all the possible combinations using the binary value of i
            for (int i = 1; i < x; i++) {
                String binStr = Integer.toBinaryString(i); // convert i to a binary string
                for (int j = binStr.length(); j < a.length(); j++) {
                    binStr = "0" + binStr; // left-pad with 0s
                }

                // Loop through the binary string; wherever it holds a '1',
                // print the character of the given string at that index
                for (int k = 0; k < binStr.length(); k++) {
                    if (binStr.charAt(k) == '1') {
                        System.out.print(a.charAt(k));
                    }
                }
                System.out.println();
            }
        }

        public static void main(String[] s) {
            Perms p = new Perms();
            p.permsOfString("abcd");
        }
    }
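
As a quick sanity check of the 2 ^ n count mentioned above, the number of combinations of each length can be added up. A minimal Python 3 sketch follows; the helper name count_by_length is hypothetical, and math.comb needs Python 3.8 or later.

```python
from math import comb  # available in Python 3.8+


def count_by_length(n):
    """Number of subsequences of each length 0..n for n distinct positions."""
    return [comb(n, r) for r in range(n + 1)]


if __name__ == "__main__":
    counts = count_by_length(4)
    print(counts, sum(counts), 2 ** 4)
```

Since the Java loop starts at i = 1, it skips the empty combination and prints 2 ^ n - 1 lines, which is 15 for "abcd".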

Answer by Anupam Gupta

A much cleaner approach can be achieved through recursion, as follows.

public class StrManipulation{

    public static void combinations(String suffix,String prefix){
        if(prefix.length()<0)return;
        System.out.println(suffix);
        for(int i=0;i<prefix.length();i++)
         combinations(suffix+prefix.charAt(i),prefix.substring(i+1,prefix.length()));
    }

    public static void main (String args[]){
        combinations("","12345");
        }
}

Answer by Madhu S. Kapoor

C implementation

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void combinations(char* suffix,char* prefix){
    if(NULL ==prefix || NULL == suffix){ return ;}
    int prefixLen = strlen(prefix);
    printf("\n[%s]",suffix);
    int slen  = strlen(suffix);
    char* s   = (char*)malloc(slen+2);    // suffix + one extra char + terminator
    s[slen+1] = '\0';
    for(int i=0;i<prefixLen;i++){
        strcpy(s,suffix);
        s[slen]  = prefix[i];
        int npfl = prefixLen-(i+1);
        char* p  = (char*) malloc(npfl+1); // copy of the chars after prefix[i]
        p[npfl]  = '\0';
        strcpy(p,prefix+i+1);
        combinations(s,p);
        free(p);
    }
    free(s);
}

//Usage
int main(void){
    combinations((char*)"",(char*)"12346897909787");
    return 0;
}