C++: Calculating the number of combinations

Question

Asked by nhaa123

Cheers,

I know you can get the number of combinations (without repetition, where order is not important) with the following formula:

// Choose r from n

n! / (r! (n - r)!)

However, I don't know how to implement this in C++, since for instance with

n = 52

n! = 8.0658175170943878571660636856404e+67

the number gets way too big even for unsigned __int64 (or unsigned long long). Is there some workaround to implement the formula without any third-party "bigint" libraries?

Answer 1

Answered by Andreas Brinck

Here's an ancient algorithm which is exact and doesn't overflow unless the result is too big for a long long

unsigned long long
choose(unsigned long long n, unsigned long long k) {
    if (k > n) {
        return 0;
    }
    unsigned long long r = 1;
    for (unsigned long long d = 1; d <= k; ++d) {
        r *= n--;
        r /= d;
    }
    return r;
}

This algorithm is also in Knuth's "The Art of Computer Programming, 3rd Edition, Volume 2: Seminumerical Algorithms" I think.

UPDATE: There's a small possibility that the algorithm will overflow on the line:

r *= n--;

for very large n. A naive upper bound is sqrt(std::numeric_limits<long long>::max()), which means an n less than roughly 4,000,000,000.
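The reason the division in the loop is exact can be checked directly: just before dividing, r equals d times the binomial coefficient C(n0, d), where n0 is the original n. The sketch below is only an illustration of that invariant; `choose_traced` is a hypothetical name, and the assertion is only meaningful while no intermediate product overflows.

```cpp
#include <cassert>

// Hypothetical variant of choose() that asserts each division is exact.
// Assumption: inputs are small enough that r * n does not overflow.
unsigned long long choose_traced(unsigned long long n, unsigned long long k) {
    if (k > n) {
        return 0;
    }
    unsigned long long r = 1;
    for (unsigned long long d = 1; d <= k; ++d) {
        r *= n--;
        // Here r == d * C(n0, d), so dividing by d leaves no remainder.
        assert(r % d == 0);
        r /= d;
    }
    return r;
}
```

For the question's example, `choose_traced(52, 5)` evaluates to 2598960.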

Answer 2

Answered by Howard Hinnant

From Andreas' answer:

Here's an ancient algorithm which is exact and doesn't overflow unless the result is too big for a long long
unsigned long long
choose(unsigned long long n, unsigned long long k) {
    if (k > n) {
        return 0;
    }
    unsigned long long r = 1;
    for (unsigned long long d = 1; d <= k; ++d) {
        r *= n--;
        r /= d;
    }
    return r;
}
This algorithm is also in Knuth's "The Art of Computer Programming, 3rd Edition, Volume 2: Seminumerical Algorithms" I think.
UPDATE: There's a small possibility that the algorithm will overflow on the line:
r *= n--;
for very large n. A naive upper bound is sqrt(std::numeric_limits<long long>::max()), which means an n less than roughly 4,000,000,000.


Consider n == 67 and k == 33. The above algorithm overflows with a 64-bit unsigned long long, and yet the correct answer is representable in 64 bits: 14,226,520,737,620,288,370. The above algorithm is also silent about its overflow; choose(67, 33) returns:

8,829,174,638,479,413

A believable but incorrect answer.

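One way to make the silent failure visible, without changing the algorithm yet, is to test the multiplication before performing it. This is only a sketch: `choose_checked` is a hypothetical name, and it reports overflow of the intermediate product even in cases where the final result would fit.

```cpp
#include <limits>
#include <stdexcept>

// Hypothetical helper: the multiply-then-divide loop, with a guard on r * n.
unsigned long long choose_checked(unsigned long long n, unsigned long long k) {
    if (k > n) {
        return 0;
    }
    const unsigned long long max = std::numeric_limits<unsigned long long>::max();
    unsigned long long r = 1;
    for (unsigned long long d = 1; d <= k; ++d) {
        // n >= 1 inside the loop because k <= n, so max / n is well defined.
        if (r > max / n) {
            throw std::overflow_error("intermediate product overflows in choose_checked");
        }
        r *= n--;
        r /= d;
    }
    return r;
}
```

With this guard, `choose_checked(67, 33)` throws instead of returning a plausible wrong value, while `choose_checked(52, 5)` still returns 2598960.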

However, the above algorithm can be slightly modified to never overflow as long as the final answer is representable.

The trick is in recognizing that at each iteration, the division r/d is exact. Temporarily rewriting:


r = r * n / d;
--n;

For this to be exact, it means that if you expanded r, n and d into their prime factorizations, you could cancel out d entirely and be left with a modified value for n, call it t. The computation of r is then simply:

// compute t from r, n and d
r = r * t;
--n;

A fast and easy way to do this is to find the greatest common divisor of r and d, call it g:


unsigned long long g = gcd(r, d);
// now one can divide both r and d by g without truncation
r /= g;
unsigned long long d_temp = d / g;
--n;

Now we can do the same thing with d_temp and n (find the greatest common divisor). However, since we know a priori that r * n / d is exact, and r / g and d_temp no longer share a common factor, we also know that gcd(d_temp, n) == d_temp, and therefore we don't need to compute it. So we can divide n by d_temp:

unsigned long long g = gcd(r, d);
// now one can divide both r and d by g without truncation
r /= g;
unsigned long long d_temp = d / g;
// now one can divide n by d/g without truncation
unsigned long long t = n / d_temp;
r = r * t;
--n;

Cleaning up:


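#include <limits>     // std::numeric_limits
#include <stdexcept>  // std::invalid_argument, std::overflow_error

unsigned long long
gcd(unsigned long long x, unsigned long long y)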
{
    while (y != 0)
    {
        unsigned long long t = x % y;
        x = y;
        y = t;
    }
    return x;
}

unsigned long long
choose(unsigned long long n, unsigned long long k)
{
    if (k > n)
        throw std::invalid_argument("invalid argument in choose");
    unsigned long long r = 1;
    for (unsigned long long d = 1; d <= k; ++d, --n)
    {
        unsigned long long g = gcd(r, d);
        r /= g;
        unsigned long long t = n / (d / g);
        if (r > std::numeric_limits<unsigned long long>::max() / t)
           throw std::overflow_error("overflow in choose");
        r *= t;
    }
    return r;
}

Now you can compute choose(67, 33) without overflow. And if you try choose(68, 33), you'll get an exception instead of a wrong answer.

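One caveat may be worth noting: the loop passes through C(n, d) for every d up to k, so when k is larger than n / 2 an intermediate value can exceed the final result and trigger the overflow exception even though the answer fits. A possible refinement, sketched below, is to use the symmetry C(n, k) == C(n, n - k) first. The names `gcd_ull`, `choose_gcd` and `choose_sym` are hypothetical; `choose_gcd` is intended to mirror the routine above.

```cpp
#include <algorithm>
#include <limits>
#include <stdexcept>

unsigned long long gcd_ull(unsigned long long x, unsigned long long y) {
    while (y != 0) {
        unsigned long long t = x % y;
        x = y;
        y = t;
    }
    return x;
}

// Same steps as the gcd-based choose() above, under a different name.
unsigned long long choose_gcd(unsigned long long n, unsigned long long k) {
    if (k > n)
        throw std::invalid_argument("invalid argument in choose_gcd");
    unsigned long long r = 1;
    for (unsigned long long d = 1; d <= k; ++d, --n) {
        unsigned long long g = gcd_ull(r, d);
        r /= g;
        unsigned long long t = n / (d / g);  // exact: d / g divides n
        if (r > std::numeric_limits<unsigned long long>::max() / t)
            throw std::overflow_error("overflow in choose_gcd");
        r *= t;
    }
    return r;
}

// Hypothetical wrapper: iterate over the smaller of k and n - k.
unsigned long long choose_sym(unsigned long long n, unsigned long long k) {
    if (k > n)
        throw std::invalid_argument("invalid argument in choose_sym");
    return choose_gcd(n, std::min(k, n - k));
}
```

Under these assumptions, `choose_sym(68, 67)` returns 68, whereas calling `choose_gcd(68, 67)` directly throws because it passes through C(68, 33) on the way.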

Answer 3

Answered by nhaa123

The following routine will compute n-choose-k using the recursive definition and memoization. The routine is extremely fast and accurate:

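#include <algorithm>  // std::fill_n
#include <cstddef>    // std::size_t

inline unsigned long long n_choose_k(const unsigned long long& n,
                                     const unsigned long long& k)
{
   if (n  < k) return 0;
   if (0 == k) return 1;       // also covers C(0, 0) == 1
   if (n == k) return 1;
   if (1 == k) return n;
   typedef unsigned long long value_type;
   // n x n memo table; an entry of 0 means "not computed yet"
   const std::size_t size = static_cast<std::size_t>(n * n);
   value_type* table = new value_type[size];
   std::fill_n(table, size, value_type(0));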
   class n_choose_k_impl
   {
   public:

      n_choose_k_impl(value_type* table,const value_type& dimension)
      : table_(table),
        dimension_(dimension)
      {}

      inline value_type& lookup(const value_type& n, const value_type& k)
      {
         return table_[dimension_ * n + k];
      }

      inline value_type compute(const value_type& n, const value_type& k)
      {
         if ((0 == k) || (k == n))
            return 1;
         value_type v1 = lookup(n - 1,k - 1);
         if (0 == v1)
            v1 = lookup(n - 1,k - 1) = compute(n - 1,k - 1);
         value_type v2 = lookup(n - 1,k);
         if (0 == v2)
            v2 = lookup(n - 1,k) = compute(n - 1,k);
         return v1 + v2;
      }

      value_type* table_;
      value_type dimension_;
   };
   value_type result = n_choose_k_impl(table,n).compute(n,k);
   delete [] table;
   return result;
}

Answer 4

Answered by altariste

Remember that

n! / (n - r)! = n * (n - 1) * ... * (n - r + 1)

so it's way smaller than n!. So the solution is to evaluate n * (n - 1) * ... * (n - r + 1) instead of first calculating n! and then dividing it.

Of course it all depends on the relative magnitude of n and r: if r is relatively big compared to n, then it still won't fit.
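As a rough illustration of this idea, the product can be accumulated factor by factor and then divided by r!. The helper names `falling_product` and `choose_via_product` are hypothetical, and the overflow check is an addition that the answer itself does not describe.

```cpp
#include <limits>
#include <stdexcept>

// Hypothetical helper: n * (n - 1) * ... * (n - r + 1), i.e. n! / (n - r)!
unsigned long long falling_product(unsigned long long n, unsigned long long r) {
    if (r > n) {
        return 0;
    }
    const unsigned long long max = std::numeric_limits<unsigned long long>::max();
    unsigned long long p = 1;
    for (unsigned long long i = 0; i < r; ++i) {
        unsigned long long factor = n - i;  // >= 1 because i < r <= n
        if (p > max / factor) {
            throw std::overflow_error("overflow in falling_product");
        }
        p *= factor;
    }
    return p;
}

// Hypothetical helper: C(n, r) as (n! / (n - r)!) / r!, with r! == falling_product(r, r)
unsigned long long choose_via_product(unsigned long long n, unsigned long long r) {
    if (r > n) {
        return 0;
    }
    return falling_product(n, r) / falling_product(r, r);
}
```

For n = 52 and r = 5 the product is 311875200 and the quotient is 2598960. For r = 26 the product alone no longer fits in 64 bits, which is the limitation mentioned above.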

Answer 5

Answered by nhaa123

Well, I have to answer my own question. I was reading about Pascal's triangle and by accident noticed that we can calculate the number of combinations with it:

#include <iostream>
#include <boost/cstdint.hpp>

boost::uint64_t Combinations(unsigned int n, unsigned int r)
{
    if (r > n)
        return 0;
    if (r == 0)
        return 1;  // the loop below would otherwise return n

    /** We can use Pascal's triangle to determine the number
      * of combinations. Within row n, each entry follows from
      * the previous one:
      *
      * v(i) = v(i - 1) * (n - i + 1) / i
      *
      * Since the triangle is symmetrical, we only need to calculate
      * up to column r.
      */

    boost::uint64_t v = n--;

    for (unsigned int i = 2; i < r + 1; ++i, --n)
        v = v * n / i;

    return v;
}

int main()
{
    std::cout << Combinations(52, 5) << std::endl;
}

Answer 6

Answered by nhaa123

Getting the prime factorization of the binomial coefficient is probably the most efficient way to calculate it, especially if multiplication is expensive. This is certainly true of the related problem of calculating factorials (the original answer linked to an example here).

Here is a simple algorithm based on the Sieve of Eratosthenes that calculates the prime factorization. The idea is basically to go through the primes as you find them using the sieve, but then also to calculate how many of their multiples fall in the ranges [1, k] and [n-k+1, n]. The Sieve is essentially an O(n log log n) algorithm, but there is no multiplication done. The actual number of multiplications necessary once the prime factorization is found is at worst O(n log log n / log n), and there are probably faster ways than that.

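prime_factors = []

n = 20
k = 10

# 0 and 1 are not prime; everything else starts out unmarked
composite = [True] * 2 + [False] * n

for p in range(n + 1):
    if composite[p]:
        continue

    q = p
    m = 1
    total_prime_power = 0
    # prime_power[q] is the exponent of p in q, filled in for multiples of p
    prime_power = [0] * (n + 1)

    while True:
        # q == p * m, so q contains one more factor p than m does
        prime_power[q] = prime_power[m] + 1

        # denominator range [1, k]
        if q <= k:
            total_prime_power -= prime_power[q]

        # numerator range [n-k+1, n]
        if q > n - k:
            total_prime_power += prime_power[q]

        m += 1
        q += p

        if q > n:
            break

        composite[q] = True

    prime_factors.append([p, total_prime_power])

print(prime_factors)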
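For readers following the thread in C++, the same factorization can be sketched with a different bookkeeping technique: instead of tracking multiples during the sieve, the version below uses Legendre's formula for the exponent of a prime in m!. This is a substitution, not a port of the code above, and `factorial_exponent` and `binomial_factors` are hypothetical names. Primes with exponent 0 are omitted from the result.

```cpp
#include <utility>
#include <vector>

// Exponent of prime p in m! (Legendre's formula): sum of floor(m / p^i)
unsigned long long factorial_exponent(unsigned long long m, unsigned long long p) {
    unsigned long long e = 0;
    while (m > 0) {
        m /= p;
        e += m;
    }
    return e;
}

// Hypothetical helper: (prime, exponent) pairs of C(n, k), exponents > 0 only
std::vector<std::pair<unsigned long long, unsigned long long>>
binomial_factors(unsigned long long n, unsigned long long k) {
    std::vector<std::pair<unsigned long long, unsigned long long>> factors;
    if (k > n) {
        return factors;
    }
    std::vector<bool> composite(n + 1, false);
    for (unsigned long long p = 2; p <= n; ++p) {
        if (composite[p]) {
            continue;
        }
        // Sieve step: mark the multiples of p
        for (unsigned long long q = 2 * p; q <= n; q += p) {
            composite[q] = true;
        }
        // Exponent in n! minus the exponents in k! and (n - k)!
        unsigned long long e = factorial_exponent(n, p)
                             - factorial_exponent(k, p)
                             - factorial_exponent(n - k, p);
        if (e > 0) {
            factors.emplace_back(p, e);
        }
    }
    return factors;
}
```

For n = 20 and k = 10 this yields 2^2, 11, 13, 17 and 19, whose product is 184756.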

Answer 7

Answered by online6731

One of the shortest ways:

int nChoosek(int n, int k){
    if (k > n) return 0;
    if (k == 0) return 1;
    return nChoosek(n - 1, k) + nChoosek(n - 1, k - 1);
}

Answer 8

Answered by R.Falque

Using a dirty trick with a long double, it is possible to get the same accuracy as Howard Hinnant (and probably more):


#include <cmath>  // std::round

unsigned long long n_choose_k(int n, int k)
{
    if (k == 0) return 1;  // the loops below would otherwise return n
    long double f = n;
    for (int i = 1; i<k+1; i++)
        f /= i;
    for (int i=1; i<k; i++)
        f *= n - i;

    unsigned long long f_2 = std::round(f);

    return f_2;
}

The idea is to divide first by k! and then to multiply by n(n-1)...(n-k+1). The approximation through the double can be avoided by inverting the order of the for loop.


Answer 9

Answered by int3

If you want to be 100% sure that no overflows occur so long as the final result is within the numeric limit, you can sum up Pascal's Triangle row-by-row:


// assume prev_row and current_row are vectors of size n + 1, initialized to 0
for (int i=0; i<=n; i++) {
    for (int j=0; j<=i; j++) {
        if (j == 0) current_row[j] = 1;
        else current_row[j] = prev_row[j] + prev_row[j-1];
    }
    prev_row = current_row;
}
// result (n choose r) is now in current_row[r]

However, this algorithm is much slower than the multiplication one. So perhaps you could use multiplication to generate all the cases you know that are 'safe' and then use addition from there. (.. or you could just use a BigInt library).

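A self-contained version of the row-by-row idea might look like the sketch below. It is an illustration rather than the answer's exact code: `pascal_choose` is a hypothetical name, it keeps a single row updated in place from right to left, and it only tracks columns up to min(r, n - r), so every stored value is at most the final result. It does not detect overflow when the result itself does not fit.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: C(n, r) using only additions (Pascal's rule)
unsigned long long pascal_choose(std::size_t n, std::size_t r) {
    if (r > n) {
        return 0;
    }
    r = std::min(r, n - r);  // symmetry: C(n, r) == C(n, n - r)
    std::vector<unsigned long long> row(r + 1, 0);
    row[0] = 1;  // row 0 of the triangle
    for (std::size_t i = 1; i <= n; ++i) {
        // Update right to left so row[j - 1] still holds the previous row's value
        for (std::size_t j = std::min(i, r); j >= 1; --j) {
            row[j] += row[j - 1];
        }
    }
    return row[r];
}
```

The cost is roughly n * r additions, which matches the remark above that this approach is slower than the multiplicative one.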
