Note: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow post: http://stackoverflow.com/questions/28231743/

Time: 2020-08-28 20:55:55  Source: igfitidea

Self-unrolling macro loop in C/C++

Tags: c++, c, boost, macros, loop-unrolling

Asked by Karsten

I am currently working on a project where every cycle counts. While profiling my application, I discovered that the overhead of some inner loops is quite high, because they consist of just a few machine instructions. Additionally, the number of iterations in these loops is known at compile time.

So I thought that, instead of manually unrolling the loop with copy and paste, I could use macros to unroll it at compile time, so that it can easily be modified later.

What I imagine is something like this:

#define LOOP_N_TIMES(N, CODE) <insert magic here>

So that I can replace for (int i = 0; i < N; ++i) { do_stuff(); } with:

#define INNER_LOOP_COUNT 4
LOOP_N_TIMES(INNER_LOOP_COUNT, do_stuff();)

And it unrolls itself to:

do_stuff(); do_stuff(); do_stuff(); do_stuff();

Since the C preprocessor is still a mystery to me most of the time, I have no idea how to accomplish this, but I know it must be possible because Boost seems to have a BOOST_PP_REPEAT macro. Unfortunately, I can't use Boost for this project.

Answer by sehe

You can use templates to unroll. See the disassembly for the sample live on Godbolt.

But -funroll-loops has the same effect for this sample.

Live On Coliru

template <unsigned N> struct faux_unroll {
    template <typename F> static void call(F const& f) {
        f();
        faux_unroll<N-1>::call(f);
    }
};

template <> struct faux_unroll<0u> {
    template <typename F> static void call(F const&) {}
};

#include <cstdlib>
#include <ctime>     // std::time, needed to seed the generator
#include <iostream>

int main() {
    std::srand(static_cast<unsigned>(std::time(nullptr)));

    double r = 0;
    faux_unroll<10>::call([&] { r += 1.0/rand(); });

    std::cout << r;
}

Answer by M Oehm

You can use the pre-processor and play some tricks with token concatenation and multiple macro expansion, but you have to hard-code all possibilities:

#define M_REPEAT_1(X) X
#define M_REPEAT_2(X) X X
#define M_REPEAT_3(X) X X X
#define M_REPEAT_4(X) X X X X
#define M_REPEAT_5(X) X M_REPEAT_4(X)
#define M_REPEAT_6(X) M_REPEAT_3(X) M_REPEAT_3(X)

#define M_EXPAND(...) __VA_ARGS__

#define M_REPEAT__(N, X) M_EXPAND(M_REPEAT_ ## N)(X)
#define M_REPEAT_(N, X) M_REPEAT__(N, X)
#define M_REPEAT(N, X) M_REPEAT_(M_EXPAND(N), X)

And then expand it like this:

#define THREE 3

M_REPEAT(THREE, three();)
M_REPEAT(4, four();)
M_REPEAT(5, five();)
M_REPEAT(6, six();)

This method requires literal numbers as counts (or macros that expand to a literal number, like THREE above). You can't do something like this:

#define COUNT (N + 1)

M_REPEAT(COUNT, stuff();)

Answer by Persixty

There's no standard way of doing this.

Here's a slightly bonkers approach:

#include <stdio.h>

#define DO_THING printf("Shake it, Baby\n")
#define DO_THING_2 DO_THING; DO_THING
#define DO_THING_4 DO_THING_2; DO_THING_2
#define DO_THING_8 DO_THING_4; DO_THING_4
#define DO_THING_16 DO_THING_8; DO_THING_8
//And so on. Max loop size increases exponentially. But so does code size if you use them. 

void do_thing_25_times(void){
    //Binary for 25 is 11001
    DO_THING_16;//ONE
    DO_THING_8;//ONE
    //ZERO
    //ZERO
    DO_THING;//ONE
}

It's not too much to ask of an optimizer to eliminate dead code. In which case:

#define DO_THING_N(N) if(((N)&1)!=0){DO_THING;}\
    if(((N)&2)!=0){DO_THING_2;}\
    if(((N)&4)!=0){DO_THING_4;}\
    if(((N)&8)!=0){DO_THING_8;}\
    if(((N)&16)!=0){DO_THING_16;}

Answer by harper

You can't use a #define construct to calculate the "unroll-count". But with sufficient macros you can define this:

#include <stdio.h>

#define LOOP1(a) a
#define LOOP2(a) a LOOP1(a)
#define LOOP3(a) a LOOP2(a)

#define LOOPN(n,a) LOOP##n(a)

int main(void)
{
    LOOPN(3,printf("hello,world"););
}

Tested with VC2012

Answer by dmg

You can't write real recursive statements with macros, and I'm pretty sure you can't have real iteration in macros either.

However, you can take a look at Order. Although it is built entirely on top of the C preprocessor, it "implements" iteration-like functionality. It can actually run up to N iterations, where N is some large number. I'm guessing it's similar for "recursive" macros. Anyway, it is such a borderline case that few compilers support it (GCC is one of them, though).