C: How to align a pointer in C

Question

Asked by Mark

Is there a way to align a pointer in C? Suppose I'm writing data to an array stack (so the pointer goes downward) and I want the next data I write to be 4-aligned so the data is written at a memory location which is a multiple of 4, how would I do that?

I have

 uint8_t ary[1024];
 ary = ary+1024;
 ary -= /* ... */

Now suppose that ary points at location 0x05. I want it to point to 0x04. Now I could just do

ary -= (ary % 4);

but C doesn't allow modulo on pointers. Is there any solution that is architecture independent?

Answer 1

Answered by Jonathan Leffler

Arrays are NOT pointers, despite anything you may have read in misguided answers here (meaning this question in particular or Stack Overflow in general, or anywhere else).

You cannot alter the value represented by the name of an array as shown.

What is confusing, perhaps, is that if ary is a function parameter, it will appear that you can adjust the array:

void function(uint8_t ary[1024])
{
    ary += 213; // No problem because ary is a uint8_t pointer, not an array
    ...
}

Arrays as parameters to functions are different from arrays defined either outside a function or inside a function.

You can do:

uint8_t    ary[1024];
uint8_t   *stack = ary + 510;
uintptr_t  addr  = (uintptr_t)stack;

if (addr % 8 != 0)
    addr += 8 - addr % 8;
stack = (uint8_t *)addr;

This ensures that the value in stack is aligned on an 8-byte boundary, rounded up. Your question asks for rounding down to a 4-byte boundary, so the code changes to:

if (addr % 4 != 0)
    addr -= addr % 4;
stack = (uint8_t *)addr;
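
For the downward-growing stack in the question, the round-down step might be wrapped as follows. This is a minimal sketch, not part of the original answer: the name reserve_aligned4 and its parameters are hypothetical, and it subtracts the remainder from the pointer itself rather than converting the adjusted integer back to a pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: move `sp` down by `size` bytes, then round the
 * result down to a multiple of 4. The caller is assumed to keep the
 * result inside the underlying array. */
static uint8_t *reserve_aligned4(uint8_t *sp, size_t size)
{
    assert(sp != NULL);
    sp -= size;                       /* the stack grows downward */
    sp -= (uintptr_t)sp % 4;          /* drop 0..3 bytes to reach a multiple of 4 */
    return sp;
}
```

Starting from `ary + 1024` and reserving 5 bytes, the returned pointer lies at most 8 bytes below the top of the array, so the array name itself is never modified.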

Yes, you can do that with bit masks too. Either:

addr = (addr + (8 - 1)) & -8;  // Round up to 8-byte boundary

or:

addr &= -4;                    // Round down to a 4-byte boundary

This only works correctly if the alignment (the 8 or 4 above) is a power of two, not for arbitrary values. The code with modulus operations will work correctly for any (positive) modulus.
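
The difference between the two approaches can be seen with a non-power-of-two alignment. The helpers below are an illustrative sketch; the names round_down_mod and round_down_mask are hypothetical and do not come from the answer.

```c
#include <assert.h>
#include <stdint.h>

/* Round down with modulus: valid for any positive `align`. */
static uintptr_t round_down_mod(uintptr_t addr, uintptr_t align)
{
    assert(align > 0);
    return addr - addr % align;
}

/* Round down with a mask: only meaningful when `align` is a power of two. */
static uintptr_t round_down_mask(uintptr_t addr, uintptr_t align)
{
    assert(align > 0);
    return addr & -align;
}
```

With an alignment of 8, both helpers map 29 to 24. With an alignment of 12, the modulus version still gives 24, while the mask version gives 20, which is not a multiple of 12.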

See also: How to allocate aligned memory using only the standard library.

Demo code

Gnzlbg commented:

The code for a power of two breaks if I try to align e.g. uintptr_t(2) up to a 1 byte boundary (both are powers of 2: 2^1 and 2^0). The result is 1 but should be 2 since 2 is already aligned to a 1 byte boundary.

This code demonstrates that the alignment code is OK — as long as you interpret the comments just above correctly (now clarified by the 'either or' words separating the bit masking operations; I got caught when first checking the code).

The alignment functions could be written more compactly, especially without the assertions, but the compiler will optimize to produce the same code from what is written and what could be written. Some of the assertions could be made more stringent, too. And maybe the test function should print out the base address of the stack before doing anything else.

The code could, and maybe should, check that there won't be numeric overflow or underflow with the arithmetic. This would be more likely a problem if you aligned addresses to a multi-megabyte boundary; while you keep to alignments under 1 KiB, you're unlikely to find a problem if you're not attempting to go out of bounds of the arrays you have access to. (Strictly, even if you do multi-megabyte alignments, you won't run into trouble if the result will be within the range of memory allocated to the array you're manipulating.)
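
One way such an overflow check might look for the round-up case is sketched below. The function align_up_checked and its out-parameter convention are assumptions made for illustration; they are not part of the demo code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: round `addr` up to `align` (a power of two) and
 * report failure instead of wrapping past UINTPTR_MAX. */
static bool align_up_checked(uintptr_t addr, uintptr_t align, uintptr_t *out)
{
    assert(out != NULL);
    if (align == 0 || (align & (align - 1)) != 0)
        return false;                     /* not a power of two */
    uintptr_t pad = -addr & (align - 1);  /* bytes needed to reach the boundary */
    if (pad > UINTPTR_MAX - addr)
        return false;                     /* addr + pad would wrap around */
    *out = addr + pad;
    return true;
}
```

Computing the padding first and comparing it against the remaining headroom avoids performing the addition that might wrap.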

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
** Because the test code works with pointers to functions, the inline
** function qualifier is moot.  In 'real' code using the functions, the
** inline might be useful.
*/

/* Align upwards - arithmetic mode (hence _a) */
static inline uint8_t *align_upwards_a(uint8_t *stack, uintptr_t align)
{
    assert(align > 0 && (align & (align - 1)) == 0); /* Power of 2 */
    assert(stack != 0);

    uintptr_t addr  = (uintptr_t)stack;
    if (addr % align != 0)
        addr += align - addr % align;
    assert(addr >= (uintptr_t)stack);
    return (uint8_t *)addr;
}

/* Align upwards - bit mask mode (hence _b) */
static inline uint8_t *align_upwards_b(uint8_t *stack, uintptr_t align)
{
    assert(align > 0 && (align & (align - 1)) == 0); /* Power of 2 */
    assert(stack != 0);

    uintptr_t addr  = (uintptr_t)stack;
    addr = (addr + (align - 1)) & -align;   // Round up to align-byte boundary
    assert(addr >= (uintptr_t)stack);
    return (uint8_t *)addr;
}

/* Align downwards - arithmetic mode (hence _a) */
static inline uint8_t *align_downwards_a(uint8_t *stack, uintptr_t align)
{
    assert(align > 0 && (align & (align - 1)) == 0); /* Power of 2 */
    assert(stack != 0);

    uintptr_t addr  = (uintptr_t)stack;
    addr -= addr % align;
    assert(addr <= (uintptr_t)stack);
    return (uint8_t *)addr;
}

/* Align downwards - bit mask mode (hence _b) */
static inline uint8_t *align_downwards_b(uint8_t *stack, uintptr_t align)
{
    assert(align > 0 && (align & (align - 1)) == 0); /* Power of 2 */
    assert(stack != 0);

    uintptr_t addr  = (uintptr_t)stack;
    addr &= -align;                         // Round down to align-byte boundary
    assert(addr <= (uintptr_t)stack);
    return (uint8_t *)addr;
}

static inline int inc_mod(int x, int n)
{
    assert(x >= 0 && x < n);
    if (++x >= n)
        x = 0;
    return x;
}

typedef uint8_t *(*Aligner)(uint8_t *addr, uintptr_t align);

static void test_aligners(const char *tag, Aligner align_a, Aligner align_b)
{
    const int align[] = { 64, 32, 16, 8, 4, 2, 1 };
    enum { NUM_ALIGN = sizeof(align) / sizeof(align[0]) };
    uint8_t stack[1024];
    uint8_t *sp = stack + sizeof(stack);
    int dec = 1;
    int a_idx = 0;

    printf("%s\n", tag);
    while (sp > stack)
    {
        sp -= dec++;
        uint8_t *sp_a = (*align_a)(sp, align[a_idx]);
        uint8_t *sp_b = (*align_b)(sp, align[a_idx]);
        printf("old %p, adj %.2d, A %p, B %p\n",
               (void *)sp, align[a_idx], (void *)sp_a, (void *)sp_b);
        assert(sp_a == sp_b);
        sp = sp_a;
        a_idx = inc_mod(a_idx, NUM_ALIGN);
    }
    putchar('\n');
}

int main(void)
{
    test_aligners("Align upwards", align_upwards_a, align_upwards_b);
    test_aligners("Align downwards", align_downwards_a, align_downwards_b);
    return 0;
}

Sample output (partially truncated):

Align upwards
old 0x7fff5ebcf4af, adj 64, A 0x7fff5ebcf4c0, B 0x7fff5ebcf4c0
old 0x7fff5ebcf4be, adj 32, A 0x7fff5ebcf4c0, B 0x7fff5ebcf4c0
old 0x7fff5ebcf4bd, adj 16, A 0x7fff5ebcf4c0, B 0x7fff5ebcf4c0
old 0x7fff5ebcf4bc, adj 08, A 0x7fff5ebcf4c0, B 0x7fff5ebcf4c0
old 0x7fff5ebcf4bb, adj 04, A 0x7fff5ebcf4bc, B 0x7fff5ebcf4bc
old 0x7fff5ebcf4b6, adj 02, A 0x7fff5ebcf4b6, B 0x7fff5ebcf4b6
old 0x7fff5ebcf4af, adj 01, A 0x7fff5ebcf4af, B 0x7fff5ebcf4af
old 0x7fff5ebcf4a7, adj 64, A 0x7fff5ebcf4c0, B 0x7fff5ebcf4c0
old 0x7fff5ebcf4b7, adj 32, A 0x7fff5ebcf4c0, B 0x7fff5ebcf4c0
old 0x7fff5ebcf4b6, adj 16, A 0x7fff5ebcf4c0, B 0x7fff5ebcf4c0
old 0x7fff5ebcf4b5, adj 08, A 0x7fff5ebcf4b8, B 0x7fff5ebcf4b8
old 0x7fff5ebcf4ac, adj 04, A 0x7fff5ebcf4ac, B 0x7fff5ebcf4ac
old 0x7fff5ebcf49f, adj 02, A 0x7fff5ebcf4a0, B 0x7fff5ebcf4a0
old 0x7fff5ebcf492, adj 01, A 0x7fff5ebcf492, B 0x7fff5ebcf492
…
old 0x7fff5ebcf0fb, adj 08, A 0x7fff5ebcf100, B 0x7fff5ebcf100
old 0x7fff5ebcf0ca, adj 04, A 0x7fff5ebcf0cc, B 0x7fff5ebcf0cc
old 0x7fff5ebcf095, adj 02, A 0x7fff5ebcf096, B 0x7fff5ebcf096

Align downwards
old 0x7fff5ebcf4af, adj 64, A 0x7fff5ebcf480, B 0x7fff5ebcf480
old 0x7fff5ebcf47e, adj 32, A 0x7fff5ebcf460, B 0x7fff5ebcf460
old 0x7fff5ebcf45d, adj 16, A 0x7fff5ebcf450, B 0x7fff5ebcf450
old 0x7fff5ebcf44c, adj 08, A 0x7fff5ebcf448, B 0x7fff5ebcf448
old 0x7fff5ebcf443, adj 04, A 0x7fff5ebcf440, B 0x7fff5ebcf440
old 0x7fff5ebcf43a, adj 02, A 0x7fff5ebcf43a, B 0x7fff5ebcf43a
old 0x7fff5ebcf433, adj 01, A 0x7fff5ebcf433, B 0x7fff5ebcf433
old 0x7fff5ebcf42b, adj 64, A 0x7fff5ebcf400, B 0x7fff5ebcf400
old 0x7fff5ebcf3f7, adj 32, A 0x7fff5ebcf3e0, B 0x7fff5ebcf3e0
old 0x7fff5ebcf3d6, adj 16, A 0x7fff5ebcf3d0, B 0x7fff5ebcf3d0
old 0x7fff5ebcf3c5, adj 08, A 0x7fff5ebcf3c0, B 0x7fff5ebcf3c0
old 0x7fff5ebcf3b4, adj 04, A 0x7fff5ebcf3b4, B 0x7fff5ebcf3b4
old 0x7fff5ebcf3a7, adj 02, A 0x7fff5ebcf3a6, B 0x7fff5ebcf3a6
old 0x7fff5ebcf398, adj 01, A 0x7fff5ebcf398, B 0x7fff5ebcf398
…
old 0x7fff5ebcf0f7, adj 01, A 0x7fff5ebcf0f7, B 0x7fff5ebcf0f7
old 0x7fff5ebcf0d3, adj 64, A 0x7fff5ebcf0c0, B 0x7fff5ebcf0c0
old 0x7fff5ebcf09b, adj 32, A 0x7fff5ebcf080, B 0x7fff5ebcf080

Answer 2

Answered by Mark

DO NOT USE MODULO!!! IT IS REALLY SLOW!!! Hands down the fastest way to align a pointer is to use 2's complement math. You need to invert the bits, add one, and mask the result so that only the 2 (for 32-bit) or 3 (for 64-bit) least significant bits remain. The result is an offset that you then add to the pointer value to align it. Works great for 32 and 64-bit numbers. For 16-bit alignment just mask the pointer with 0x1 and add that value. The algorithm works identically in any language but as you can see, Embedded C++ is vastly superior to C in every way, shape, and form.

#include <cstdint>
/** Returns the number to add to align the given pointer to a 8, 16, 32, or 64-bit 
    boundary.
    @author Cale McCollough.
    @param  ptr The address to align.
    @return The offset to add to the ptr to align it. */
template<typename T>
inline uintptr_t MemoryAlignOffset (const void* ptr) {
    return ((~reinterpret_cast<uintptr_t> (ptr)) + 1) & (sizeof (T) - 1);
}

/** Word aligns the given byte pointer up in addresses.
    @author Cale McCollough.
    @param ptr Pointer to align.
    @return Next word aligned up pointer. */
template<typename T>
inline T* MemoryAlign (T* ptr) {
    uintptr_t offset = MemoryAlignOffset<uintptr_t> (ptr);
    char* aligned_ptr = reinterpret_cast<char*> (ptr) + offset;
    return reinterpret_cast<T*> (aligned_ptr);
}
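
Since the question asks about C, the same offset computation can be sketched in plain C. The function name align_offset and the explicit align parameter are assumptions for illustration; the templates above take the alignment from sizeof instead.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes to add to `addr` to reach the next multiple of `align`
 * (a power of two); 0 if `addr` is already aligned. */
static uintptr_t align_offset(uintptr_t addr, uintptr_t align)
{
    assert(align > 0 && (align & (align - 1)) == 0);
    return (~addr + 1) & (align - 1);   /* ~addr + 1 is the two's complement negation of addr */
}
```

For a power-of-two alignment this yields the same value as `(align - addr % align) % align`, so it is interchangeable with the arithmetic form shown in Answer 1.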

For detailed write up and proofs please @see https://github.com/kabuki-starship/kabuki-toolkit/wiki/Fastest-Method-to-Align-Pointers. If you would like to see proof of why you should never use modulo, I invented the world fastest integer-to-string algorithm. The benchmark on the paper shows you the effect of optimizing away just one modulo instruction. Please @see https://github.com/kabuki-starship/kabuki-toolkit/wiki/Engineering-a-Faster-Integer-to-String-Algorithm.

Answer 3

Answered by Ebrahim Byagowi

Based on tricks learned elsewhere and one from reading @par's answer, apparently all I needed for my special case, which is for a 32-bit-like machine, is ((size - 1) | 3) + 1, which acts like this; I thought it might be useful for others:

for (size_t size = 0; size < 20; ++size) printf("%zu\n", ((size - 1) | 3) + 1);

0
4
4
4
4
8
8
8
8
12
12
12
12
16
16
16
16
20
20
20
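
The expression generalizes from 4 to any power-of-two alignment by replacing the constant 3 with align - 1. The sketch below assumes a hypothetical helper name, round_up_size, which does not appear in the answer.

```c
#include <assert.h>
#include <stddef.h>

/* Round `size` up to the next multiple of `align` (a power of two). */
static size_t round_up_size(size_t size, size_t align)
{
    assert(align > 0 && (align & (align - 1)) == 0);
    /* For size == 0 the subtraction wraps to SIZE_MAX and the final + 1
     * wraps back to 0, which is well defined for unsigned types. */
    return ((size - 1) | (align - 1)) + 1;
}
```

With align set to 4 this reproduces the table above: 0 stays 0, 1 through 4 become 4, 5 through 8 become 8, and so on.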

Answer 4

Answered by mr NAE

For some reason I can't use modulo or bitwise operations. In this case:

void *alignAddress = (void*)((((intptr_t)address + align - 1) / align) * align) ;

For C++:

template <int align, typename T>
constexpr T padding(T value)
{
    return ((value + align - 1) / align) * align;
}
...
char* alignAddress = reinterpret_cast<char*>(padding<8>(reinterpret_cast<uintptr_t>(address)));
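
The same divide-and-multiply rounding can be sketched as a small C function. The name round_up_div is hypothetical, and the sketch uses uintptr_t rather than intptr_t so that the arithmetic stays unsigned.

```c
#include <assert.h>
#include <stdint.h>

/* Round `addr` up to a multiple of `align` using only +, -, / and *.
 * `align` may be any positive value, not just a power of two.
 * Assumes addr + align - 1 does not exceed UINTPTR_MAX. */
static uintptr_t round_up_div(uintptr_t addr, uintptr_t align)
{
    assert(align > 0);
    return ((addr + align - 1) / align) * align;
}
```

Because it relies on integer division rather than a mask, this form also handles alignments such as 12, at the cost of a division.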

Answer 5

Answered by par

I'm editing this answer because:

I had a bug in my original code (I forgot a typecast to intptr_t), and
I'm replying to Jonathan Leffler's criticism in order to clarify my intent.

The code below is not meant to imply you can change the value of an array (foo). But you can get an aligned pointer into that array, and this example illustrates one way to do it.

#define         alignmentBytes              ( 1 << 2 )   // == 4, but enforces the idea that alignmentBytes should be a power of two
#define         alignmentBytesMinusOne      ( alignmentBytes - 1 )

uint8_t         foo[ 1024 + alignmentBytesMinusOne ];
uint8_t         *fooAligned;

fooAligned = (uint8_t *)((intptr_t)( foo + alignmentBytesMinusOne ) & ~alignmentBytesMinusOne);
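
A variation on the same idea is sketched below: it computes how many bytes to skip and indexes into the array, instead of casting the masked integer back to a pointer. The macro ALIGNMENT and the function foo_aligned are hypothetical names chosen for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ALIGNMENT 4u   /* assumed to be a power of two */

/* Backing store with ALIGNMENT - 1 spare bytes, so 1024 aligned bytes always fit. */
static uint8_t foo[1024 + ALIGNMENT - 1];

/* Return the first ALIGNMENT-aligned address inside foo. */
static uint8_t *foo_aligned(void)
{
    uintptr_t addr = (uintptr_t)foo;
    size_t skip = (size_t)(-addr & (ALIGNMENT - 1u));   /* 0 .. ALIGNMENT - 1 */
    assert(skip < ALIGNMENT);
    return foo + skip;
}
```

The spare bytes in the declaration are what guarantee that the 1024 bytes after the returned pointer still lie inside foo.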
