Linux Aligning to cache line and knowing the cache line size
Notice: this page is a Chinese-English parallel translation of a popular StackOverFlow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow
Original URL: http://stackoverflow.com/questions/7281699/
Asked by MetallicPriest
To prevent false sharing, I want to align each element of an array to a cache line. First, I need to know the cache line size so that I can give each element that many bytes. Second, I want the start of the array to be aligned to a cache line.
I am using Linux on an 8-core x86 platform. First, how do I find the cache line size? Second, how do I align to a cache line in C? I am using the gcc compiler.
For example, assuming a cache line size of 64, the layout would be as follows:
element[0] occupies bytes 0-63
element[1] occupies bytes 64-127
element[2] occupies bytes 128-191
and so on, assuming of course that bytes 0-63 are aligned to a cache line.
Accepted answer by Necrolis
To know the sizes, you need to look them up in the documentation for the processor; as far as I know there is no portable programmatic way to do it. On the plus side, most cache lines are of a standard size, based on Intel's standards. On x86, cache lines are 64 bytes. However, to prevent false sharing you need to follow the guidelines of the processor you are targeting (Intel has some special notes on its NetBurst-based processors). Generally you need to align to 64 bytes for this (Intel states that you should also avoid crossing 16-byte boundaries).
To do this in C or C++, use the standard aligned_alloc function or one of the compiler-specific specifiers such as __attribute__((aligned(64))) (GCC) or __declspec(align(64)) (MSVC). To pad between members in a struct so that they land on different cache lines, insert a member big enough to push the next member to the next 64-byte boundary.
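As a minimal sketch of the GCC specifier mentioned above, the following aligns a struct type to an assumed 64-byte line so that each array element gets its own line. The names CACHE_LINE, struct counter, counters and is_line_aligned are hypothetical and chosen only for illustration.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

#define CACHE_LINE 64 /* assumed line size; check your target CPU */

/* Aligning the type makes GCC round sizeof up to a multiple of the
   alignment, so consecutive array elements start 64 bytes apart. */
struct counter {
    uint64_t value;
} __attribute__((aligned(CACHE_LINE)));

static_assert(sizeof(struct counter) == CACHE_LINE,
              "each element should fill exactly one cache line");

/* One counter per core (8 cores, as in the question). */
static struct counter counters[8];

/* Returns nonzero when p sits on a cache line boundary. */
static int is_line_aligned(const void *p)
{
    return ((uintptr_t)p % CACHE_LINE) == 0;
}
```

With this layout, counters[0] occupies bytes 0-63 of the array, counters[1] occupies bytes 64-127, and so on, and the array itself starts on a line boundary because the type carries the alignment.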
Answer by MetallicPriest
posix_memalign or valloc can be used to align allocated memory to a cache line. Note that valloc aligns to the page size, which also satisfies cache line alignment, and that the Linux man pages mark it as obsolete.
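A short sketch of the posix_memalign route, assuming a 64-byte line. The helper name alloc_line_slots and the CACHE_LINE constant are hypothetical; the feature-test macro is only there for strict -std=c11 builds.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L /* expose posix_memalign under strict ISO C */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHE_LINE 64 /* assumed line size */

/* Allocates `count` slots of CACHE_LINE bytes each, with the first slot
   starting on a cache line boundary. Returns NULL on failure.
   Release the memory with free(). */
static void *alloc_line_slots(size_t count)
{
    void *p = NULL;

    /* posix_memalign returns 0 on success and an error number otherwise. */
    if (posix_memalign(&p, CACHE_LINE, count * CACHE_LINE) != 0)
        return NULL;
    assert((uintptr_t)p % CACHE_LINE == 0);
    return p;
}
```

A caller would typically cast the result to a pointer to a 64-byte element type and index it like an ordinary array.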
Answer by Maxim Egorushkin
"I am using Linux and 8-core x86 platform. First how do I find the cache line size."
$ getconf LEVEL1_DCACHE_LINESIZE
64
Pass the value as a macro definition to the compiler:
$ gcc -DLEVEL1_DCACHE_LINESIZE=`getconf LEVEL1_DCACHE_LINESIZE` ...
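On the C side, the macro passed on the command line might be consumed as in the sketch below. The fallback to 64 is an assumption, added because getconf can report 0 on systems where the value is unknown; struct slot is a hypothetical name.

```c
#include <assert.h>   /* static_assert (C11) */

/* Use 64 when the build did not pass -DLEVEL1_DCACHE_LINESIZE=...
   or when getconf reported 0 (value unknown). */
#if !defined(LEVEL1_DCACHE_LINESIZE) || LEVEL1_DCACHE_LINESIZE <= 0
#undef LEVEL1_DCACHE_LINESIZE
#define LEVEL1_DCACHE_LINESIZE 64
#endif

/* Aligning the first member aligns the whole struct and pads its size
   up to a multiple of the line size. */
struct slot {
    _Alignas(LEVEL1_DCACHE_LINESIZE) long value;
};

static_assert(_Alignof(struct slot) == LEVEL1_DCACHE_LINESIZE,
              "struct should be aligned to the cache line");
static_assert(sizeof(struct slot) % LEVEL1_DCACHE_LINESIZE == 0,
              "array elements should not share a cache line");
```

Keep in mind that the value is baked in at build time, so a binary built on one machine carries that machine's line size to wherever it runs.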
At run time, sysconf(_SC_LEVEL1_DCACHE_LINESIZE) can be used to get the L1 data cache line size.
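A minimal run-time sketch follows. _SC_LEVEL1_DCACHE_LINESIZE is a glibc extension rather than part of POSIX, and the call can return 0 or -1 when the value is unknown, so the hypothetical helper cache_line_size takes a fallback supplied by the caller.

```c
#include <assert.h>
#include <unistd.h>

/* Returns the L1 data cache line size in bytes, or `fallback` when the
   system cannot report it. */
static long cache_line_size(long fallback)
{
    long n = -1;

    assert(fallback > 0);
#ifdef _SC_LEVEL1_DCACHE_LINESIZE   /* glibc extension, not in POSIX */
    n = sysconf(_SC_LEVEL1_DCACHE_LINESIZE);
#endif
    return n > 0 ? n : fallback;
}
```

For example, cache_line_size(64) can feed the alignment argument of an aligned allocation function when the size must be decided at run time.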
Answer by Mysticial
There's no completely portable way to get the cache line size. But if you're on x86/x86-64, you can use the cpuid instruction to get everything you need to know about the cache, including size, cache line size, how many levels, etc.
http://softpixel.com/~cwright/programming/simd/cpuid.php
(Scroll down a little; the page is about SIMD, but it has a section on getting the cache line size.)
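One way to query cpuid from C with GCC or Clang is the <cpuid.h> helper, sketched below. This is not taken from the linked page: it reads the CLFLUSH line size from leaf 1, which is an assumption about which cpuid field is wanted. The function name cpuid_line_size is hypothetical, and the code returns 0 on non-x86 targets.

```c
#include <assert.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang helper, x86 only */
#endif

/* Returns the CLFLUSH line size in bytes, or 0 when it is unavailable. */
static unsigned cpuid_line_size(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned eax, ebx, ecx, edx;

    /* Leaf 1: EDX bit 19 says CLFLUSH exists; EBX[15:8] holds the line
       size counted in 8-byte units. */
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx) && (edx & (1u << 19)))
        return ((ebx >> 8) & 0xffu) * 8;
#endif
    return 0;
}
```

Per-level details such as cache size and associativity live in other leaves (leaf 4 on Intel), which this sketch does not touch.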
As for aligning your data structures, there's also no completely portable way to do it. GCC and VS10 have different ways to specify the alignment of a struct. One way to "hack" it is to pad your struct with unused variables until it matches the alignment you want.
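The padding "hack" could look like the sketch below, which uses only standard C. The struct and member names are hypothetical and a 64-byte line is assumed. Padding alone only separates the members; the struct instance itself still has to start on a line boundary for each member to own a full line.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */

#define CACHE_LINE 64 /* assumed line size */

struct worker_state {
    long hits;                            /* written by thread A */
    char pad[CACHE_LINE - sizeof(long)];  /* unused filler up to byte 64 */
    long misses;                          /* written by thread B */
};

static_assert(offsetof(struct worker_state, misses) == CACHE_LINE,
              "misses should start one cache line after hits");
```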
To align your malloc() calls, all the mainstream compilers also provide aligned malloc functions for that purpose.
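With a C11 library such as glibc, the standard aligned_alloc is one such function (MSVC offers _aligned_malloc instead). A small sketch, assuming a 64-byte line and using the hypothetical helper name alloc_aligned_bytes:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHE_LINE 64 /* assumed line size */

/* Allocates at least `bytes` bytes starting on a cache line boundary.
   Returns NULL on failure. Release the memory with free(). */
static void *alloc_aligned_bytes(size_t bytes)
{
    /* C11 asks for a size that is a multiple of the alignment,
       so round the request up to whole cache lines. */
    size_t rounded = (bytes + CACHE_LINE - 1) / CACHE_LINE * CACHE_LINE;

    return aligned_alloc(CACHE_LINE, rounded);
}
```

Rounding up also means the tail of the allocation never shares a line with a neighboring heap block.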
Answer by Francesquini
Another simple way is to just read /proc/cpuinfo:
cat /proc/cpuinfo | grep cache_alignment
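The same lookup can be done from a C program by scanning the file, as in this sketch. The function name cpuinfo_cache_alignment is hypothetical, and the cache_alignment field may be absent on some architectures or in restricted environments, in which case the function returns -1.

```c
#include <assert.h>
#include <stdio.h>

/* Returns the first cache_alignment value found in /proc/cpuinfo,
   or -1 when the file or the field is missing. */
static int cpuinfo_cache_alignment(void)
{
    FILE *f = fopen("/proc/cpuinfo", "r");
    char line[512];
    int value = -1;

    if (f == NULL)
        return -1;
    while (fgets(line, sizeof line, f) != NULL) {
        /* Lines look like "cache_alignment<TAB>: 64"; the space in the
           format matches any run of whitespace, including none. */
        if (sscanf(line, "cache_alignment : %d", &value) == 1)
            break;
    }
    fclose(f);
    return value;
}
```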
Answer by Nick Strupat
If anyone is curious about how to do this easily in C++, I've built a library with a CacheAligned<T> class, which handles determining the cache line size as well as the alignment for your T object, referenced by calling .Ref() on your CacheAligned<T> object. You can also use Aligned<typename T, size_t Alignment> if you know the cache line size beforehand, or just want to stick with the very common value of 64 (bytes).
Answer by zoecarver
Here's a table I made that has most Arm/Intel processors on it. You can use it for reference when defining constants, so that you don't have to generalize the cache line size for all architectures.
For C++, hopefully we will soon see hardware interference size (std::hardware_destructive_interference_size), which should be an accurate way to get this information (assuming you tell the compiler your target architecture).