Original question: http://stackoverflow.com/questions/11277984/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
How to flush the CPU cache in Linux from a C program?
Asked by Rose BEck
I am writing a C program in which I need to flush my memory. I would like to know if there is any UNIX system command to flush the CPU cache.
This is a requirement for my project which involves calculating the time taken for my logic.
I have read about the cacheflush(char *s, int a, int b) function, but I am not sure whether it is suitable or what to pass as its parameters.
Answered by paulsm4
I take it you mean "CPU cache", not memory cache.
The link above is good: the suggestion "write a lot of data via CPU" is not Windows-specific.
Here's another variation on the same theme:
Here's an article about Linux and CPU cache:
NOTE:
At this (very, very low) level, "Linux" != "Unix"
Answered by phonetagger
If you're writing a user-mode (not kernel-mode) program, and if it's single-threaded, then there's really no reason for you to ever bother flushing your cache in the first place. Your user-mode program can just forget that it even exists; it's just there to speed up your program's execution, and the OS manages it via the processor's MMU.
There are only a couple of reasons I can think of that you might actually want to flush the cache from your user-mode application:
- Your app is intended to run on a symmetric multiprocessor system, or has data transactions with external hardware.
- You're simply testing your cache for some sort of performance test (in which case you should probably be writing your test to operate in kernel mode, perhaps as a driver).
In any case, assuming you're using Linux...
#include <asm/cachectl.h>
int cacheflush(char *addr, int nbytes, int cache);
This assumes you have a block of memory you just wrote to and you want to make sure it's flushed out of the cache back to main memory. The block begins at addr, and it's nbytes long, and it's in one of the two caches (or both):
ICACHE Flush the instruction cache.
DCACHE Write back to memory and invalidate the affected valid cache lines.
BCACHE Same as (ICACHE|DCACHE).
Normally you'd only need to flush the DCACHE, since when you write data to "memory" (i.e. to the cache), it's normally data, not instructions.
If you want to flush "all of the cache" for some strange testing reason, you could malloc() a big block that you know is larger than your CPU's cache (shoot, make it 8 times as big!), write any old garbage into it, and just flush that entire block.
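The "big block of garbage" idea can be sketched roughly as follows. This is only an illustration: the helper name evict_cache_by_overwrite and its cache_bytes parameter are hypothetical, and the caller has to supply the cache size (it is an assumption, not something the code detects). The sketch performs only the overwrite pass; on a platform that provides cacheflush(), the flush of the block would follow the loop.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: overwrites a scratch buffer 8x the given cache size so
 * that lines cached earlier are likely to be evicted.
 * cache_bytes is an ASSUMPTION supplied by the caller (for example the
 * last-level cache size reported by lscpu).
 * Returns 0 on success, -1 on a bad size or failed allocation. */
int evict_cache_by_overwrite(size_t cache_bytes)
{
    if (cache_bytes == 0 || cache_bytes > SIZE_MAX / 8)
        return -1;

    const size_t len = cache_bytes * 8;
    volatile unsigned char *buf = malloc(len);
    if (buf == NULL)
        return -1;

    /* volatile stores cannot be optimized away by the compiler */
    for (size_t i = 0; i < len; i++)
        buf[i] = (unsigned char)i;

    /* read one byte back as a sanity check that the stores happened */
    unsigned char last = buf[len - 1];
    free((void *)buf);
    assert(last == (unsigned char)(len - 1));
    return 0;
}
```

A call such as evict_cache_by_overwrite(8u * 1024u * 1024u) would touch 64 MiB; whether that really displaces everything depends on the cache's size and replacement policy, so treat it as a best-effort approach.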
See also: How to perform cache operations in C++?
Answered by phonetagger
OK, sorry about my first answer. I later read your follow-up comments below your question, so I realize now that you want to flush the INSTRUCTION CACHE to boot your program (or parts of it) out of the cache, so that when you test its performance, you also test its initial load time out of main memory into the instruction cache. Do you also need to flush any data your code will use out to main memory, so that both data and code are fresh loads?
Before anything else, I'd like to mention that main memory itself is also a form of cache, with your hard disk (either the program on disk, or swap space on disk) being the lowest, slowest place your program's instructions could be coming from. That said, the first time you run through a routine, if it hasn't already been loaded into main memory from disk by virtue of being near other code that has already executed, then its CPU instructions will first have to be loaded from disk. That takes at least an order of magnitude longer than loading it from main memory into the cache. Then, once it's loaded into main memory, it takes roughly an order of magnitude longer to load from main memory into the cache than it takes to load from the cache into the CPU's instruction fetcher. So if you want to test your code's cold-start performance, you have to decide what cold-start means: pulling it out of disk, or pulling it out of main memory. I don't know of any command to "flush" instructions/data out of main memory to swap space, so flushing it out to main memory is about as much as you can do (that I know of). Keep in mind that your test results may still differ between the first run (when the code may be pulled off disk) and subsequent runs, even if you do flush the instruction cache.
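Because the first run may differ from later ones, it can help to record each run's time separately instead of averaging them. A minimal sketch, assuming POSIX clock_gettime() is available; time_runs and work_under_test are hypothetical names, and work_under_test merely stands in for the real code being measured:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L
#endif
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for the code under test. */
static long work_under_test(void)
{
    volatile long sum = 0;
    for (long i = 0; i < 10000; i++)
        sum += i;
    return sum;
}

/* Calls fn() `runs` times and stores the elapsed nanoseconds of each
 * individual run in out_ns[0..runs-1]. */
void time_runs(long (*fn)(void), int runs, long long *out_ns)
{
    assert(fn != NULL && runs > 0 && out_ns != NULL);
    for (int r = 0; r < runs; r++) {
        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        (void)fn();
        clock_gettime(CLOCK_MONOTONIC, &t1);
        out_ns[r] = (long long)(t1.tv_sec - t0.tv_sec) * 1000000000LL
                  + (long long)(t1.tv_nsec - t0.tv_nsec);
    }
}
```

Comparing out_ns[0] with the later entries gives a rough view of how much of the first run is load cost, whether from disk or from main memory.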
Now, how would one go about flushing the instruction cache to ensure that their own code is flushed out to main memory?
If I needed to do this (a very odd thing to do, in my opinion), I'd probably start by finding the length and approximate placement of my functions in memory. Since I'm using Linux, I'd issue the command "objdump -d {myprogram} > myprogram.dump.txt", then open myprogram.dump.txt in an editor, search for the functions I want to flush out, and figure out how long they are by subtracting their start address from their end address using a hex calculator. I'd write down the size of each. Later I'd add cacheflush() calls in my code, passing the address of each function I want to flush out as 'addr', the length I found as 'nbytes', and ICACHE as 'cache'. Just for safety I'd probably fudge a little and add about 10% to the size, in case I make a few tweaks to the code and forget to adjust nbytes. I'd make a call to cacheflush() like this for each function I want to flush out. Then, if I need to flush out the data as well, and it's global/static data, I can flush that too (DCACHE). If it's stack or heap data, though, there's really nothing realistic that I can (or should) do to flush it out of cache. Trying to do so would be an exercise in silliness, because it would create a condition that would never, or only very rarely, exist in normal execution. Assuming you're using Linux...
#include <asm/cachectl.h>
int cacheflush(char *addr, int nbytes, int cache);
...where cache is one of:
ICACHE Flush the instruction cache.
DCACHE Write back to memory and invalidate the affected valid cache lines.
BCACHE Same as (ICACHE|DCACHE).
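The size arithmetic described above (end address minus start address, plus about 10% of slack) can be sketched as a small helper. The name padded_flush_length is hypothetical, and the addresses in the comments are made-up values of the kind one might read out of an objdump listing:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the byte length of [start, end) plus roughly 10% slack
 * (the slack is rounded up to a whole byte). */
size_t padded_flush_length(uintptr_t start, uintptr_t end)
{
    assert(end >= start);
    size_t len = (size_t)(end - start);
    return len + (len + 9) / 10;
}

/* Illustration with made-up addresses:
 *   start = 0x401136, end = 0x4011a2  ->  length 0x6c = 108 bytes
 *   slack = (108 + 9) / 10 = 11       ->  nbytes = 119
 * On a platform that provides cacheflush(), the call would then look like:
 *   cacheflush((char *)my_function, 119, ICACHE);
 * where my_function is a hypothetical function in your program. */
```

cacheflush() is not available on every architecture, which is why the call appears only in a comment here.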
BTW, is this homework for a class?
Answered by user7940320
This is how Intel suggests flushing the cache:
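#include <stddef.h>

/* Flush every cache line of the block [p, p + allocation_size).
 * x86 only; uses GCC-style inline assembly. */
void mem_flush(const void *p, unsigned int allocation_size)
{
    const size_t cache_line = 64;  /* assumes 64-byte cache lines */
    const char *cp = (const char *)p;
    size_t i = 0;

    if (p == NULL || allocation_size == 0)
        return;

    for (i = 0; i < allocation_size; i += cache_line) {
        /* clflush writes back and invalidates the line containing &cp[i] */
        asm volatile("clflush (%0)\n\t"
                     :
                     : "r"(&cp[i])
                     : "memory");
    }

    /* memory fence after the flushes */
    asm volatile("sfence\n\t"
                 :
                 :
                 : "memory");
}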
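For comparison, the same idea can be written with compiler intrinsics instead of inline assembly. This is a hedged sketch, not Intel's code: flush_range is a hypothetical name, the 64-byte line size is an assumption, and it uses _mm_mfence() rather than the sfence of the routine above as a conservative choice. Unlike the routine above, it aligns the start address down to a line boundary so that an unaligned block is covered completely. When built without SSE2 support it does nothing and returns 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* _mm_clflush, _mm_mfence */
#endif

/* ASSUMPTION: 64-byte cache lines. */
#define ASSUMED_LINE_BYTES 64

/* Flushes every cache line overlapping [p, p + len).
 * Returns the number of lines flushed (0 when built without SSE2). */
size_t flush_range(const void *p, size_t len)
{
    size_t lines = 0;
    assert(p != NULL || len == 0);
#if defined(__SSE2__)
    if (len > 0) {
        /* round the start down to the beginning of its cache line */
        uintptr_t addr = (uintptr_t)p & ~(uintptr_t)(ASSUMED_LINE_BYTES - 1);
        const uintptr_t end = (uintptr_t)p + len;
        for (; addr < end; addr += ASSUMED_LINE_BYTES) {
            _mm_clflush((const void *)addr);
            lines++;
        }
        _mm_mfence();  /* order the flushes before later loads and stores */
    }
#else
    (void)p;
    (void)len;
#endif
    return lines;
}
```

To measure a cold access, one would call flush_range() on the data right before starting the timer. The contents of the buffer are unchanged by the flush; only their cached copies are written back and invalidated.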