Testing pointers for validity (C/C++)

Question

Asked by noamtm

Is there any way to determine (programmatically, of course) if a given pointer is "valid"? Checking for NULL is easy, but what about things like 0x00001234? Trying to dereference this kind of pointer causes an exception/crash.

A cross-platform method is preferred, but platform-specific (for Windows and Linux) is also ok.

Update for clarification: The problem is not with stale/freed/uninitialized pointers; instead, I'm implementing an API that takes pointers from the caller (like a pointer to a string, a file handle, etc.). The caller can send (on purpose or by mistake) an invalid value as the pointer. How do I prevent a crash?

Answer 1

Accepted answer by Johannes Schaub - litb

Update for clarification: The problem is not with stale, freed or uninitialized pointers; instead, I'm implementing an API that takes pointers from the caller (like a pointer to a string, a file handle, etc.). The caller can send (on purpose or by mistake) an invalid value as the pointer. How do I prevent a crash?

You can't make that check. There is simply no way you can check whether a pointer is "valid". You have to trust that when people use a function that takes a pointer, those people know what they are doing. If they pass you 0x4211 as a pointer value, then you have to trust it points to address 0x4211. And if they "accidentally" hit an object, then even if you used some scary operating system function (IsValidPtr or whatever), you would still slip into a bug and not fail fast.

Start using null pointers for signaling this kind of thing and tell the user of your library that they should not use pointers if they tend to accidentally pass invalid pointers, seriously :)
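A minimal sketch of that convention, assuming a hypothetical API function api_string_length (the name and the EINVAL return convention are illustrative, not from the question). NULL is the only pointer value that gets checked; any other value is trusted to point where the caller says it does.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical API entry point: reports the length of a caller-supplied string.
 * Returns 0 on success, or EINVAL if either pointer is NULL.
 * Non-NULL pointers are trusted; a garbage value will still crash here. */
int api_string_length(const char *s, size_t *out_len)
{
    if (s == NULL || out_len == NULL)
        return EINVAL;

    *out_len = strlen(s);
    return 0;
}
```

A caller that has no string to pass hands in NULL and gets a clean error code back instead of a crash.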

Answer 2

Answered by George Carrette

Here are three easy ways for a C program under Linux to get introspective about the status of the memory in which it is running, and why the question has appropriate sophisticated answers in some contexts.

After calling getpagesize() and rounding the pointer down to a page boundary, you can call mincore() to find out if a page is valid and if it happens to be part of the process working set. Note that this requires some kernel resources, so you should benchmark it and determine if calling this function is really appropriate in your API. If your API is going to be handling interrupts, or reading from serial ports into memory, it is appropriate to call this to avoid unpredictable behaviors.
After calling stat() to determine if there is a /proc/self directory available, you can fopen and read through /proc/self/maps to find information about the region in which a pointer resides. Study the man page for proc, the process information pseudo-file system. Obviously this is relatively expensive, but you might be able to get away with caching the result of the parse into an array you can efficiently look up using a binary search. Also consider /proc/self/smaps. If your API is for high-performance computing then the program will want to know about /proc/self/numa_maps, which is documented under the man page for numa, the non-uniform memory architecture.
The get_mempolicy(MPOL_F_ADDR) call is appropriate for high-performance computing API work where there are multiple threads of execution and you are managing your work to have affinity for non-uniform memory as it relates to the CPU cores and socket resources. Such an API will of course also tell you if a pointer is valid.
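As a rough illustration of the first approach, here is a sketch of a mincore()-based check on Linux. The helper name page_is_mapped is hypothetical, and sysconf(_SC_PAGESIZE) is used in place of getpagesize(). Note what it does and does not show: a successful call only means the page is mapped into the process (even a PROT_NONE page counts), not that the pointer refers to a live object of the expected type.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes mincore() and MAP_ANONYMOUS in glibc headers */
#endif
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: true if the page containing p is mapped in this process. */
bool page_is_mapped(const void *p)
{
    long page_size = sysconf(_SC_PAGESIZE);
    assert(page_size > 0);

    /* Round the address down to the start of its page. */
    uintptr_t base = (uintptr_t)p & ~((uintptr_t)page_size - 1);

    unsigned char residency; /* mincore() fills one byte per page queried */
    if (mincore((void *)base, (size_t)page_size, &residency) == 0)
        return true;  /* mapped; the low bit of residency says if it is in RAM */

    return false;     /* on Linux, ENOMEM means the range contains unmapped memory */
}
```

Because the answer can change as soon as another thread maps or unmaps memory, the result is only a snapshot; it suits diagnostics better than it suits guarding a dereference.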

Under Microsoft Windows there is the function QueryWorkingSetEx, which is documented under the Process Status API (also in the NUMA API). As a corollary to sophisticated NUMA API programming, this function will also let you do simple "testing pointers for validity (C/C++)" work. As such, it is unlikely to be deprecated for at least 15 years.

Answer 3

Answered by Nailer

Preventing a crash caused by the caller sending in an invalid pointer is a good way to make silent bugs that are hard to find.

Isn't it better for the programmer using your API to get a clear message that his code is bogus by crashing it rather than hiding it?

Answer 4

Answered by JaredPar

On Win32/64 there is a way to do this. Attempt to read through the pointer and catch the resulting SEH exception that will be thrown on failure. If it doesn't throw, then it's a valid pointer.

The problem with this method though is that it just returns whether or not you can read data from the pointer. It makes no guarantee about type safety or any number of other invariants. In general this method is good for little else other than to say "yes, I can read that particular place in memory at a time that has now passed".

In short, don't do this ;)

Raymond Chen has a blog post on this subject: http://blogs.msdn.com/oldnewthing/archive/2007/06/25/3507294.aspx

Answer 5

Answered by Ferdinand Beyer

AFAIK there is no way. You should try to avoid this situation by always setting pointers to NULL after freeing memory.

Answer 6

Answered by tunnuz

Take a look at this and this question. Also take a look at smart pointers.

Answer 7

Answered by Fredrik

Regarding the answer a bit up in this thread:

IsBadReadPtr(), IsBadWritePtr(), IsBadCodePtr(), IsBadStringPtr() for Windows.

My advice is to stay away from them; someone has already posted this one: http://blogs.msdn.com/oldnewthing/archive/2007/06/25/3507294.aspx

Another post on the same topic and by the same author (I think) is this one: http://blogs.msdn.com/oldnewthing/archive/2006/09/27/773741.aspx ("IsBadXxxPtr should really be called CrashProgramRandomly").

If the users of your API send in bad data, let it crash. If the problem is that the data passed isn't used until later (and that makes it harder to find the cause), add a debug mode where the strings etc. are logged at entry. If they are bad it will be obvious (and it will probably crash). If it is happening way too often, it might be worth moving your API out of process and letting them crash the API process instead of the main process.

Answer 8

Answered by Dipstick

Firstly, I don't see any point in trying to protect yourself from the caller deliberately trying to cause a crash. They could easily do this by trying to access through an invalid pointer themselves. There are many other ways - they could just overwrite your memory or the stack. If you need to protect against this sort of thing then you need to be running in a separate process using sockets or some other IPC for communication.

We write quite a lot of software that allows partners/customers/users to extend functionality. Inevitably any bug gets reported to us first so it is useful to be able to easily show that the problem is in the plug-in code. Additionally there are security concerns and some users are more trusted than others.

We use a number of different methods depending on performance/throughput requirements and trustworthiness. From most to least preferred:

Separate processes using sockets (often passing data as text).
Separate processes using shared memory (if there are large amounts of data to pass).
Same process, separate threads, via a message queue (if there are frequent short messages).
Same process, separate threads, with all passed data allocated from a memory pool.
Same process via direct procedure call, with all passed data allocated from a memory pool.

We try never to resort to what you are trying to do when dealing with third party software - especially when we are given the plug-ins/library as binary rather than source code.

Use of a memory pool is quite easy in most circumstances and needn't be inefficient. If YOU allocate the data in the first place then it is trivial to check the pointers against the values you allocated. You could also store the length allocated and add "magic" values before and after the data to check for valid data type and data overruns.
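To make the memory-pool idea concrete, here is a small sketch under stated assumptions: a fixed-size static pool, with hypothetical names (pool_alloc, pool_owns, slot_t) and arbitrary magic constants. It is not the poster's actual implementation, only one way to check a pointer against the values you allocated and to detect an overrun via a trailing magic value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_SLOTS   16
#define SLOT_PAYLOAD 64
#define MAGIC_HEAD   0xC0FFEE01u /* arbitrary marker placed before the data */
#define MAGIC_TAIL   0xC0FFEE02u /* arbitrary marker placed after the data */

typedef struct {
    uint32_t      head_magic;
    uint32_t      length;               /* length requested by the caller */
    unsigned char payload[SLOT_PAYLOAD];
    uint32_t      tail_magic;           /* clobbered by a write past payload */
    bool          in_use;
} slot_t;

static slot_t pool[POOL_SLOTS];

/* Hand out the payload of the first free slot, or NULL if none fits. */
void *pool_alloc(size_t length)
{
    if (length > SLOT_PAYLOAD)
        return NULL;
    for (size_t i = 0; i < POOL_SLOTS; i++) {
        if (!pool[i].in_use) {
            pool[i].in_use = true;
            pool[i].head_magic = MAGIC_HEAD;
            pool[i].tail_magic = MAGIC_TAIL;
            pool[i].length = (uint32_t)length;
            return pool[i].payload;
        }
    }
    return NULL;
}

/* True only if p is exactly a payload pointer we handed out and both
 * magic values around the data are still intact. */
bool pool_owns(const void *p)
{
    uintptr_t addr  = (uintptr_t)p;
    uintptr_t first = (uintptr_t)pool[0].payload;
    uintptr_t last  = (uintptr_t)pool[POOL_SLOTS - 1].payload;

    if (addr < first || addr > last)
        return false;                       /* outside the pool */
    if ((addr - first) % sizeof(slot_t) != 0)
        return false;                       /* not the start of a payload */

    const slot_t *s = &pool[(addr - first) / sizeof(slot_t)];
    return s->in_use
        && s->head_magic == MAGIC_HEAD
        && s->tail_magic == MAGIC_TAIL;
}
```

An API built this way calls pool_owns() on every pointer it receives; the check is a couple of comparisons and never dereferences an address outside the pool.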

Answer 9

Answered by Mike Sadler

I've got a lot of sympathy with your question, as I'm in an almost identical position myself. I appreciate what a lot of the replies are saying, and they are correct - the routine supplying the pointer should be providing a valid pointer. In my case, it is almost inconceivable that they could have corrupted the pointer - but if they had managed to, it would be MY software that crashes, and ME that would get the blame :-(

My requirement isn't that I continue after a segmentation fault - that would be dangerous - I just want to report what happened to the customer before terminating so that they can fix their code rather than blaming me!

This is how I've found to do it (on Windows): http://www.cplusplus.com/reference/clibrary/csignal/signal/

To give a synopsis:

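#include <csignal>
#include <cstdlib>
#include <iostream>

/// Handler executed if a segmentation fault is encountered during the cast to an instance.
/// Named report_and_terminate to avoid clashing with std::terminate.
/// Caveat: stream output and exit() are not async-signal-safe, so this is best-effort reporting.
void report_and_terminate(int /*signum*/)
{
  std::cerr << "\nThe function received a corrupted reference - please check the user-supplied dll.\n";
  std::cerr << "Terminating program...\n";
  std::exit(1);
}

...
void MyFunction()
{
    // Install the handler and remember the previous one.
    void (*previous_sigsegv_function)(int) = std::signal(SIGSEGV, report_and_terminate);

    // <-- insert risky stuff here -->

    // Restore the previous handler.
    std::signal(SIGSEGV, previous_sigsegv_function);
}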

Now this appears to behave as I would hope (it prints the error message, then terminates the program) - but if someone can spot a flaw, please let me know!

Answer 10

Answered by Peeter Joot

On Unix you should be able to utilize a kernel syscall that does pointer checking and returns EFAULT, such as:

#include <unistd.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <errno.h>
#include <stdbool.h>

bool isPointerBad( void * p )
{
   int fh = open( p, 0, 0 );
   int e = errno;

   if ( -1 == fh && e == EFAULT )
   {
      printf( "bad pointer: %p\n", p );
      return true;
   }
   else if ( fh != -1 )
   {
      close( fh );
   }

   printf( "good pointer: %p\n", p );
   return false;
}

int main()
{
   int good = 4;
   isPointerBad( (void *)3 );
   isPointerBad( &good );
   isPointerBad( "/tmp/blah" );

   return 0;
}

returning:

bad pointer: 0x3
good pointer: 0x7fff375fd49c
good pointer: 0x400793

There's probably a better syscall to use than open() (perhaps access()), since there's a chance that open() could go down an actual file-creation codepath, and a successful open() leaves a descriptor that must then be closed.
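A sketch of the access() variant, assuming a Linux or similar Unix kernel that reports EFAULT for an unreadable path argument. The function name is_pointer_bad is illustrative. access() never creates a file or returns a descriptor, so there is nothing to close afterwards.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <unistd.h>

/* Returns true if the kernel could not read a path string starting at p.
 * Any other outcome (file found, ENOENT, EACCES, ...) means the kernel
 * was able to read the bytes, so the pointer is at least readable. */
bool is_pointer_bad(const void *p)
{
    errno = 0;
    if (access((const char *)p, F_OK) == -1 && errno == EFAULT)
        return true;

    return false;
}
```

As with the open() version, this only says the memory was readable at the moment of the call; it says nothing about what the pointer is supposed to point at.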
