Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/6473352/

Java security: how to clear/zero-out memory associated with an object? (And/or ensure that is the only instance/copy of a particular variable)

Tags: java, security, memory

Asked by weiji

I'm in a discussion at work over how to secure sensitive information (e.g. passwords) stored in a Java program. Per our security requirements, memory containing sensitive information must be cleared, e.g. by setting the values of the bytes to all zeroes. The concern is that an attacker can observe the memory associated with the application process, so we want to limit as much as possible the window of time during which such sensitive information hangs around. Previous projects used C++, so a memset() sufficed.

(Incidentally, the use of memset() has been called into question because some compilers are known to optimize its use out of the resulting binary, based on the assumption that, since the memory is not used later, there is no need to zero it in the first place. This blurb is a disclaimer for those who Google for "memset" and "clear memory", etc.)

Now we have on our hands a Java project being pressed against this requirement.

For Java objects, my understanding is that:

  • a nulled reference only changes the value of the reference; the memory on the heap for the object still contains data
  • an immutable object like String would not be able to have its data modified (or at least not easily, within the confines of a VM with an appropriately enabled security manager)
  • the generational garbage collectors may make copies of objects all over the place (as noted here)
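
A minimal sketch of the first bullet, using a char[] as the stand-in object: nulling one reference leaves the array contents readable through any other reference, while overwriting the elements changes the data itself. The class and method names are made up for illustration, and the sketch says nothing about copies a garbage collector may have made elsewhere.

```java
import java.util.Arrays;

public class NullVersusZero {

    // Returns true if the array contents survive nulling one reference to it.
    static boolean survivesNulling() {
        char[] secret = {'s', '3', 'c', 'r', '3', 't'};
        char[] alias = secret;   // second reference to the same heap array
        secret = null;           // changes only the local reference
        return alias.length == 6 && alias[0] == 's';
    }

    // Returns true if overwriting through one reference is visible through the other.
    static boolean zeroedInPlace() {
        char[] secret = {'s', '3', 'c', 'r', '3', 't'};
        char[] alias = secret;
        Arrays.fill(secret, '\0');   // overwrite the array elements themselves
        for (char c : alias) {
            if (c != '\0') {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("survives nulling: " + survivesNulling());
        System.out.println("zeroed in place:  " + zeroedInPlace());
    }
}
```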

And for primitives, my understanding is that:

  • a primitive-type local variable in a method would get allocated on the stack, and:
      • when you change its value, you modify it directly in memory (as opposed to using a reference to handle an object on the heap).
      • copies can/would be made "behind the scenes" in some situations, such as passing it as an argument into methods, or boxing (auto- or not), which creates instances of the wrappers that contain another primitive variable holding the same value.

My coworker claims that Java primitives are immutable and that there is documentation from both the NSA and Oracle regarding the lack of support in Java for this requirement.

My position is that primitives can (at least in some situations) be zeroed by setting the value to zero (or boolean to false), and the memory is cleared that way.

I'm trying to verify whether there's language in the JLS or other "official" documentation about the required behavior of JVMs when it comes to memory management with respect to primitives. The closest I could find was the "Secure Coding Guidelines for the Java Programming Language" on Oracle's site, which mentions clearing char arrays after use.

I'd quibble over definitions when my coworker called primitives immutable, but I'm pretty sure he meant "memory cannot be appropriately zeroed" - let's not worry about that. We did not discuss whether he meant final variables - from context we were talking in general.

Are there any definitive answers or references on this? I'd appreciate anything that could show me where I'm wrong or confirm that I'm right.

Edit: After further discussion, I've been able to clarify that my coworker was thinking of the primitive wrappers, not the primitives themselves. So we are left with the original problem of how to clear memory securely, preferably of objects. Also, to clarify, the sensitive information is not just passwords, but also things like IP addresses or encryption keys.

Are there any commercial JVMs which offer a feature like priority handling of certain objects? (I imagine this would actually violate the Java spec, but I thought I'd ask just in case I'm wrong.)

Answer by Voo

Edit: Actually I just had three ideas that may indeed work - for different values of "work" at least.

The first that is more or less documented would be ByteBuffer.allocateDirect! As I understand it allocateDirect allocates the buffer outside the usual java heap so won't be copied around. I can't find any hard guarantees about it not getting copied in all situations though - but for the current Hotspot VM that is actually the case (ie it's allocated in an extra heap) and I assume this will stay that way.
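
A rough sketch of the allocateDirect idea, assuming the secret arrives as a byte[]: copy it into a direct buffer, zero the source array, and overwrite the buffer when done. The class and helper names are hypothetical, and nothing here guarantees that the VM never copies the data or that the JIT keeps the zeroing writes.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class DirectSecret {

    // Copies the secret into a direct (off-heap) buffer and zeroes the source array.
    static ByteBuffer store(byte[] secret) {
        ByteBuffer buf = ByteBuffer.allocateDirect(secret.length);
        buf.put(secret);
        buf.flip();                        // make the stored bytes readable
        Arrays.fill(secret, (byte) 0);     // the on-heap copy is no longer needed
        return buf;
    }

    // Overwrites the whole buffer with zeros, regardless of position and limit.
    static void wipe(ByteBuffer buf) {
        buf.clear();                       // resets position/limit only; erases nothing
        while (buf.hasRemaining()) {
            buf.put((byte) 0);
        }
        buf.clear();
    }

    // Checks every byte of the buffer without disturbing its position or limit.
    static boolean isAllZero(ByteBuffer buf) {
        ByteBuffer view = buf.duplicate(); // shares content, independent position/limit
        view.clear();
        while (view.hasRemaining()) {
            if (view.get() != 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] secret = {0x13, 0x37, 0x42};
        ByteBuffer buf = store(secret);
        System.out.println("direct: " + buf.isDirect());
        wipe(buf);
        System.out.println("wiped:  " + isAllZero(buf));
    }
}
```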

The second one is using the sun.misc.Unsafe class - which, as the name says, has some rather obvious problems, but at least it would be pretty much independent of the VM used - either it's supported (and it works) or it's not (and you get linking errors). The problem is that the code to use that stuff will get horribly complicated pretty fast (just getting hold of an Unsafe instance is non-trivial).

The third one would be to allocate a much, much, MUCH larger size than is actually needed, so that the object gets allocated in the old generation heap to begin with:

There is a flag, -XX:PretenureSizeThreshold=, that can be set to limit the size of allocations in the young generation. Any allocation larger than this will not be attempted in the young generation and so will be allocated out of the old generation.

Well the drawback of THAT solution is obvious I think (default size seems to be about 64kb).

Anyway, here is the old answer:

Yep as I see it you pretty much cannot guarantee that the data stored on the heap is 100% removed without leaving a copy (that's even true if you don't want a general solution but one that'll work with say the current Hotspot VM and its default garbage collectors).

As said in your linked post (here), the garbage collector pretty much makes this impossible to guarantee. Actually contrary to what the post says the problem here isn't the generational GC, but the fact that the Hotspot VM (and now we're implementation specific) is using some kind of Stop & Copy gc for its young generation per default.

This means that as soon as a garbage collection happens between storing the password in the char array and zeroing it out, you'll get a copy of the data that will be overwritten only when the next GC happens. Note that tenuring an object will have exactly the same effect, but instead of copying it to to-space it's copied to the old generation heap - we end up with a copy of the data in from-space that isn't overwritten.

To avoid this problem we'd pretty much need some way to guarantee that either NO garbage collection is happening between storing the password and zeroing it OR that the char array is stored from the get-go in the old generation heap. Also note that this relies on the internals of the Hotspot VM, which may very well change (actually there are different garbage collectors where many more copies can be generated; iirc the Hotspot VM supports a concurrent GC using a train algorithm). "Luckily" it's impossible to guarantee either one of those (afaik every method call/return introduces a safe point!), so you don't even get tempted to try (especially considering that I don't see any way to make sure the JIT doesn't optimize the zeroing away) ;)

Seems like the only way to guarantee that the data is stored only in one location is to use the JNI for it.

PS: Note that while the above is only true for the Heap, you can't guarantee anything more for the stack (the JIT will likely optimize writes without reads to the stack away, so when you return from the function the data will still be on the stack)

Answer by bmargulies

Tell your co-workers that this is a hopeless cause. What about the kernel socket buffers, just for a start?

If you cannot prevent unwanted programs from spying on memory on your machine, the passwords are compromised. Period.

Answer by bwawok

Weird, never thought of anything like this.

My first idea would be to make a char[100] to store your password in. Put that in there, use it for whatever, and then do a loop to set every char to blank.
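
As a sketch of that first idea, the wipe can go in a finally block so it runs even if the code using the password throws. The check method below is a hypothetical stand-in for whatever consumes the password, and the caveats about GC copies and JIT optimizations raised elsewhere in this thread still apply.

```java
import java.util.Arrays;

public class PasswordHolder {

    // Hypothetical consumer of the password; stands in for real authentication.
    static boolean check(char[] password) {
        return password.length >= 8;
    }

    // Uses the password, then blanks every char, even if check() throws.
    static boolean useAndWipe(char[] password) {
        try {
            return check(password);
        } finally {
            Arrays.fill(password, '\0');
        }
    }

    public static void main(String[] args) {
        char[] password = {'c', 'o', 'r', 'r', 'e', 'c', 't', '!'};
        System.out.println("accepted: " + useAndWipe(password));
        System.out.println("first char is now NUL: " + (password[0] == '\0'));
    }
}
```

For reading the password in the first place, java.io.Console.readPassword() returns a char[] rather than a String, which fits this pattern.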

The problem is, the password would at some point turn into a String inside of the database driver, which could live in memory for 0 to infinity seconds.

My second idea would be to have all authentication done through some kind of JNI call to C, but that would be really hard if you are trying to use something like JDBC....

Answer by teknopaul

Just an aside, but in some places the Java core security libs use char[] so that it can be zeroed. I imagine that you don't get a guarantee, though.

Answer by user2583872

I have been trying to work out some similar issues with credentials.

Until now, my only answer is "not to use strings at all for secrets". The strings are comfortable to use and store in human terms, but computers can work well with byte arrays. Even the encryption primitives work with byte[].

When you don't need the password anymore, just fill the array with zeroes and don't let the GC invent new ways to reuse your secrets.
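
One way this could look, as a sketch only: encode a char[] to UTF-8 bytes without ever building a String, then zero every array involved. The class and method names are invented for illustration, and the choice of UTF-8 is an assumption; the intermediate buffer returned by the encoder is wiped only when its backing array is accessible.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class SecretBytes {

    // Encodes the chars as UTF-8 without creating an intermediate String.
    static byte[] toUtf8(char[] chars) {
        ByteBuffer encoded = StandardCharsets.UTF_8.encode(CharBuffer.wrap(chars));
        byte[] out = new byte[encoded.remaining()];
        encoded.get(out);
        if (encoded.hasArray()) {
            Arrays.fill(encoded.array(), (byte) 0);  // wipe the encoder's temporary copy
        }
        return out;
    }

    // Zeroes both representations of the secret.
    static void wipe(char[] chars, byte[] bytes) {
        Arrays.fill(chars, '\0');
        Arrays.fill(bytes, (byte) 0);
    }

    public static void main(String[] args) {
        char[] password = {'p', 'a', 's', 's'};
        byte[] raw = toUtf8(password);
        System.out.println("encoded length: " + raw.length);
        wipe(password, raw);
        System.out.println("wiped: " + (password[0] == '\0' && raw[0] == 0));
    }
}
```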

In another thread (Why can't strings be mutable in Java and .NET?) they make an assumption that is very short-sighted: that strings are immutable for security reasons. What was not considered is that operational problems are not always the only ones in existence, and that security sometimes needs some flexibility and/or support to be effective, support that doesn't exist natively in Java.

To complement this: how could we read a password without using strings? Well... be creative, and don't use things like the Android EditText with a password input type; that just is not secure enough and requires you to go through strings.