Java: Why do threads share the heap space?

Notice: this page is a side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/3318750/

Date: 2020-10-30 01:17:06  Source: igfitidea

java multithreading concurrency heap

Asked by Nayan Wadekar

Threads each have their own stack, but they share a common heap.

It's clear to everyone that the stack is for local/method variables and the heap is for instance/class variables.

What is the benefit of sharing the heap among threads?

Several threads run simultaneously, so sharing memory can lead to issues such as concurrent modification and the overhead of mutual exclusion. What contents of the heap are shared by threads?

Why is this the case? Why not have each thread own its own heap as well? Can anyone provide a real-world example of how shared memory is utilized by threads?

Answered by Gilles 'SO- stop being evil'

What do you do when you want to pass data from one thread to another? (If you never did that you'd be writing separate programs, not one multi-threaded program.) There are two major approaches:

  • The approach you seem to take for granted is shared memory: except for data that has a compelling reason to be thread-specific (such as the stack), all data is accessible to all threads. Basically, there is a shared heap. That gives you speed: any time a thread changes some data, other threads can see it. (Limitation: this is not true if the threads are executing on different processors: there the programmer needs to work especially hard to use shared memory correctly and efficiently.) Most major imperative languages, in particular Java and C#, favor this model.

    It is possible to have one heap per thread, plus a shared heap. This requires the programmer to decide which data to put where, and that often doesn't mesh well with existing programming languages.

  • The dual approach is message passing: each thread has its own data space; when a thread wants to communicate with another thread, it needs to explicitly send a message to the other thread, so as to copy the data from the sender's heap to the recipient's heap. In this setting many communities prefer to call the threads processes. That gives you safety: since a thread can't overwrite some other thread's memory on a whim, a lot of bugs are avoided. Another benefit is distribution: you can make your threads run on separate machines without having to change a single line in your program. You can find message passing libraries for most languages, but integration tends to be less good. Good languages to understand message passing in are Erlang and JoCaml.

    In fact, message passing environments usually use shared memory behind the scenes, at least as long as the threads are running on the same machine/processor. This saves a lot of time and memory, since passing a message from one thread to another then doesn't require making a copy of the data. But since the shared memory is not exposed to the programmer, its inherent complexity is confined to the language/library implementation.
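
As a rough Java illustration of the message passing style (a sketch, not part of the original answer): the class name, method name and the DONE marker are made up for this example, and java.util.concurrent.BlockingQueue stands in for a message channel. In Java the queue and its messages still live on the shared heap, which fits the remark that message passing is often built on shared memory behind the scenes.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class MessagePassingSketch {
    // Hypothetical end-of-stream marker agreed on by both threads.
    private static final int DONE = -1;

    // Producer sends 1..n as messages; consumer adds them up.
    static int sumViaQueue(int n) {
        BlockingQueue<Integer> mailbox = new ArrayBlockingQueue<>(4);
        int[] total = new int[1]; // written only by the consumer thread

        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= n; i++) {
                    mailbox.put(i); // blocks while the mailbox is full
                }
                mailbox.put(DONE);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread consumer = new Thread(() -> {
            try {
                for (int msg = mailbox.take(); msg != DONE; msg = mailbox.take()) {
                    total[0] += msg;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
        try {
            producer.join();
            consumer.join(); // join() makes the consumer's writes visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return total[0];
    }

    public static void main(String[] args) {
        System.out.println(sumViaQueue(5)); // prints 15
    }
}
```

The two threads never touch each other's variables directly; the only point of contact is the mailbox, so the synchronization lives inside the library class rather than in the application code.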

Answered by user207421

Because otherwise they would be processes. That is the whole idea of threads, to share memory.

Answered by S.Lott

Processes don't generally share heap space. There are APIs to permit this, but the default is that processes are separate.

Threads share heap space.

That's the "practical idea" -- two ways to use memory -- shared and not shared.

Answered by Martin Ingvar Kofoed Jensen

In many languages/runtimes the stack is used (among other things) to keep function/method parameters and local variables. If threads shared a stack, things would get really messy.

void MyFunc(int a) // Stored on the stack
{
   int b; // Stored on the stack
}

When the call to 'MyFunc' is done, the stack is popped and a and b are no longer on the stack. Because threads don't share stacks, there is no threading issue for the variables a and b.
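
The same point can be sketched in Java (an illustration added here, with made-up class and method names): two threads run the same method at the same time, and each call gets its own copies of the parameter and locals on the calling thread's stack, so they cannot interfere. Only the results array is a heap object, and each thread writes a different slot of it.

```java
class StackLocalsSketch {
    // n, total and i live in the stack frame of whichever thread makes the call.
    static long sumTo(int n) {
        long total = 0; // fresh copy for every call on every thread
        for (int i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }

    static long[] runOnTwoThreads() {
        long[] results = new long[2]; // heap array; each thread writes its own slot
        Thread a = new Thread(() -> results[0] = sumTo(100));
        Thread b = new Thread(() -> results[1] = sumTo(1000));
        a.start();
        b.start();
        try {
            a.join();
            b.join(); // after join(), both writes are visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return results;
    }

    public static void main(String[] args) {
        long[] r = runOnTwoThreads();
        System.out.println(r[0] + " " + r[1]); // prints 5050 500500
    }
}
```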

Because of the nature of the stack (pushing/popping), it's not really suited for keeping 'long term' state or shared state across function calls. Such state has to live outside the stack, like this:

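int globalValue; // lives outside every stack (static storage in C; the answer loosely calls it the heap)

void Foo()
{
   int b = globalValue; // Gets the current value of globalValue (b itself is on the stack)

   globalValue = 10;    // The change outlives this call
}

void Bar()
{
   int b = globalValue; // Gets the current value of globalValue (b itself is on the stack)

   globalValue = 20;    // The change outlives this call
}


int main(void)
{
   globalValue = 0;
   Foo();
   // globalValue is now 10
   Bar();
   // globalValue is now 20
   return 0;
}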

Answered by Peter Lawrey

The problem is that having local heaps adds significant complexity for very little value.

There is a small performance advantage, and this is handled well by the TLAB (Thread Local Allocation Buffer), which gives you most of that advantage transparently.

Answered by chetan

In a multi-threaded application each thread will have its own stack but will share the same heap. This is why care should be taken in your code to avoid any concurrent access issues in the heap space. The stack is threadsafe (each thread will have its own stack) but the heap is not threadsafe unless guarded with synchronisation through your code.
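
A minimal Java sketch of what "guarded with synchronisation" can look like (the class and method names are invented for this example, and synchronized methods are only one of several ways to guard shared state): two threads increment a counter that lives in a single heap object, and the lock makes each read-modify-write step atomic.

```java
class SharedCounterSketch {
    private int count = 0; // instance field: part of an object on the shared heap

    // synchronized: only one thread at a time may run these on the same object.
    synchronized void increment() {
        count++;
    }

    synchronized int get() {
        return count;
    }

    static int runTwoThreads(int perThread) {
        SharedCounterSketch counter = new SharedCounterSketch();
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) { // i is a stack local, private to each thread
                counter.increment();
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runTwoThreads(1000)); // prints 2000
    }
}
```

Without the synchronized keyword, count++ is a separate read, add and write, so two threads could overwrite each other's updates and the final value could come out lower than expected.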

Answered by Yuliy

The heap is just all dynamically allocated memory outside of the stack. Since the OS provides a single address space per process, it becomes clear that the heap is by definition shared by all threads in the process. As for why stacks are not shared, that's because an execution thread has to have its own stack to be able to manage its call tree (it contains information about what to do when you leave a function, for instance!).

Now you could of course write a memory manager that allocates data from different areas in your address space depending on the calling thread, but other threads would still be able to see that data (just as, if you somehow leak a pointer to something on your thread's stack to another thread, that other thread could read it, despite this being a horrible idea).
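
A loose Java analogy, offered as a sketch with invented names rather than anything from the original answer: java.lang.ThreadLocal hands each thread its own object, yet that object still sits on the shared heap, so any thread that obtains the reference can read it.

```java
import java.util.concurrent.atomic.AtomicReference;

class ThreadLocalLeakSketch {
    // Each thread that calls get() receives its own StringBuilder instance.
    static final ThreadLocal<StringBuilder> PER_THREAD =
            ThreadLocal.withInitial(StringBuilder::new);

    // Starts a worker that fills its thread-local buffer and then leaks the reference.
    static String readLeakedBuffer() {
        AtomicReference<StringBuilder> leaked = new AtomicReference<>();
        Thread worker = new Thread(() -> {
            StringBuilder mine = PER_THREAD.get(); // the worker's private instance
            mine.append("written by worker");
            leaked.set(mine); // the "leak": publish the reference to other threads
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        // The calling thread can read the worker's "private" data through the reference.
        return leaked.get().toString();
    }

    public static void main(String[] args) {
        System.out.println(readLeakedBuffer());          // prints written by worker
        System.out.println(PER_THREAD.get().length());   // prints 0 (main's own buffer is untouched)
    }
}
```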

Answered by ninjalj

That's because the idea of threads is "share everything". Of course, there are some things you cannot share, like processor context and stack, but everything else is shared.
