multithreading: What resources are shared between threads?

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/1762418/

Time: 2020-09-10 01:05:12  Source: igfitidea

What resources are shared between threads?

Tags: multithreading, process, operating-system

Asked by Xinus

Recently, I was asked in an interview what the difference between a process and a thread is. Honestly, I did not know the answer. I thought for a minute and gave a very weird answer.

Threads share the same memory, processes do not. After answering this, the interviewer gave me an evil smile and fired the following questions at me:

Q. Do you know the segments into which a program gets divided?

My answer: yep (thought it was an easy one) Stack, Data, Code, Heap

Q. So, tell me: which segments do threads share?

I could not answer this and ended up saying all of them.

Please, can anybody present the correct and impressive answers for the difference between a process and a thread?

Accepted answer by Greg Hewgill

You're pretty much correct, but threads share all segments except the stack. Threads have independent call stacks; however, the memory in other threads' stacks is still accessible, and in theory you could hold a pointer to memory in some other thread's local stack frame (though you probably should find a better place to put that memory!).
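
A rough way to see this split is the Python sketch below. Python is used only for illustration, and the names `shared_log`, `worker` and `run_workers` are made up. Each thread gets its own copy of the local variable `count`, because locals belong to that thread's call frames, while the module-level list is a single object that every thread writes to. Python does not expose raw stack addresses, so this does not show the pointer-into-another-stack trick itself.

```python
import threading

shared_log = []                 # module-level object: one copy, visible to every thread
log_lock = threading.Lock()     # protects appends to the shared list


def worker(name, steps):
    # `count` is a local variable: every thread running worker() has its own,
    # because each thread has its own chain of call frames.
    count = 0
    for _ in range(steps):
        count += 1
    with log_lock:
        shared_log.append((name, count))    # all threads write to the same list


def run_workers():
    shared_log.clear()
    threads = [
        threading.Thread(target=worker, args=(f"t{i}", (i + 1) * 1000))
        for i in range(3)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(shared_log)


if __name__ == "__main__":
    print(run_workers())
```

The counts never interfere with each other even though all three threads run the same function, while the results all end up in the one shared list.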

Answer by Jorge Córdoba

From Wikipedia (I think that would make a really good answer for the interviewer :P):

Threads differ from traditional multitasking operating system processes in that:

  • processes are typically independent, while threads exist as subsets of a process
  • processes carry considerable state information, whereas multiple threads within a process share state as well as memory and other resources
  • processes have separate address spaces, whereas threads share their address space
  • processes interact only through system-provided inter-process communication mechanisms.
  • Context switching between threads in the same process is typically faster than context switching between processes.
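
The address-space and IPC points can be sketched in Python as follows. This assumes a POSIX system where `os.fork` is available; the names `state`, `bump`, `bump_in_thread` and `bump_in_child_process` are hypothetical. A thread changes the one shared `state` object directly, while a forked child changes only its own copy and has to report back through a pipe.

```python
import os
import threading

state = {"value": 0}


def bump():
    state["value"] += 1


def bump_in_thread():
    # The thread runs in the same address space, so the change is visible here.
    t = threading.Thread(target=bump)
    t.start()
    t.join()
    return state["value"]


def bump_in_child_process():
    """POSIX only: the child modifies its own copy and reports it via a pipe."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:                          # child process
        try:
            os.close(read_fd)
            bump()                        # changes only the child's copy of `state`
            os.write(write_fd, str(state["value"]).encode())
        finally:
            os._exit(0)                   # exit at once, never run the parent's code
    os.close(write_fd)                    # parent process
    child_value = int(os.read(read_fd, 32).decode())
    os.close(read_fd)
    os.waitpid(pid, 0)
    return child_value, state["value"]


if __name__ == "__main__":
    print(bump_in_thread())               # the thread's change is visible
    print(bump_in_child_process())        # the child's change is not
```

The parent only learns the child's value because the child wrote it into the pipe; the parent's own `state` is left as it was.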

Answer by Robert S. Barnes

Something that really needs to be pointed out is that there are two aspects to this question: the theoretical aspect and the implementation aspect.

First, let's look at the theoretical aspect. You need to understand what a process is conceptually to understand the difference between a process and a thread and what's shared between them.

We have the following from Section 2.2.2, "The Classical Thread Model", in Modern Operating Systems, 3e, by Tanenbaum:

The process model is based on two independent concepts: resource grouping and execution. Sometimes it is useful to separate them; this is where threads come in....

He continues:

One way of looking at a process is that it is a way to group related resources together. A process has an address space containing program text and data, as well as other resources. These resources may include open files, child processes, pending alarms, signal handlers, accounting information, and more. By putting them together in the form of a process, they can be managed more easily. The other concept a process has is a thread of execution, usually shortened to just thread. The thread has a program counter that keeps track of which instruction to execute next. It has registers, which hold its current working variables. It has a stack, which contains the execution history, with one frame for each procedure called but not yet returned from. Although a thread must execute in some process, the thread and its process are different concepts and can be treated separately. Processes are used to group resources together; threads are the entities scheduled for execution on the CPU.

Further down he provides the following table:

Per process items             | Per thread items
------------------------------|-----------------
Address space                 | Program counter
Global variables              | Registers
Open files                    | Stack
Child processes               | State
Pending alarms                |
Signals and signal handlers   |
Accounting information        |
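
Two rows of this table can be observed from Python, as a rough sketch. It assumes CPython, since `sys._current_frames()` is a CPython-specific helper, and a POSIX-style `os.pipe`; the function names are hypothetical. The pipe descriptor is an "open file" owned by the process and usable from any thread, while each thread has its own call stack.

```python
import os
import sys
import threading


def call_chain(frame):
    # Walk from the innermost frame outwards, collecting function names.
    names = []
    while frame is not None:
        names.append(frame.f_code.co_name)
        frame = frame.f_back
    return names


def per_process_and_per_thread_items():
    read_fd, write_fd = os.pipe()         # open file: a per-process item
    ready = threading.Barrier(3)          # two workers plus the calling thread
    done = threading.Event()

    def worker(tag):
        os.write(write_fd, tag)           # any thread may use the descriptor
        ready.wait()
        done.wait()                       # stay alive while stacks are inspected

    workers = [threading.Thread(target=worker, args=(tag,)) for tag in (b"A", b"B")]
    for t in workers:
        t.start()
    ready.wait()

    # Maps each thread id to the frame that thread is currently executing:
    # one separate call stack per thread (a per-thread item).
    frames = sys._current_frames()
    stacks = {t.ident: call_chain(frames[t.ident]) for t in workers}

    done.set()
    for t in workers:
        t.join()
    written = os.read(read_fd, 2)
    os.close(read_fd)
    os.close(write_fd)
    return bytes(sorted(written)), stacks


if __name__ == "__main__":
    data, stacks = per_process_and_per_thread_items()
    print(data, stacks)
```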

The above is what you need for threads to work. As others have pointed out, things like segments are OS-dependent implementation details.

Answer by Alex Budovski

Tell the interviewer that it depends entirely on the implementation of the OS.

Take Windows x86 for example. There are only 2 segments [1], Code and Data. And they're both mapped to the whole 2GB (linear, user) address space. Base=0, Limit=2GB. They would've made one, but x86 doesn't allow a segment to be both Read/Write and Execute. So they made two, and set CS to point to the code descriptor, and the rest (DS, ES, SS, etc.) to point to the other [2]. But both point to the same stuff!

The person interviewing you had made a hidden assumption that he/she did not state, and that is a stupid trick to pull.

So regarding

Q. So, tell me: which segments do threads share?

The segments are irrelevant to the question, at least on Windows. Threads share the whole address space. There is only 1 stack segment, SS, and it points to the exact same stuff that DS, ES, and CS do [2]. I.e. the whole bloody user space. 0-2GB. Of course, that doesn't mean threads only have 1 stack. Naturally each has its own stack, but x86 segments are not used for this purpose.

Maybe *nix does something different. Who knows. The premise the question was based on was broken.

  1. At least for user space.
  2. From ntsd notepad: cs=001b ss=0023 ds=0023 es=0023

Answer by Nimish Thakkar

Generally, threads are called lightweight processes. If we divide memory into three sections, they are code, data and stack. Every process has its own code, data and stack sections, and because of this the context switch time is a little high. To reduce context switching time, people came up with the concept of a thread, which shares the data and code segments with the other threads of the process and has its own stack segment.

Answer by Dhirendra Vikash Sharma

A process has code, data, heap and stack segments. The instruction pointer (IP) of each thread points into the code segment of the process. The data and heap segments are shared by all the threads. Now what about the stack area? What actually is the stack area? It's an area created by the process just for its threads to use, because stacks can be used much faster than heaps. The stack area of the process is divided among the threads: if there are 3 threads, the stack area of the process is divided into 3 parts, one for each thread. In other words, when we say that each thread has its own stack, that stack is actually a part of the process's stack area allocated to that thread. When a thread finishes its execution, its stack is reclaimed by the process. In fact, not only is the stack of a process divided among threads, but the registers that a thread uses, such as SP, PC and the state registers, are the registers of the process. So when it comes to sharing, the code, data and heap areas are shared, while the stack area is just divided among threads.

Answer by Kevin Peterson

Threads share the code and data segments and the heap, but they don't share the stack.

Answer by Daniel Brückner

Threads share data and code while processes do not. The stack is not shared in either case.

Processes can also share memory, more precisely code, for example after a fork(), but this is an implementation detail and an (operating system) optimization. Code shared by multiple processes will (hopefully) be duplicated on the first write to the code; this is known as copy-on-write. I am not sure about the exact semantics for the code of threads, but I assume shared code.

           Process      Thread

   Stack   private      private
   Data    private      shared
   Code    private [1]  shared [2]

[1] The code is logically private but might be shared for performance reasons.
[2] I am not 100% sure.

Answer by Useless

Threads share everything [1]. There is one address space for the whole process.

Each thread has its own stack and registers, but all threads' stacks are visible in the shared address space.

If one thread allocates some object on its stack, and sends the address to another thread, they'll both have equal access to that object.
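
A minimal sketch of this hand-off, in Python with made-up names. Python objects live on the heap rather than on a thread's stack, so this only illustrates the "equal access once you have the address" part: the producer passes a reference through a queue, and the consumer's change is visible to the producer because both refer to the same object.

```python
import queue
import threading


def share_object_between_threads():
    mailbox = queue.Queue()
    seen = {}

    def producer():
        record = {"hits": 0}          # created while producer() is running
        mailbox.put(record)           # passes a reference, not a copy
        mailbox.join()                # block until the consumer calls task_done()
        seen["producer"] = record["hits"]

    def consumer():
        record = mailbox.get()
        record["hits"] += 1           # modifies the producer's object in place
        mailbox.task_done()

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen["producer"]


if __name__ == "__main__":
    print(share_object_between_threads())
```

In C the same holds for a pointer to a stack variable, with the extra caveat that the owning function must not return while the other thread still uses the pointer.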

Actually, I just noticed a broader issue: I think you're confusing two uses of the word "segment".

The file format for an executable (e.g., ELF) has distinct sections in it, which may be referred to as segments, containing compiled code (text), initialized data, linker symbols, debug info, etc. There are no heap or stack segments here, since those are runtime-only constructs.

These binary file segments may be mapped into the process address space separately, with different permissions (e.g., read-only executable for code/text, and copy-on-write non-executable for initialized data).

Areas of this address space are used for different purposes, like heap allocation and thread stacks, by convention (enforced by your language runtime libraries). It is all just memory though, and probably not segmented unless you're running in virtual 8086 mode. Each thread's stack is a chunk of memory allocated at thread creation time, with the current stack top address stored in a stack pointer register, and each thread keeps its own stack pointer along with its other registers.
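
That a thread's stack is sized when the thread is created can be seen with Python's `threading.stack_size`, which sets the stack size requested for threads created afterwards. A minimal sketch, assuming a platform where changing the size is supported (otherwise Python raises `RuntimeError`); `run_with_stack_size` is a hypothetical helper.

```python
import threading


def run_with_stack_size(size_bytes, func):
    """Run func in a new thread created with the requested stack size."""
    # Applies to threads created from now on; returns the previous setting
    # (0 means "platform default").
    previous = threading.stack_size(size_bytes)
    try:
        box = []
        t = threading.Thread(target=lambda: box.append(func()))
        t.start()
        t.join()
        return box[0]
    finally:
        threading.stack_size(previous)    # restore the earlier setting


if __name__ == "__main__":
    # Inside the new thread, ask which stack size is currently configured.
    print(run_with_stack_size(512 * 1024, threading.stack_size))
```

The size must be 0 or at least 32 KiB, and some systems only accept multiples of the page size, which is why 512 KiB is used here.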

[1] OK, I know: signal masks, TSS/TSD etc. The address space, including all its mapped program segments, is still shared though.
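
For the thread-specific data exception, Python's `threading.local` gives a small illustration (the names `per_thread`, `shared`, `worker` and `run` are made up for this sketch). Attributes set on the `threading.local` object are kept separately for each thread, while the ordinary dictionary is one object shared by all of them.

```python
import threading

per_thread = threading.local()    # each thread sees its own attributes here
shared = {}                       # ordinary object: one copy for all threads


def worker(name):
    per_thread.name = name            # stored separately for this thread
    shared[name] = per_thread.name    # stored once, visible to everyone


def run():
    shared.clear()
    per_thread.name = "main"
    threads = [threading.Thread(target=worker, args=(n,)) for n in ("a", "b")]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # The workers' assignments did not overwrite the calling thread's value.
    return per_thread.name, dict(sorted(shared.items()))


if __name__ == "__main__":
    print(run())
```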

Answer by George

In an x86 framework, one can define many segments (up to 2^16-1). The ASM directives SEGMENT/ENDS allow this, and the operators SEG and OFFSET allow initialization of segment registers. CS:IP are usually initialized by the loader, but for DS, ES and SS the application is responsible for initialization. Many environments allow the so-called "simplified segment definitions" like .code, .data, .bss, .stack etc. and, depending also on the "memory model" (small, large, compact etc.), the loader initializes segment registers accordingly. Usually .data, .bss, .stack and other usual segments (I haven't done this for 20 years, so I don't remember them all) are grouped in one single group; that is why DS, ES and SS usually point to the same area, but this is only to simplify things.

In general, all segment registers can have different values at run time. So the interview question was right: which of CODE, DATA and STACK are shared between threads? Heap management is something else: it is simply a sequence of calls to the OS. But what if you don't have an OS at all, as in an embedded system: can you still have new/delete in your code?

My advice to young people: read a good assembly programming book. It seems that university curricula are quite poor in this respect.