
Notice: this page is a translation of a popular StackOverFlow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverFlow original: http://stackoverflow.com/questions/636932/

Time: 2020-08-04 11:13:36  Source: igfitidea

In C#, why is String a reference type that behaves like a value type?

Tags: c#, string, clr, value-type, reference-type

Asked by Davy8

A String is a reference type even though it has most of the characteristics of a value type such as being immutable and having == overloaded to compare the text rather than making sure they reference the same object.


Why isn't string just a value type then?


Accepted answer by codekaizen

Strings aren't value types since they can be huge and need to be stored on the heap. Value types are (in all implementations of the CLR as of yet) stored on the stack when they are local variables. Stack-allocating strings would break all sorts of things: the stack is only 1MB for 32-bit and 4MB for 64-bit; you'd have to box each string, incurring a copy penalty; you couldn't intern strings; memory usage would balloon; and so on.

(Edit: Added clarification about value type storage being an implementation detail, which leads to this situation where we have a type with value semantics not inheriting from System.ValueType. Thanks Ben.)

Answer by jason

It is not a value type because performance (space and time!) would be terrible if it were a value type and its value had to be copied every time it was passed to or returned from methods, etc.

It has value semantics to keep the world sane. Can you imagine how difficult it would be to code if


string s = "hello";
string t = "hello";
bool b = (s == t);

set b to be false? Imagine how difficult coding just about any application would be.
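To make the point concrete, here is a minimal sketch of what == on strings actually does. The class name StringEqualityDemo and the helper BuildHello are made up for this illustration; the second string is built at run time so that it is a separate object from the literal.

```csharp
using System;

public static class StringEqualityDemo
{
    // Hypothetical helper: builds "hello" at run time, so the result
    // is a different object from the "hello" literal.
    public static string BuildHello() => new string(new[] { 'h', 'e', 'l', 'l', 'o' });

    public static void Main()
    {
        string s = "hello";
        string t = BuildHello();

        // The overloaded == on string compares the characters.
        Console.WriteLine(s == t);                       // True
        // ReferenceEquals asks whether both variables point at one object.
        Console.WriteLine(object.ReferenceEquals(s, t)); // False
    }
}
```

Without the overload, the first comparison would behave like the second one and report False for two strings with identical text.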

Answer by Chris

Also consider the way strings are implemented (it differs from platform to platform) and what happens when you start stitching them together, for example with a StringBuilder. It allocates a buffer for you to copy into; once you reach the end of that buffer, it allocates even more memory, in the hope that performance won't suffer if you do a large concatenation.
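The buffer growth described here can be observed through StringBuilder.Capacity. This is only a rough sketch: BuilderGrowthDemo and CountCapacityChanges are names invented for the example, and the exact growth pattern is an implementation detail that varies between runtimes, so no specific numbers are claimed.

```csharp
using System;
using System.Text;

public static class BuilderGrowthDemo
{
    // Appends `count` characters one at a time and returns how many
    // times the builder's Capacity changed along the way.
    public static int CountCapacityChanges(int count)
    {
        var sb = new StringBuilder(); // starts with the default capacity
        int changes = 0;
        int last = sb.Capacity;
        for (int i = 0; i < count; i++)
        {
            sb.Append('x');
            if (sb.Capacity != last)
            {
                changes++;
                last = sb.Capacity;
            }
        }
        return changes;
    }

    public static void Main()
    {
        // Expect far fewer capacity changes than appends.
        Console.WriteLine(CountCapacityChanges(10000));
    }
}
```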

Maybe Jon Skeet can help us out here?

Answer by WebMatrix

Actually, strings have very few resemblances to value types. For starters, not all value types are immutable: you can change the value of an Int32 all you want and it would still be at the same address on the stack.

Strings are immutable for a very good reason. It has nothing to do with string being a reference type, but has a lot to do with memory management: it's just more efficient to create a new object when the string size changes than to shift things around on the managed heap. I think you're mixing together the concepts of value/reference types and immutable objects.
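A small sketch of that behaviour: "changing" a string produces a new object and leaves the old one alone. ImmutabilityDemo and AliasAfterAppend are hypothetical names used only here.

```csharp
using System;

public static class ImmutabilityDemo
{
    // Returns what a second reference sees after the first one is "changed".
    public static string AliasAfterAppend()
    {
        string original = "abc";
        string alias = original; // second reference to the same object
        original += "def";       // creates a new string; "abc" is untouched
        return alias;
    }

    public static void Main()
    {
        Console.WriteLine(AliasAfterAppend()); // abc
    }
}
```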

As for "==": like you said, "==" is an operator overload, and again it was implemented for a very good reason: to make the framework more useful when working with strings.

Answer by Denis Troller

It is mainly a performance issue.


Having strings behave LIKE a value type helps when writing code, but having them BE a value type would incur a huge performance hit.

For an in-depth look, take a peek at a nice article on strings in the .NET Framework.

Answer by Denis Troller

How can you tell string is a reference type? I'm not sure that it matters how it is implemented. Strings in C# are immutable precisely so that you don't have to worry about this issue.
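For what it's worth, the runtime will tell you directly if you ask through reflection, and nullability gives it away too. A minimal sketch (TypeKindDemo is just a name chosen for the example):

```csharp
using System;

public static class TypeKindDemo
{
    public static void Main()
    {
        Console.WriteLine(typeof(string).IsValueType); // False
        Console.WriteLine(typeof(string).IsClass);     // True
        Console.WriteLine(typeof(int).IsValueType);    // True

        string s = null;   // legal: a reference can be null
        // int i = null;   // would not compile: a plain value type cannot be null
        Console.WriteLine(s == null);                  // True
    }
}
```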

Answer by Bogdan_Ch

Strings are not the only immutable reference types. Multicast delegates are too. That is why it is safe to write

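protected void OnMyEventHandler()
{
     // Assumes MyEventHandler is declared as an EventHandler event.
     // Copy the delegate reference first; delegates are immutable, so this
     // snapshot stays valid even if a subscriber is removed afterwards.
     EventHandler handler = this.MyEventHandler;
     if (null != handler)
     {
        handler(this, EventArgs.Empty);
     }
}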
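Why the local copy helps can be shown in a few lines. This is a sketch under the assumption that the event is a plain EventHandler; the Publisher class and RaiseAfterUnsubscribe method are invented for the demonstration.

```csharp
using System;

public class Publisher
{
    public event EventHandler MyEventHandler;

    // Returns how many times the snapshot invoked the subscriber.
    public int RaiseAfterUnsubscribe()
    {
        int calls = 0;
        EventHandler subscriber = (sender, e) => calls++;
        MyEventHandler += subscriber;

        EventHandler handler = MyEventHandler; // snapshot of the current delegate
        MyEventHandler -= subscriber;          // the event field is now null

        // The snapshot is unchanged, because -= produced a new value
        // instead of modifying the delegate object that handler refers to.
        if (handler != null)
        {
            handler(this, EventArgs.Empty);
        }
        return calls;
    }

    public static void Main()
    {
        Console.WriteLine(new Publisher().RaiseAfterUnsubscribe()); // 1
    }
}
```

Had the code tested and invoked the event field directly, the unsubscribe in between would have left it null.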

I suppose that strings are immutable because this is the safest way to work with them and to allocate memory. Why are they not value types? Previous authors are right about stack size, etc. I would also add that making strings reference types saves on assembly size when you use the same constant string several times in the program. If you define

string s1 = "my string";
//some code here
string s2 = "my string";

Chances are that both occurrences of the "my string" constant will be allocated in your assembly only once.
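You can check the sharing of literals yourself with object.ReferenceEquals, and string.Intern shows the same pooling for strings built at run time. A small sketch (InternDemo and its helper methods are hypothetical names):

```csharp
using System;

public static class InternDemo
{
    // Two equal literals written in the same assembly.
    public static bool LiteralsShareObject()
    {
        string s1 = "my string";
        string s2 = "my string";
        return object.ReferenceEquals(s1, s2);
    }

    // A run-time copy is a separate object until both go through the intern pool.
    public static bool InternedCopiesShareObject()
    {
        string literal = "my string";
        string built = new string(literal.ToCharArray());
        return !object.ReferenceEquals(literal, built)
            && object.ReferenceEquals(string.Intern(literal), string.Intern(built));
    }

    public static void Main()
    {
        Console.WriteLine(LiteralsShareObject());
        Console.WriteLine(InternedCopiesShareObject()); // True
    }
}
```

This kind of sharing is only safe because strings are immutable: nobody holding s1 can change what s2 sees.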

If you would like to manage strings like a usual reference type, put the string inside a new StringBuilder(string s). Or use MemoryStreams.

If you are creating a library where you expect huge strings to be passed to your functions, define the parameter either as a StringBuilder or as a Stream.
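As a rough illustration of that suggestion, a library method that takes a StringBuilder can edit the caller's text in place instead of returning a new string. TextLibrary and AppendSignature are made-up names, not part of any real library.

```csharp
using System;
using System.Text;

public static class TextLibrary
{
    // Hypothetical library method: modifies the caller's builder
    // rather than producing a new string.
    public static void AppendSignature(StringBuilder text)
    {
        text.Append(" (sent from TextLibrary)");
    }

    public static void Main()
    {
        var message = new StringBuilder("A very long message");
        AppendSignature(message); // the same builder object is modified
        Console.WriteLine(message.ToString()); // A very long message (sent from TextLibrary)
    }
}
```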

Answer by BionicCyborg

Isn't it just as simple as strings being made up of character arrays? I look at strings as char[] arrays. They are on the heap because the reference is stored on the stack and points to the beginning of the array's memory on the heap. The string's size is not known before it is allocated... perfect for the heap.

That is why a string is really immutable: when you change it, even if the result is the same size, the compiler doesn't know that and has to allocate a new array and assign characters to its positions. It makes sense if you think of strings as a way for languages to protect you from having to allocate memory on the fly (read: C-like programming).
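The relationship between a string and a character array can be sketched like this. CharArrayDemo and ReplaceFirstChar are hypothetical names, and the helper assumes a non-empty input.

```csharp
using System;

public static class CharArrayDemo
{
    // Assumes `source` has at least one character.
    public static string ReplaceFirstChar(string source, char replacement)
    {
        char[] chars = source.ToCharArray(); // a copy of the characters
        chars[0] = replacement;              // arrays are mutable
        return new string(chars);            // a new string is allocated
    }

    public static void Main()
    {
        string s = "hello";
        string changed = ReplaceFirstChar(s, 'j');
        Console.WriteLine(s);       // hello
        Console.WriteLine(changed); // jello
    }
}
```

The original string never changes; only the copied array does, and the result is a brand-new string of the same length.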

Answer by JacquesB

The distinction between reference types and value types is basically a performance tradeoff in the design of the language. Reference types have some overhead on construction, destruction, and garbage collection, because they are created on the heap. Value types, on the other hand, have overhead on method calls (if the data size is larger than a pointer), because the whole object is copied rather than just a pointer. Because strings can be (and typically are) much larger than the size of a pointer, they are designed as reference types. Also, as Servy pointed out, the size of a value type must be known at compile time, which is not always the case for strings.
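The copy-versus-pointer difference is easy to see with two tiny types. PointValue, PointRef, and CopyDemo are hypothetical and exist only for this sketch.

```csharp
using System;

// Hypothetical types used only for this illustration.
public struct PointValue { public int X; }
public class PointRef { public int X; }

public static class CopyDemo
{
    public static void Bump(PointValue p) { p.X++; } // works on a copy
    public static void Bump(PointRef p) { p.X++; }   // works on the caller's object

    public static void Main()
    {
        var v = new PointValue { X = 1 };
        var r = new PointRef { X = 1 };
        Bump(v);
        Bump(r);
        Console.WriteLine(v.X); // 1
        Console.WriteLine(r.X); // 2
    }
}
```

For a struct the whole value crosses the call boundary, which is cheap for an int-sized type but would not be for something as large as a typical string.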

The question of mutability is a separate issue. Both reference types and value types can be either mutable or immutable. Value types are typically immutable though, since the semantics for mutable value types can be confusing.


Reference types are generally mutable, but can be designed as immutable if it makes sense. Strings are defined as immutable because it makes certain optimizations possible. For example, if the same string literal occurs multiple times in the same program (which is quite common), the compiler can reuse the same object.


So why is "==" overloaded to compare strings by text? Because it is the most useful semantics. If two strings are equal by text, they may or may not be the same object reference due to the optimizations. So comparing references is pretty useless, while comparing text is almost always what you want.

Speaking more generally, strings have what is termed value semantics. This is a more general concept than value types, which is a C#-specific implementation detail. Value types have value semantics, but reference types may also have value semantics. When a type has value semantics, you can't really tell if the underlying implementation is a reference type or value type, so you can consider that an implementation detail.
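To illustrate, here is a sketch of a user-defined reference type given value semantics in roughly the way string has them: immutable state plus text-based equality. The Name class is hypothetical and is not how System.String itself is implemented.

```csharp
using System;

// Hypothetical immutable reference type with value semantics.
public sealed class Name : IEquatable<Name>
{
    public string Text { get; }

    public Name(string text)
    {
        Text = text ?? throw new ArgumentNullException(nameof(text));
    }

    public bool Equals(Name other) => !ReferenceEquals(other, null) && Text == other.Text;
    public override bool Equals(object obj) => Equals(obj as Name);
    public override int GetHashCode() => Text.GetHashCode();

    public static bool operator ==(Name a, Name b) =>
        ReferenceEquals(a, null) ? ReferenceEquals(b, null) : a.Equals(b);
    public static bool operator !=(Name a, Name b) => !(a == b);
}

public static class ValueSemanticsDemo
{
    public static void Main()
    {
        var a = new Name("Ada");
        var b = new Name("Ada");
        Console.WriteLine(a == b);                       // True
        Console.WriteLine(object.ReferenceEquals(a, b)); // False
    }
}
```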

Answer by jinzai

At the risk of getting yet another mysterious down-vote... the reason many mention the stack and memory with respect to value types and primitive types is that they must fit into a register in the microprocessor. You cannot push or pop something to or from the stack if it takes more bits than a register has. The instructions are, for example, "pop eax", because eax is 32 bits wide on a 32-bit system.

Floating-point primitive types are handled by the FPU, which is 80 bits wide.


This was all decided long before there was an OOP language to obfuscate the definition of primitive type and I assume that value type is a term that has been created specifically for OOP languages.
