When to use symbols instead of strings in Ruby?

Notice: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/16621073/

Date: 2020-09-06 05:57:58  Source: igfitidea

Tags: ruby, symbols

Asked by Alan Coromano

If there are at least two instances of the same string in my script, should I instead use a symbol?

Answered by fotanus

TL;DR

A simple rule of thumb is to use symbols every time you need internal identifiers. For Ruby < 2.2 only use symbols when they aren't generated dynamically, to avoid memory leaks.

Full answer

The only reason not to use them for dynamically generated identifiers is memory consumption.

This question is very common because many programming languages don't have symbols, only strings, and thus strings are also used as identifiers in your code. You should be worrying about what symbols are meant to be, not only when you should use symbols. Symbols are meant to be identifiers. If you follow this philosophy, chances are that you will do things right.

There are several differences between the implementation of symbols and strings. The most important thing about symbols is that they are immutable. This means that their value never changes. Because of this, symbols are instantiated faster than strings, and some operations, such as comparing two symbols, are also faster.

The fact that a symbol is immutable allows Ruby to use the same object every time you reference the symbol, saving memory. So every time the interpreter reads :my_key it can take it from memory instead of instantiating it again. This is less expensive than initializing a new string every time.
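As a small illustration of the paragraph above, here is a sketch that compares object identity for symbols and for strings. The helper name `same_object?` is made up for this example, and `String.new` is used only to force two separately built strings.

```ruby
# Hypothetical helper: true only when both arguments are the very same object.
def same_object?(a, b)
  a.equal?(b) # equal? compares identity, not contents
end

# Every reference to :my_key yields the same object.
puts same_object?(:my_key, :my_key)

# Two strings with the same characters, built separately, are distinct objects.
first  = String.new("my_key")
second = String.new("my_key")
puts first == second             # same contents
puts same_object?(first, second) # but not the same object
```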

You can get a list of all symbols that are already instantiated with the method Symbol.all_symbols:

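symbols_count = Symbol.all_symbols.count # all_symbols is an array with all
                                         # instantiated symbols.
a = :one
puts a.object_id
# prints 167778 (the exact number varies between Ruby versions and runs)

a = :two
puts a.object_id
# prints 167858

a = :one
puts a.object_id
# prints 167778 again - the same object_id from the first time!

puts Symbol.all_symbols.count - symbols_count
# printed 2 for the author: the two symbols created above.
# (Run as a script, the parser may intern both literals before the
# first line executes, so the difference can be smaller.)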

For Ruby versions before 2.2, once a symbol is instantiated, its memory is never freed. The only way to free it is to restart the application. So symbols are also a major cause of memory leaks when used incorrectly. The simplest way to create a memory leak is to call to_sym on user input: since this data keeps changing, each new value permanently occupies a new portion of memory for the lifetime of the process. Ruby 2.2 introduced the symbol garbage collector, which frees dynamically generated symbols, so memory leaks caused by creating symbols dynamically are no longer a concern.
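One way to avoid calling to_sym on arbitrary input is to look the input up in a fixed table of known identifiers. This is only a sketch: the constant `ALLOWED_SORT_KEYS` and the method `sort_key_for` are hypothetical names invented for the example, not part of the original answer.

```ruby
# Hypothetical whitelist: the only identifiers this application understands.
ALLOWED_SORT_KEYS = {
  "name" => :name,
  "date" => :date
}.freeze

# Maps untrusted input to a known symbol instead of calling to_sym on it,
# so arbitrary input never creates new symbols.
def sort_key_for(user_input, default = :name)
  ALLOWED_SORT_KEYS.fetch(user_input.to_s, default)
end

puts sort_key_for("date").inspect
puts sort_key_for("something unexpected").inspect
```

Unknown input falls back to the default symbol, so the set of symbols stays bounded on any Ruby version.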

Answering your question:

Is it true that I have to use a symbol instead of a string if there are at least two identical strings in my application or script?

If what you are looking for is an identifier to be used internally in your code, you should use symbols. If you are printing output, you should go with strings, even if the same text appears more than once and is therefore allocated as two different objects in memory.
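The distinction can be sketched as follows: symbols identify things inside the code, while strings carry the text that is shown to the user. The names `STATUS_LABELS` and `status_message` are hypothetical and exist only for this illustration.

```ruby
# Hypothetical mapping from internal identifiers (symbols)
# to the text shown to the user (strings).
STATUS_LABELS = {
  open:   "Open",
  closed: "Closed"
}.freeze

# Builds user-facing output from an internal identifier.
def status_message(status)
  "This bug is #{STATUS_LABELS.fetch(status)}."
end

puts status_message(:open)
```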

Here's the reasoning:

  1. Printing symbols is slower than printing strings, because they are first converted to strings.
  2. Having lots of different symbols increases the overall memory usage of your application, since (before Ruby 2.2) they are never deallocated. And you never use all the strings in your code at the same time.

Use case by @AlanDert

@AlanDert: if I use something like %input{type: :checkbox} many times in haml code, what should I use as checkbox?

Me: Yes.

@AlanDert: But to print a symbol on an HTML page, it has to be converted to a string, doesn't it? What's the point of using it then?

What is the type of an input? An identifier of the type of input you want to use or something you want to show to the user?

It is true that it will become HTML code at some point, but at the moment you are writing that line of code, it is meant to be an identifier: it identifies what kind of input field you need. It is used over and over again in your code, always with the same "string" of characters as the identifier, so it won't generate a memory leak.

That said, why don't we evaluate the data to see if strings are faster?

This is a simple benchmark I created for this:

require 'benchmark'
require 'haml'

str = Benchmark.measure do
  10_000.times do
    Haml::Engine.new('%input{type: "checkbox"}').render
  end
end.total

sym = Benchmark.measure do
  10_000.times do
    Haml::Engine.new('%input{type: :checkbox}').render
  end
end.total

puts "String: " + str.to_s
puts "Symbol: " + sym.to_s

Three outputs:

# first time
String: 5.14
Symbol: 5.07
# second time
String: 5.29
Symbol: 5.050000000000001
# third time
String: 4.7700000000000005
Symbol: 4.68

So using symbols is actually a bit faster than using strings. Why is that? It depends on the way HAML is implemented. I would need to dig into the HAML code to see, but if you keep using symbols for what is conceptually an identifier, your application will be faster and more reliable. When questions strike, benchmark it and get your answers.

Answered by Boris Stitnicky

Put simply, a symbol is a name, composed of characters, but immutable. A string, on the contrary, is an ordered container for characters, whose contents are allowed to change.

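A short sketch of that difference: a string's contents can be changed in place, while a symbol is frozen and offers no in-place modification methods. The helper `add_suffix!` is a hypothetical name used only here.

```ruby
# Hypothetical helper: appends a suffix to a string in place and returns it.
def add_suffix!(text, suffix)
  text << suffix # mutates the receiver
end

name = String.new("foo")
add_suffix!(name, "bar")
puts name # the same string object now holds new contents

puts :foo.frozen?          # symbols are always frozen
puts :foo.respond_to?(:<<) # and have no append method
```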
Answered by Arun Kumar M

  1. A Ruby symbol is an object with O(1) comparison

To compare two strings, we potentially need to look at every character. For two strings of length N, this will require N+1 comparisons (which computer scientists refer to as "O(N) time").

def string_comp str1, str2
  return false if str1.length != str2.length
  for i in 0...str1.length
    return false if str1[i] != str2[i]
  end
  return true
end
string_comp "foo", "foo"

But since every appearance of :foo refers to the same object, we can compare symbols by looking at object IDs. We can do this with a single comparison (which computer scientists refer to as "O(1) time").

def symbol_comp sym1, sym2
  sym1.object_id == sym2.object_id
end
symbol_comp :foo, :foo

  2. A Ruby symbol is a label in a free-form enumeration

In C++, we can use "enumerations" to represent families of related constants:

enum BugStatus { OPEN, CLOSED };
BugStatus original_status = OPEN;
BugStatus current_status  = CLOSED;

But because Ruby is a dynamic language, we don't worry about declaring a BugStatus type, or keeping track of the legal values. Instead, we represent the enumeration values as symbols:

original_status = :open
current_status  = :closed
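Ruby does not require it, but if you do want to check enumeration values, one possible sketch is shown below. The constant `BUG_STATUSES` and both methods are hypothetical names, not part of the original answer.

```ruby
# Hypothetical list of legal status values.
BUG_STATUSES = [:open, :closed].freeze

# Returns the status, or raises ArgumentError if it is not a known value.
def validate_status(status)
  return status if BUG_STATUSES.include?(status)
  raise ArgumentError, "unknown status: #{status.inspect}"
end

# Symbols work naturally in case expressions.
def status_description(status)
  case validate_status(status)
  when :open   then "still being worked on"
  when :closed then "resolved"
  end
end

puts status_description(:open)
```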

  3. A Ruby symbol is a constant, unique name

In Ruby, we can change the contents of a string:

"foo"[0] = ?b # "boo"

But we can't change the contents of a symbol:

:foo[0]  = ?b # Raises an error

  4. A Ruby symbol is the keyword for a keyword argument

When passing keyword arguments to a Ruby function, we specify the keywords using symbols:

# Build a URL for 'bug' using Rails.
url_for :controller => 'bug',
        :action => 'show',
        :id => bug.id

  5. A Ruby symbol is an excellent choice for a hash key

Typically, we'll use symbols to represent the keys of a hash table:

options = {}
options[:auto_save]     = true
options[:show_comments] = false
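One thing to keep in mind with symbol keys is that a symbol and a string with the same characters are different keys. The sketch below shows this, along with a hypothetical helper `symbolize_keys` that converts string keys from a known, trusted source. It assumes Ruby 2.5 or later for `Hash#transform_keys`.

```ruby
options = { auto_save: true, show_comments: false }

# Symbol and string keys are different keys.
puts options[:auto_save].inspect
puts options["auto_save"].inspect # no such key in this hash

# Hypothetical helper: convert string keys to symbols.
# Intended for a small, trusted set of keys, not arbitrary user input.
def symbolize_keys(hash)
  hash.transform_keys(&:to_sym)
end

external = { "auto_save" => true }
puts symbolize_keys(external)[:auto_save].inspect
```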

Answered by Yurii

Here is a nice strings vs. symbols benchmark I found on Codecademy:

require 'benchmark'

string_AZ = Hash[("a".."z").to_a.zip((1..26).to_a)]
symbol_AZ = Hash[(:a..:z).to_a.zip((1..26).to_a)]

string_time = Benchmark.realtime do
  1_000_000.times { string_AZ["r"] }
end

symbol_time = Benchmark.realtime do
  1_000_000.times { symbol_AZ[:r] }
end

puts "String time: #{string_time} seconds."
puts "Symbol time: #{symbol_time} seconds."

The output is:

String time: 0.21983 seconds.
Symbol time: 0.087873 seconds.

Answered by Oshan Wisumperuma

  • use symbols as hash key identifiers

    {key: "value"}

  • symbols as keyword arguments let you pass the arguments in any order

     def write(file:, data:, mode: "ascii")
          # removed for brevity
     end
     write(data: 123, file: "test.txt")

  • freeze a string to keep it as a string and save memory

    label = 'My Label'.freeze
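
A minimal sketch of what freeze does to a string: the value stays a string, but in-place modification is rejected. The constant `LABEL` and the method `try_to_modify` are hypothetical names, and the example assumes Ruby 2.5 or later, where `FrozenError` exists.

```ruby
# A frozen string stays a string but cannot be modified in place.
LABEL = 'My Label'.freeze

# Hypothetical helper: reports whether an in-place append was allowed.
def try_to_modify(text)
  text << '!'
  :modified
rescue FrozenError
  :rejected
end

puts LABEL.frozen?
puts try_to_modify(LABEL).inspect
puts try_to_modify(String.new('My Label')).inspect
```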