Ruby on Rails: Is #inject on hashes considered good style?
Source: http://stackoverflow.com/questions/3230863/
License: CC BY-SA 4.0. You are free to use and share this content, but you must attribute it to the original authors on Stack Overflow.
Asked by averell
Inside the Rails code, people tend to use the Enumerable#inject method to create hashes, like this:
some_enum.inject({}) do |hash, element|
  hash[element.foo] = element.bar
  hash
end
While this appears to have become a common idiom, does anyone see an advantage over the "naive" version, which would go like:
hash = {}
some_enum.each { |element| hash[element.foo] = element.bar }
The only advantage I see for the first version is that you do it in a closed block and you don't (explicitly) initialize the hash. Otherwise it abuses a method unexpectedly, is harder to understand and harder to read. So why is it so popular?
Accepted answer by Aidan Cully
Beauty is in the eye of the beholder. Those with some functional programming background will probably prefer the inject-based method (as I do), because it has the same semantics as the fold higher-order function, which is a common way of calculating a single result from multiple inputs. If you understand inject, then you should understand that the function is being used as intended.
As one reason why this approach seems better (to my eyes), consider the lexical scope of the hash variable. In the inject-based method, hash only exists within the body of the block. In the each-based method, the hash variable inside the block needs to agree with some execution context defined outside the block. Want to define another hash in the same function? Using the inject method, it's possible to cut-and-paste the inject-based code and use it directly, and it almost certainly won't introduce bugs (ignoring whether one should use C&P during editing; people do). Using the each method, you need to C&P the code and rename the hash variable to whatever name you wanted to use. The extra step means this is more prone to error.
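The scoping argument can be sketched as follows. Item, items, and the variable names are hypothetical stand-ins for some_enum and its elements; they do not come from the question.

```ruby
# Hypothetical element type standing in for the objects in some_enum.
Item = Struct.new(:foo, :bar)
items = [Item.new(:a, 1), Item.new(:b, 2)]

# inject: the accumulator (acc) is a block parameter, so the name
# exists only inside the block and nothing leaks into the method body.
by_foo = items.inject({}) do |acc, element|
  acc[element.foo] = element.bar
  acc
end

# each: the block closes over a variable that must already exist
# outside it, under exactly the name used inside the block.
by_bar = {}
items.each { |element| by_bar[element.bar] = element.foo }

p by_foo == { a: 1, b: 2 }         # true
p by_bar == { 1 => :a, 2 => :b }   # true
p defined?(acc)                    # nil: acc is not visible out here
```

Copying the inject form a second time only requires changing the name on the left of the assignment, whereas the each form has the target name in two places that must stay in sync.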
Answer by fearless_fool
As Aleksey points out, Hash#update() is slower than Hash#store(), but that got me thinking about the overall efficiency of #inject() vs a straight #each loop, so I benchmarked a few things:
require 'benchmark'

module HashInject
  extend self

  # 1000 [symbol, integer] pairs, e.g. [:s00000, 0]
  PAIRS = 1000.times.map { |i| [sprintf("s%05d", i).to_sym, i] }

  # inject yields (memo, element), so each pair is destructured with (sym, val)
  def inject_store
    PAIRS.inject({}) { |hash, (sym, val)| hash[sym] = val; hash }
  end

  def inject_update
    PAIRS.inject({}) { |hash, (sym, val)| hash.update(sym => val) }
  end

  def each_store
    hash = {}
    PAIRS.each { |sym, val| hash[sym] = val }
    hash
  end

  def each_update
    hash = {}
    PAIRS.each { |sym, val| hash.update(sym => val) }
    hash
  end

  def each_with_object_store
    PAIRS.each_with_object({}) { |pair, hash| hash[pair[0]] = pair[1] }
  end

  def each_with_object_update
    PAIRS.each_with_object({}) { |pair, hash| hash.update(pair[0] => pair[1]) }
  end

  def by_initialization
    Hash[PAIRS]
  end

  def tap_store
    {}.tap { |hash| PAIRS.each { |sym, val| hash[sym] = val } }
  end

  def tap_update
    {}.tap { |hash| PAIRS.each { |sym, val| hash.update(sym => val) } }
  end

  N = 10000

  # bmbm runs a rehearsal pass first, then the measured pass
  Benchmark.bmbm do |x|
    x.report("inject_store") { N.times { inject_store } }
    x.report("inject_update") { N.times { inject_update } }
    x.report("each_store") { N.times { each_store } }
    x.report("each_update") { N.times { each_update } }
    x.report("each_with_object_store") { N.times { each_with_object_store } }
    x.report("each_with_object_update") { N.times { each_with_object_update } }
    x.report("by_initialization") { N.times { by_initialization } }
    x.report("tap_store") { N.times { tap_store } }
    x.report("tap_update") { N.times { tap_update } }
  end
end
And the results, measured in the environment listed further down:
Rehearsal -----------------------------------------------------------
inject_store             10.510000   0.120000  10.630000 ( 10.659169)
inject_update             8.490000   0.190000   8.680000 (  8.696176)
each_store                4.290000   0.110000   4.400000 (  4.414936)
each_update              12.800000   0.340000  13.140000 ( 13.188187)
each_with_object_store    5.250000   0.110000   5.360000 (  5.369417)
each_with_object_update  13.770000   0.340000  14.110000 ( 14.166009)
by_initialization         3.040000   0.110000   3.150000 (  3.166201)
tap_store                 4.470000   0.110000   4.580000 (  4.594880)
tap_update               12.750000   0.340000  13.090000 ( 13.114379)
------------------------------------------------- total: 77.140000sec

                              user     system      total        real
inject_store             10.540000   0.110000  10.650000 ( 10.674739)
inject_update             8.620000   0.190000   8.810000 (  8.826045)
each_store                4.610000   0.110000   4.720000 (  4.732155)
each_update              12.630000   0.330000  12.960000 ( 13.016104)
each_with_object_store    5.220000   0.110000   5.330000 (  5.338678)
each_with_object_update  13.730000   0.340000  14.070000 ( 14.102297)
by_initialization         3.010000   0.100000   3.110000 (  3.123804)
tap_store                 4.430000   0.110000   4.540000 (  4.552919)
tap_update               12.850000   0.330000  13.180000 ( 13.217637)
=> true
Enumerable#each is faster than Enumerable#inject, and Hash#store is faster than Hash#update. But the fastest of all is to pass an array in at initialization time:
Hash[PAIRS]
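As a small illustration of building the hash in one step, the sketch below uses made-up pairs in place of PAIRS. Array#to_h is not mentioned in this answer and needs Ruby 2.1 or later, so it would not have run on the 2.0.0 interpreter used for the benchmark; it is shown only as a likely modern equivalent.

```ruby
# Hypothetical sample data with the same shape as PAIRS: [key, value] arrays.
pairs = [[:s00000, 0], [:s00001, 1], [:s00002, 2]]

# Hash[] accepts an array of two-element arrays and builds the hash directly.
from_brackets = Hash[pairs]

# Array#to_h (Ruby 2.1+) does the same conversion.
from_to_h = pairs.to_h

p from_brackets == { s00000: 0, s00001: 1, s00002: 2 }  # true
p from_brackets == from_to_h                            # true
```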
If you're adding elements after the hash has been created, the winning version is exactly what the OP was suggesting:
hash = {}
PAIRS.each {|sym, val| hash[sym] = val }
hash
But in that case, if you're a purist who wants a single lexical form, you can use #tap and #each and get the same speed:
{}.tap {|hash| PAIRS.each {|sym, val| hash[sym] = val}}
For those not familiar with tap, it creates a binding of the receiver (the new hash) inside the body, and finally returns the receiver (the same hash). If you know Lisp, think of it as Ruby's version of LET binding.
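A minimal sketch of what tap does, using made-up pairs in place of PAIRS: the receiver is yielded to the block, the block's own value is thrown away, and the receiver comes back as the result.

```ruby
# Hypothetical sample data with the same shape as PAIRS.
pairs = [[:a, 1], [:b, 2], [:c, 3]]

# The new hash is bound to h inside the block and returned afterwards.
result = {}.tap { |h| pairs.each { |sym, val| h[sym] = val } }

# The block's value is discarded: this evaluates to 5, not 6.
five = 5.tap { |n| n + 1 }

p result == { a: 1, b: 2, c: 3 }  # true
p five                            # 5
```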
Since people have asked, here's the testing environment:
# Ruby version    ruby 2.0.0p247 (2013-06-27) [x86_64-darwin12.4.0]
# OS              Mac OS X 10.9.2
# Processor/RAM   2.6GHz Intel Core i7 / 8GB 1067 MHz DDR3
Answer by fearless_fool
inject (aka reduce) has a long and respected place in functional programming languages. If you're ready to take the plunge, and want to understand a lot of Matz's inspiration for Ruby, you should read the seminal Structure and Interpretation of Computer Programs, available online at http://mitpress.mit.edu/sicp/.
Some programmers find it stylistically cleaner to have everything in one lexical package. In your hash example, using inject means you don't have to create an empty hash in a separate statement. What's more, the inject statement returns the result directly -- you don't have to remember that it's in the hash variable. To make that really clear, consider:
[1, 2, 3, 5, 8].inject(:+)
vs
total = 0
[1, 2, 3, 5, 8].each {|x| total += x}
The first version returns the sum. The second version stores the sum in total, and as a programmer, you have to remember to use total rather than the value returned by the .each statement.
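To make the difference in return values concrete, here is a short sketch with the same numbers; the variable names sum and returned are only for illustration.

```ruby
numbers = [1, 2, 3, 5, 8]

# inject: the expression itself evaluates to the sum.
sum = numbers.inject(:+)

# each: the sum lives in total; each returns its receiver (the array).
total = 0
returned = numbers.each { |x| total += x }

p sum       # 19
p total     # 19
p returned  # [1, 2, 3, 5, 8]
```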
One tiny addendum (purely idiomatic, and not about inject): your example might be better written:
some_enum.inject({}) {|hash, element| hash.update(element.foo => element.bar) }
...since hash.update() returns the hash itself, you don't need the extra hash statement at the end.
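A quick way to check that claim, assuming a hypothetical Item struct in place of the elements of some_enum: the object that inject returns is the very hash that was passed in as the seed.

```ruby
# Hypothetical element type standing in for the objects in some_enum.
Item = Struct.new(:foo, :bar)
items = [Item.new(:a, 1), Item.new(:b, 2)]

seed = {}
result = items.inject(seed) { |hash, element| hash.update(element.foo => element.bar) }

p result == { a: 1, b: 2 }  # true
p result.equal?(seed)       # true: update mutates and returns its receiver
```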
Update
@Aleksey has shamed me into benchmarking the various combinations. See my benchmarking reply elsewhere here. Short form:
hash = {}
some_enum.each {|x| hash[x.foo] = x.bar}
hash
is the fastest, but can be recast slightly more elegantly -- and it's just as fast -- as:
{}.tap {|hash| some_enum.each {|x| hash[x.foo] = x.bar}}
Answer by Alexey
I have just found in "Ruby inject with initial being a hash" a suggestion to use each_with_object instead of inject:
hash = some_enum.each_with_object({}) do |element, h|
  h[element.foo] = element.bar
end
Seems natural to me.
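A runnable sketch of the same idea, with a hypothetical Item struct standing in for the elements of some_enum. Note that the block parameters come in the opposite order from inject, and that the block's value is ignored, so there is no trailing line returning the hash.

```ruby
# Hypothetical element type standing in for the objects in some_enum.
Item = Struct.new(:foo, :bar)
items = [Item.new(:a, 1), Item.new(:b, 2)]

# each_with_object yields (element, memo) and returns the memo object,
# whatever the block itself evaluates to.
result = items.each_with_object({}) do |element, h|
  h[element.foo] = element.bar
end

p result == { a: 1, b: 2 }  # true
```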
Another way, using tap:
hash = {}.tap do |h|
  some_enum.each do |element|
    h[element.foo] = element.bar
  end
end
Answer by Rodel30
If you are returning a hash, using merge can keep it cleaner so you don't have to return the hash afterward.
some_enum.inject({}){|h,e| h.merge(e.foo => e.bar) }
If your enum is a hash, you can get key and value nicely with the (k,v).
some_hash.inject({}){|h,(k,v)| h.merge(k => do_something(v)) }
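Here is a small sketch of the (k, v) destructuring with made-up data; prices and the doubling step are hypothetical stand-ins for some_hash and do_something. It also shows that merge, unlike update, leaves its receiver untouched and returns a new hash on every step.

```ruby
# Hypothetical input hash; doubling stands in for do_something(v).
prices = { apple: 1, pear: 2 }

# Each element of a hash is a [key, value] pair; (k, v) unpacks it.
doubled = prices.inject({}) { |h, (k, v)| h.merge(k => v * 2) }

# merge is non-destructive: the seed stays empty and the result
# is a different object.
seed = {}
copy = prices.inject(seed) { |h, (k, v)| h.merge(k => v) }

p doubled == { apple: 2, pear: 4 }  # true
p seed.empty?                       # true
p copy.equal?(seed)                 # false
```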
Answer by Matt Briggs
I think it has to do with people not fully understanding when to use reduce. I agree with you: each is the way it should be.

