Original source: http://stackoverflow.com/questions/29246440/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me) on Stack Overflow.
Apache-Spark: What is map(_._2) shorthand for?
Asked by chenzhongpu
While reading a project's source code, I found:
val sampleMBR = inputMBR.map(_._2).sample
inputMBR is a collection (an RDD) of tuples.
The definition of the map function is:
def map[U: ClassTag](f: T => U): RDD[U]
It seems that map(_._2) is shorthand for map(x => x._2).
Can anyone tell me the rules for this shorthand?
Answered by Holden
The _ syntax can be a bit confusing. When _ is used on its own, it represents an argument of the anonymous function, and each additional _ stands for a further argument. So if we are working on pairs:
map(_._2 + _._2) would be shorthand for map((x, y) => x._2 + y._2). When _ is used as part of a function name (or value name), it has no special meaning. In this case, x._2 returns the second element of a tuple (assuming x is a tuple).
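As a rough illustration of the expansion described above, the following plain-Scala sketch (no Spark required) compares the placeholder form with the spelled-out lambda. The object name PlaceholderDemo and the sample data are made up for this example. Because map itself takes a one-parameter function, the two-underscore form is shown here as a standalone function value rather than as an argument to map.

```scala
object PlaceholderDemo {
  // Hypothetical sample data: a small collection of (name, count) pairs.
  val pairs: Vector[(String, Int)] = Vector(("a", 1), ("b", 2), ("c", 3))

  // One underscore: a one-parameter function, which is what map expects.
  val shortForm: Vector[Int] = pairs.map(_._2)
  val longForm: Vector[Int] = pairs.map(x => x._2)

  // Two underscores: each one stands for a separate parameter, so this is
  // the two-parameter function (x, y) => x._2 + y._2.
  val addSeconds: ((String, Int), (String, Int)) => Int = _._2 + _._2

  def main(args: Array[String]): Unit = {
    println(shortForm == longForm)          // the two forms agree
    println(addSeconds(pairs(0), pairs(1))) // adds the second components: 1 + 2
  }
}
```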
Answered by marekinfo
collection.map(_._2) emits the second component of each tuple. Example from pure Scala (Spark RDDs work the same way):
scala> val zipped = (1 to 10).zip('a' to 'j')
zipped: scala.collection.immutable.IndexedSeq[(Int, Char)] = Vector((1,a), (2,b), (3,c), (4,d), (5,e), (6,f), (7,g), (8,h), (9,i), (10,j))
scala> val justLetters = zipped.map(_._2)
justLetters: scala.collection.immutable.IndexedSeq[Char] = Vector(a, b, c, d, e, f, g, h, i, j)
Answered by u290629
The two underscores in '_._2' are different.
The first '_' is the placeholder for the anonymous function's parameter; the second, in '_2', is part of the name of a member of the case class Tuple.
Something like:
case class Tuple3[+T1, +T2, +T3](_1: T1, _2: T2, _3: T3) {...}
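To make the distinction concrete, here is a small sketch in which a user-defined case class has fields named like a tuple's. MyPair and TupleMemberDemo are hypothetical names introduced only for this illustration; they are not part of the Scala library.

```scala
object TupleMemberDemo {
  // A hypothetical case class whose fields are named like a tuple's.
  // The leading underscore in _1 and _2 is simply part of the field name.
  case class MyPair[A, B](_1: A, _2: B)

  val custom: MyPair[Int, Char] = MyPair(1, 'a')
  val builtin: (Int, Char, Double) = (1, 'a', 2.0) // a scala.Tuple3

  // Plain field access by name; no anonymous function is involved.
  val fromCustom: Char = custom._2
  val fromBuiltin: Char = builtin._2

  // Here the first underscore is the placeholder; _2 is still the field.
  val letters: List[Char] = List(custom, MyPair(2, 'b')).map(_._2)

  def main(args: Array[String]): Unit = {
    println(fromCustom == fromBuiltin)
    println(letters)
  }
}
```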
Answered by Nihat Hosgur
The first '_' refers to the element being mapped. Since that element is a tuple, you can call any of the tuple's methods on it, and one of those methods is '_2'. So map(_._2) tells us to transform each input into its second component.
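A brief sketch of the same idea with other tuple members, using plain Scala collections. The object name TupleMethodsDemo and the sample pairs are assumptions made for the example; _1, _2 and swap are standard members of Tuple2.

```scala
object TupleMethodsDemo {
  // Hypothetical sample data.
  val pairs: List[(Int, String)] = List((1, "one"), (2, "two"))

  // The placeholder stands for each tuple; any tuple member can follow it.
  val firsts: List[Int] = pairs.map(_._1)              // first component
  val seconds: List[String] = pairs.map(_._2)          // second component
  val swapped: List[(String, Int)] = pairs.map(_.swap) // components exchanged

  def main(args: Array[String]): Unit = {
    println(firsts)
    println(seconds)
    println(swapped)
  }
}
```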
Answered by chenzhongpu
I have found the solution.
First, the underscore here is a placeholder.
To make a function literal even more concise, you can use underscores as placeholders for one or more parameters, so long as each parameter appears only one time within the function literal.
See more about the underscore in Scala at "What are all the uses of an underscore in Scala?".
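The "only one time" condition in the quoted rule can be sketched as follows. This is a minimal example with made-up names (PlaceholderRuleDemo, numbers), showing when a placeholder is enough and when a named parameter is needed.

```scala
object PlaceholderRuleDemo {
  // Hypothetical sample data.
  val numbers: List[Int] = List(1, 2, 3, 4)

  // A parameter used exactly once can be written as a placeholder.
  val doubled: List[Int] = numbers.map(_ * 2)

  // Each underscore is a new parameter: _ * _ means (a, b) => a * b,
  // which fits reduce because reduce expects a two-parameter function.
  val product: Int = numbers.reduce(_ * _)

  // To use the same parameter twice, give it a name. Writing _ * _ here
  // would not compile, since map expects a one-parameter function.
  val squares: List[Int] = numbers.map(x => x * x)

  def main(args: Array[String]): Unit = {
    println(doubled)
    println(product)
    println(squares)
  }
}
```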

