Note: this page is derived from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/29246440/

Apache-Spark : What is map(_._2) shorthand for?

scala, apache-spark

Asked by chenzhongpu

While reading a project's source code, I found:

val sampleMBR = inputMBR.map(_._2).sample

inputMBR is a tuple.

The definition of the function map is:

def map[U: ClassTag](f: T => U): RDD[U]

It seems that map(_._2) is shorthand for map(x => (x._2)).

Can anyone tell me the rules for this shorthand?

Answered by Holden

The _ syntax can be a bit confusing. When _ is used on its own, it represents an argument of the anonymous function, and each _ stands for a different argument. So if we are working on pairs, map(_._2 + _._2) would be shorthand for map((x, y) => x._2 + y._2), which only fits where a two-argument function is expected. When _ is used as part of a function name (or value name), it has no special meaning. In this case, x._2 returns the second element of a tuple (assuming x is a tuple).
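
The expansion described above can be checked in plain Scala. This is a minimal sketch, not Spark code: the names PlaceholderDemo, pairs and addSeconds are made up for illustration, and an ordinary List stands in for an RDD.

```scala
object PlaceholderDemo {
  // Hypothetical sample data: a list of (Int, String) pairs.
  val pairs = List((1, "a"), (2, "b"), (3, "c"))

  // One underscore: a one-parameter function, the same as x => x._2.
  val short = pairs.map(_._2)
  val long  = pairs.map(x => x._2)

  // Two underscores: a two-parameter function. The declared type tells
  // the compiler what each placeholder is; this is (x, y) => x._2 + y._2.
  val addSeconds: ((String, Int), (String, Int)) => Int = _._2 + _._2

  def main(args: Array[String]): Unit = {
    println(short)                          // List(a, b, c)
    println(short == long)                  // true
    println(addSeconds(("a", 1), ("b", 2))) // 3
  }
}
```

Because map expects a function of one argument, the two-underscore form is shown here as a standalone function value rather than being passed to map.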

Answered by marekinfo

collection.map(_._2) emits the second component of each tuple. An example in plain Scala (Spark RDDs work the same way):

scala> val zipped = (1 to 10).zip('a' to 'j')
zipped: scala.collection.immutable.IndexedSeq[(Int, Char)] = Vector((1,a), (2,b), (3,c), (4,d), (5,e), (6,f), (7,g), (8,h), (9,i), (10,j))

scala> val justLetters = zipped.map(_._2)
justLetters: scala.collection.immutable.IndexedSeq[Char] = Vector(a, b, c, d, e, f, g, h, i, j)

Answered by u290629

The two underscores in '_._2' are different.

The first '_' is the placeholder of an anonymous function; the second, '_2', is a member of the case class Tuple.

Something like:

case class Tuple3[T1, T2, T3](_1: T1, _2: T2, _3: T3) {...}
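
To see that _2 is just an ordinary member name, here is a small sketch using a hand-written case class. MyPair and TupleMemberDemo are hypothetical names for illustration and are not part of the standard library; the built-in tuple is included only for comparison.

```scala
object TupleMemberDemo {
  // Hypothetical stand-in for a tuple: _1 and _2 are plain field names.
  case class MyPair[A, B](_1: A, _2: B)

  val items = List(MyPair(1, 'a'), MyPair(2, 'b'))

  // The leading _ is the placeholder; ._2 is a normal field access.
  val seconds = items.map(_._2)

  // The same expression on built-in tuples.
  val tupleSeconds = List((1, 'a'), (2, 'b')).map(_._2)

  def main(args: Array[String]): Unit = {
    println(seconds)                 // List(a, b)
    println(seconds == tupleSeconds) // true
  }
}
```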

Answered by Nihat Hosgur

The first '_' refers to the element being mapped. Since each element is a tuple, you can call any of the tuple's methods on it, and one of those methods is '_2'. So map(_._2) tells us to transform each input into its second attribute.

Answered by chenzhongpu

I have found the solution.

First, the underscore here is a placeholder.

To make a function literal even more concise, you can use underscores as placeholders for one or more parameters, so long as each parameter appears only one time within the function literal.

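
The "appears only one time" condition is the part that tends to trip people up, so here is a short sketch of it in plain Scala. The object and value names are made up for illustration.

```scala
object PlaceholderOnceDemo {
  val nums = List(1, 2, 3)

  // The parameter is used once, so a placeholder works.
  val incremented = nums.map(_ + 1)

  // The parameter is used twice, so it needs a name:
  // writing _ * _ here would mean a two-parameter function instead.
  val squares = nums.map(x => x * x)

  // Two placeholders, two parameters: the same as (a, b) => a * b.
  val product = nums.reduce(_ * _)

  def main(args: Array[String]): Unit = {
    println(incremented) // List(2, 3, 4)
    println(squares)     // List(1, 4, 9)
    println(product)     // 6
  }
}
```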

See more about the underscore in Scala at "What are all the uses of an underscore in Scala?".