
Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use or share it, you must comply with the same license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/40281840/


(Spark) object {name} is not a member of package org.apache.spark.ml

Tags: scala, apache-spark, sbt, apache-spark-mllib

Asked by Yusata

I'm trying to run a self-contained Scala application on Apache Spark, based on the example here: http://spark.apache.org/docs/latest/ml-pipeline.html

Here's my complete code:


import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.linalg.{Vector, Vectors}
import org.apache.spark.ml.param.ParamMap
import org.apache.spark.sql.Row

object mllibexample1 {
  def main(args: Array[String]) {
    val spark = SparkSession
      .builder()
      .master("local[*]")
      .appName("logistic regression example 1")
      .getOrCreate()


    val training = spark.createDataFrame(Seq(
      (1.0, Vectors.dense(0.0, 1.1, 0.1)),
      (0.0, Vectors.dense(2.0, 1.0, -1.0)),
      (0.0, Vectors.dense(2.0, 1.3, 1.0)),
      (1.0, Vectors.dense(0.0, 1.2, -0.5))
    )).toDF("label", "features")

    val lr = new LogisticRegression()

    println("LogisticRegression parameters:\n" + lr.explainParams() + "\n")

    lr.setMaxIter(100)
      .setRegParam(0.01)

    val model1 = lr.fit(training)

    println("Model 1 was fit using parameters: " + model1.parent.extractParamMap)
  }
}

Dependencies in build.sbt:


name := "example"
version := "1.0.0"
scalaVersion := "2.11.8"
libraryDependencies ++= Seq(
    "org.apache.spark" %% "spark-core" % "2.0.1",
    "org.apache.spark" %% "spark-sql" % "2.0.1",
    "org.apache.spark" %% "spark-mllib-local" % "2.0.1",
    "com.github.fommil.netlib" % "all" % "1.1.2"
  )

However, after running the program in the sbt shell, I got the following error:

[info] Compiling 1 Scala source to /dataplatform/example/target/scala-2.11/classes...
[error] /dataplatform/example/src/main/scala/mllibexample1.scala:1: object classification is not a member of package org.apache.spark.ml
[error] import org.apache.spark.ml.classification.LogisticRegression
[error]                            ^
[error] /dataplatform/example/src/main/scala/mllibexample1.scala:3: object param is not a member of package org.apache.spark.ml
[error] import org.apache.spark.ml.param.ParamMap
[error]                            ^
[error] /dataplatform/example/src/main/scala/mllibexample1.scala:8: not found: value SparkSession
[error]     val spark = SparkSession
[error]                 ^
[error] /dataplatform/example/src/main/scala/mllibexample1.scala:22: not found: type LogisticRegression
[error]     val lr = new LogisticRegression()

I can successfully run this code in the Spark interactive shell. Did I miss something in the *.sbt file?

Thanks, Bayu


Answered by

You missed the MLlib dependency:

"org.apache.spark" %% "spark-mllib" % "2.0.1"

The spark-mllib-local artifact alone is not enough.
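For reference, here is a sketch of what the full dependency list could look like after this fix, assuming Spark 2.0.1 and Scala 2.11 as in the question. The spark-mllib artifact depends on spark-mllib-local transitively, so that entry can be dropped. Note also that the program itself still needs import org.apache.spark.sql.SparkSession, as the "not found: value SparkSession" compile error indicates.

```scala
// build.sbt (sketch): spark-mllib replaces spark-mllib-local and
// transitively pulls in spark-core and spark-mllib-local
libraryDependencies ++= Seq(
    "org.apache.spark" %% "spark-core"  % "2.0.1",
    "org.apache.spark" %% "spark-sql"   % "2.0.1",
    "org.apache.spark" %% "spark-mllib" % "2.0.1",
    "com.github.fommil.netlib" % "all" % "1.1.2"
  )
```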

Answered by Rajkumar S

I had the same issue in a Maven Scala project.

I used the Maven dependency below; after adding it, the issue was resolved.

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-mllib_2.11</artifactId>
            <version>2.0.2</version>
        </dependency>
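In sbt, the same artifact is selected with the %% operator, which appends the Scala binary version suffix (here _2.11) to the artifact name automatically. A sketch of the equivalent sbt line:

```scala
// Equivalent sbt dependency: with scalaVersion := "2.11.x",
// %% resolves this to the spark-mllib_2.11 artifact
libraryDependencies += "org.apache.spark" %% "spark-mllib" % "2.0.2"
```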