Note: this page is derived from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/43102537/

No content to map due to end-of-input when parsing json

json scala apache-spark playframework

Asked by xyin

I was using the Play JSON library to parse JSON data in Spark and got the following error message. Does anyone have a clue about the possible cause of this error? If it is due to a bad JSON record, how can I identify the bad record? Thanks!

Here is the main part of the script I used to parse the JSON data:

import play.api.libs.json._
val jsonData = distdata.map(line => Json.parse(line)) //line 194 of script parseJson_v14.scala
val filteredData = jsonData.map(json => (json \ "QueryStringParameters" \ "pr").asOpt[String].orNull).countByValue()

The variable distdata is an RDD of JSON data in text format, and jsonData is an RDD of JsValue data. Since Spark transformations are lazy, the error did not surface until the second command, which creates filteredData, was executed. According to the error message, the error comes from the first command, where I create jsonData.

[2017-03-29 14:55:39.616]-[Logging$class.logWarning]-[WARN]: Lost task 42.0 in stage 1.0 (TID 90, 10.119.126.114): com.fasterxml.jackson.databind.JsonMappingException: No content to map due to end-of-input
 at [Source: ; line: 1, column: 1]
    at com.fasterxml.jackson.databind.JsonMappingException.from(JsonMappingException.java:148)
    at com.fasterxml.jackson.databind.ObjectMapper._initForReading(ObjectMapper.java:3110)
    at com.fasterxml.jackson.databind.ObjectMapper._readValue(ObjectMapper.java:3024)
    at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:1652)
    at play.api.libs.json.jackson.JacksonJson$.parseJsValue(JacksonJson.scala:226)
    at play.api.libs.json.Json$.parse(Json.scala:21)
    at parseJson_v14$$anonfun$$anonfun$$anonfun$apply.apply(parseJson_v14.scala:194)
    at parseJson_v14$$anonfun$$anonfun$$anonfun$apply.apply(parseJson_v14.scala:194)
    at scala.collection.Iterator$$anon.next(Iterator.scala:328)
    at scala.collection.Iterator$$anon.hasNext(Iterator.scala:389)
    at scala.collection.Iterator$$anon.hasNext(Iterator.scala:327)
    at scala.collection.Iterator$$anon.hasNext(Iterator.scala:327)
    at scala.collection.Iterator$$anon.hasNext(Iterator.scala:327)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$$anonfun$$anonfun$apply.apply$mcV$sp(PairRDDFunctions.scala:1197)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$$anonfun$$anonfun$apply.apply(PairRDDFunctions.scala:1197)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$$anonfun$$anonfun$apply.apply(PairRDDFunctions.scala:1197)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1250)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$$anonfun.apply(PairRDDFunctions.scala:1205)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$$anonfun.apply(PairRDDFunctions.scala:1185)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

Answered by Andriy Kuba

Check that distdata contains no blank lines and that each JSON object is on a single line, like this:

{"id":"121", "name":"robot 1"}
{"id":"122", "name":"robot 2"}

as opposed to:

{"id":"121", "name":
"robot 1"}
{"id":"122", "name":
"robot 2"}
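
To locate the offending records rather than only avoid them, one option is to skip blank lines and wrap each parse call in scala.util.Try, keeping the original line next to any failure. The following is a minimal sketch using plain Scala collections, not code from the question or the answer: the helper name parseLines is hypothetical, and its parse argument stands in for Json.parse.

```scala
import scala.util.{Failure, Success, Try}

// Hypothetical helper (not part of Play JSON or Spark).
// `parse` stands in for play.api.libs.json.Json.parse; any String => A works.
def parseLines[A](lines: Seq[String])(parse: String => A): (Seq[A], Seq[(String, Throwable)]) = {
  val attempts = lines
    .filter(_.trim.nonEmpty)                 // drop blank lines before parsing
    .map(line => line -> Try(parse(line)))   // keep each line next to its outcome

  // Successfully parsed values, in input order.
  val parsed = attempts.collect { case (_, Success(value)) => value }
  // Lines that failed to parse, paired with the exception that was thrown.
  val failed = attempts.collect { case (line, Failure(err)) => (line, err) }

  (parsed, failed)
}
```

With the question's data, parse would be Json.parse, and a record split across two lines, as in the second example above, should end up in the failed list together with its exception. The same filter, map, and collect chain could presumably be applied to the RDD directly, since RDD also offers collect with a partial function, but that adaptation is an assumption and is untested here.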