Java: Storing null values in Avro files

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/45662469/

Time: 2020-11-03 08:49:57  Source: igfitidea

Storing null values in avro files

Tags: java, avro, avro-tools

Asked by mba12

I have some json data that looks like this:

  {
    "id": 1998983092,
    "name": "Test Name 1",
    "type": "search string",
    "creationDate": "2017-06-06T13:49:15.091+0000",
    "lastModificationDate": "2017-06-28T14:53:19.698+0000",
    "lastModifiedUsername": "[email protected]",
    "lockedQuery": false,
    "lockedByUsername": null
  }

I am able to add the lockedQuery null value to a GenericRecord object without issue.

GenericRecord record = new GenericData.Record(schema);
if(json.isNull("lockedQuery")){
    record.put("lockedQuery", null);
} 

However, later when I attempt to write that GenericRecord object to an avro file I get a null pointer exception.

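// "~" is not expanded by Java, so resolve the home directory explicitly.
File file = new File(System.getProperty("user.home"), "test.avro");
DatumWriter<GenericRecord> datumWriter = new GenericDatumWriter<>(schema);
// try-with-resources closes the writer so buffered records are flushed.
try (DataFileWriter<GenericRecord> dataFileWriter = new DataFileWriter<>(datumWriter)) {
    dataFileWriter.create(schema, file);
    for (GenericRecord record : masterList) {
        dataFileWriter.append(record); // NULL POINTER HERE
    }
}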

When I run that code I get the following exception. Any tips on how to write a null value to an Avro file would be much appreciated. Thanks in advance.

java.lang.NullPointerException: null of boolean in field lockedQuery of 
com.mydomain.test1.domain.MyAvroRecord
Exception in thread "main" java.lang.RuntimeException: 
org.apache.avro.file.DataFileWriter$AppendWriteException: 
java.lang.NullPointerException: null of boolean in field lockedQuery of 
com.mydomain.test1.domain.MyAvroRecord
at com.mydomain.avro.App.main(App.java:198)
Caused by: org.apache.avro.file.DataFileWriter$AppendWriteException: 
java.lang.NullPointerException: null of boolean in field lockedQuery of 
com.mydomain.test1.domain.MyAvroRecord
at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:308)

EDIT: here is the MyAvroRecord class:

public class MyAvroRecord {
    long id;
    String name;
    String type;
    Date timestamp;
    Date lastModifcationDate;
    String lastModifiedUsername;
    Boolean lockedQuery;
}

Answered by Vladimir Kroz

To be able to set an Avro field to null, you must allow this in the Avro schema by adding null as one of the field's possible types. Take a look at this example from the Avro documentation:

{
  "type": "record",
  "name": "MyRecord",
  "fields" : [
    {"name": "userId", "type": "long"},              // mandatory field
    {"name": "userName", "type": ["null", "string"]} // optional field 
  ]
}

Here userName is declared as a composite (union) type that can be either null or string. This definition allows the userName field to be set to null. In contrast, userId can only contain long values, so an attempt to set userId to null results in a NullPointerException when the record is written.
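
As a rough illustration of the schema above, the following sketch writes a record whose userName is null and reads it back. It assumes the Avro library is on the classpath; the class name NullableFieldDemo and the in-memory streams are illustrative choices, not part of the original question.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.file.DataFileStream;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

// Hypothetical demo class; only the schema comes from the answer above.
public class NullableFieldDemo {
    static final String SCHEMA_JSON =
        "{\"type\":\"record\",\"name\":\"MyRecord\",\"fields\":["
        + "{\"name\":\"userId\",\"type\":\"long\"},"
        + "{\"name\":\"userName\",\"type\":[\"null\",\"string\"]}]}";

    // Writes one record with a null userName, then reads the value back.
    static Object roundTripNullUserName() throws IOException {
        Schema schema = new Schema.Parser().parse(SCHEMA_JSON);
        GenericRecord record = new GenericData.Record(schema);
        record.put("userId", 42L);
        record.put("userName", null); // allowed: the union includes "null"

        // Write to memory instead of a file to keep the sketch self-contained.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DataFileWriter<GenericRecord> writer =
                 new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(schema))) {
            writer.create(schema, out);
            writer.append(record);
        }

        try (DataFileStream<GenericRecord> reader = new DataFileStream<>(
                 new ByteArrayInputStream(out.toByteArray()),
                 new GenericDatumReader<GenericRecord>())) {
            return reader.next().get("userName");
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("userName read back: " + roundTripNullUserName());
    }
}
```

Putting null into userId instead would be expected to fail at append time, much like the lockedQuery field in the question.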

Answered by soymsk

I had this issue too and have now resolved it.

I found the @Nullable annotation in Apache Avro, which declares that a field is nullable.

So, in this example, we should write:

import java.util.Date;

import org.apache.avro.reflect.Nullable;

public class MyAvroRecord {
    long id;
    String name;
    String type;
    Date timestamp;
    Date lastModifcationDate;
    String lastModifiedUsername;
    @Nullable
    Boolean lockedQuery;
}
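
To check what the annotation does, one option is to derive the schema by reflection and inspect the field. The sketch below assumes the schema is obtained through ReflectData (the question does not show how its schema was created) and uses a hypothetical, trimmed-down Query class in place of MyAvroRecord.

```java
import org.apache.avro.Schema;
import org.apache.avro.reflect.Nullable;
import org.apache.avro.reflect.ReflectData;

// Hypothetical demo class; requires the Avro library on the classpath.
public class NullableSchemaDemo {
    // Trimmed-down stand-in for MyAvroRecord.
    static class Query {
        long id;
        String name;
        @Nullable
        Boolean lockedQuery;
    }

    // Returns the schema that reflection derives for the annotated field.
    static Schema lockedQuerySchema() {
        Schema schema = ReflectData.get().getSchema(Query.class);
        return schema.getField("lockedQuery").schema();
    }

    public static void main(String[] args) {
        // With @Nullable the field schema should be a union that includes null.
        System.out.println(lockedQuerySchema());
    }
}
```

If the annotation is removed, the field schema should come out as a plain boolean, which matches the "null of boolean" message in the question's stack trace.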