Getting imported JSON data into a data frame

Question

Asked by Andrew Staroscik

I have a file containing over 1500 json objects that I want to work with in R. I've been able to import the data as a list, but am having trouble coercing it into a useful structure. I want to create a data frame containing a row for each json object and a column for each key:value pair.

I've recreated my situation with this small, fake data set:

[{"name":"Doe, John","group":"Red","age (y)":24,"height (cm)":182,"wieght (kg)":74.8,"score":null},
{"name":"Doe, Jane","group":"Green","age (y)":30,"height (cm)":170,"wieght (kg)":70.1,"score":500},
{"name":"Smith, Joan","group":"Yellow","age (y)":41,"height (cm)":169,"wieght (kg)":60,"score":null},
{"name":"Brown, Sam","group":"Green","age (y)":22,"height (cm)":183,"wieght (kg)":75,"score":865},
{"name":"Jones, Larry","group":"Green","age (y)":31,"height (cm)":178,"wieght (kg)":83.9,"score":221},
{"name":"Murray, Seth","group":"Red","age (y)":35,"height (cm)":172,"wieght (kg)":76.2,"score":413},
{"name":"Doe, Jane","group":"Yellow","age (y)":22,"height (cm)":164,"wieght (kg)":68,"score":902}]

Some features of the data:

The objects all contain the same number of key:value pairs, although some of the values are null.
There are two non-numeric columns per object (name and group).
name is the unique identifier; there are 10 or so groups.
Many of the name and group entries contain spaces, commas and other punctuation.

Based on this question: R list(structure(list())) to data frame, I tried the following:

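library(RJSONIO)  # assumed source of fromJSON; the question does not name the package
library(plyr)     # provides rbind.fill

json_file <- "test.json"
json_data <- fromJSON(json_file)
asFrame <- do.call("rbind.fill", lapply(json_data, as.data.frame))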

With both my real data and this fake data, the last line gives me this error:

Error in data.frame(name = "Doe, John", group = "Red", `age (y)` = 24,  : 
  arguments imply differing number of rows: 1, 0
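
A likely reading of this error: a JSON null is imported as NULL, which has length 0, while every other field has length 1, so data.frame() is handed columns of differing lengths (the "1, 0" in the message). The base R sketch below uses a hand-built record, an assumption standing in for one parsed object with simplified field names, to show those lengths.

```r
# Hypothetical record mimicking one parsed JSON object with a null score.
rec <- list(name = "Doe, John", age = 24, score = NULL)

# list() keeps the NULL entry, but its length is 0 while the others are 1.
field_lengths <- lengths(rec)
print(field_lengths)
```

Any fix therefore has to give the null fields a length-1 placeholder such as NA before the rows are combined.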

Answer 1

Answered by SchaunW

You just need to replace your NULLs with NAs:

require(RJSONIO)    

json_file <-  '[{"name":"Doe, John","group":"Red","age (y)":24,"height (cm)":182,"wieght (kg)":74.8,"score":null},
    {"name":"Doe, Jane","group":"Green","age (y)":30,"height (cm)":170,"wieght (kg)":70.1,"score":500},
    {"name":"Smith, Joan","group":"Yellow","age (y)":41,"height (cm)":169,"wieght (kg)":60,"score":null},
    {"name":"Brown, Sam","group":"Green","age (y)":22,"height (cm)":183,"wieght (kg)":75,"score":865},
    {"name":"Jones, Larry","group":"Green","age (y)":31,"height (cm)":178,"wieght (kg)":83.9,"score":221},
    {"name":"Murray, Seth","group":"Red","age (y)":35,"height (cm)":172,"wieght (kg)":76.2,"score":413},
    {"name":"Doe, Jane","group":"Yellow","age (y)":22,"height (cm)":164,"wieght (kg)":68,"score":902}]'


json_file <- fromJSON(json_file)

json_file <- lapply(json_file, function(x) {
  x[sapply(x, is.null)] <- NA
  unlist(x)
})

Once you have a non-null value for each element, you can call rbind without getting an error. Note that the result is a character matrix rather than a data frame:

do.call("rbind", json_file)
     name           group    age (y) height (cm) wieght (kg) score
[1,] "Doe, John"    "Red"    "24"    "182"       "74.8"      NA   
[2,] "Doe, Jane"    "Green"  "30"    "170"       "70.1"      "500"
[3,] "Smith, Joan"  "Yellow" "41"    "169"       "60"        NA   
[4,] "Brown, Sam"   "Green"  "22"    "183"       "75"        "865"
[5,] "Jones, Larry" "Green"  "31"    "178"       "83.9"      "221"
[6,] "Murray, Seth" "Red"    "35"    "172"       "76.2"      "413"
[7,] "Doe, Jane"    "Yellow" "22"    "164"       "68"        "902"
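
Because unlist() and rbind() coerce everything to character, the numeric fields come back as strings. One alternative, sketched here in base R under the assumption that every record has the same fields, is to build each column separately so the types survive. The `records` list and the `field_column` helper are hypothetical stand-ins for the parsed JSON, with simplified field names.

```r
# Hypothetical parsed records; score is NULL where the JSON had null.
records <- list(
  list(name = "Doe, John", age = 24, score = NULL),
  list(name = "Doe, Jane", age = 30, score = 500)
)

# Collect one field across all records, substituting NA for NULL.
field_column <- function(recs, field) {
  unlist(lapply(recs, function(r) if (is.null(r[[field]])) NA else r[[field]]))
}

fields <- names(records[[1]])
cols <- lapply(fields, function(f) field_column(records, f))
names(cols) <- fields

# Each column keeps its own type: character for name, numeric for the rest.
df <- as.data.frame(cols, stringsAsFactors = FALSE)
str(df)
```

With field names that contain spaces or parentheses, as in the question's data, passing check.names = FALSE to as.data.frame should keep them from being altered.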

Answer 2

Answered by SymbolixAU

This is very simple if you use either library(jsonlite) or library(jsonify).

Both of these handle the null values and convert them to NA, and they preserve the data types.

Data

json_file <-  '[{"name":"Doe, John","group":"Red","age (y)":24,"height (cm)":182,"wieght (kg)":74.8,"score":null},
{"name":"Doe, Jane","group":"Green","age (y)":30,"height (cm)":170,"wieght (kg)":70.1,"score":500},
{"name":"Smith, Joan","group":"Yellow","age (y)":41,"height (cm)":169,"wieght (kg)":60,"score":null},
{"name":"Brown, Sam","group":"Green","age (y)":22,"height (cm)":183,"wieght (kg)":75,"score":865},
{"name":"Jones, Larry","group":"Green","age (y)":31,"height (cm)":178,"wieght (kg)":83.9,"score":221},
{"name":"Murray, Seth","group":"Red","age (y)":35,"height (cm)":172,"wieght (kg)":76.2,"score":413},
{"name":"Doe, Jane","group":"Yellow","age (y)":22,"height (cm)":164,"wieght (kg)":68,"score":902}]'

jsonlite

library(jsonlite)
jsonlite::fromJSON( json_file )
#           name  group age (y) height (cm) wieght (kg) score
# 1    Doe, John    Red      24         182        74.8    NA
# 2    Doe, Jane  Green      30         170        70.1   500
# 3  Smith, Joan Yellow      41         169        60.0    NA
# 4   Brown, Sam  Green      22         183        75.0   865
# 5 Jones, Larry  Green      31         178        83.9   221
# 6 Murray, Seth    Red      35         172        76.2   413
# 7    Doe, Jane Yellow      22         164        68.0   902

str( jsonlite::fromJSON( json_file ) )
# 'data.frame': 7 obs. of  6 variables:
# $ name       : chr  "Doe, John" "Doe, Jane" "Smith, Joan" "Brown, Sam" ...
# $ group      : chr  "Red" "Green" "Yellow" "Green" ...
# $ age (y)    : int  24 30 41 22 31 35 22
# $ height (cm): int  182 170 169 183 178 172 164
# $ wieght (kg): num  74.8 70.1 60 75 83.9 76.2 68
# $ score      : int  NA 500 NA 865 221 413 902

jsonify

library(jsonify)
jsonify::from_json( json_file )
#           name  group age (y) height (cm) wieght (kg) score
# 1    Doe, John    Red      24         182        74.8    NA
# 2    Doe, Jane  Green      30         170        70.1   500
# 3  Smith, Joan Yellow      41         169        60.0    NA
# 4   Brown, Sam  Green      22         183        75.0   865
# 5 Jones, Larry  Green      31         178        83.9   221
# 6 Murray, Seth    Red      35         172        76.2   413
# 7    Doe, Jane Yellow      22         164        68.0   902


str( jsonify::from_json( json_file ) )
# 'data.frame': 7 obs. of  6 variables:
# $ name       : chr  "Doe, John" "Doe, Jane" "Smith, Joan" "Brown, Sam" ...
# $ group      : chr  "Red" "Green" "Yellow" "Green" ...
# $ age (y)    : int  24 30 41 22 31 35 22
# $ height (cm): int  182 170 169 183 178 172 164
# $ wieght (kg): num  74.8 70.1 60 75 83.9 76.2 68
# $ score      : int  NA 500 NA 865 221 413 902
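
The examples above parse a JSON string, while the question starts from a file. jsonlite::fromJSON also accepts a file path, so a sketch along these lines should work. The temporary file is an assumption standing in for test.json, and only two records with a subset of the fields are used.

```r
library(jsonlite)

# Write a two-record sample to a temporary file (stand-in for test.json).
path <- tempfile(fileext = ".json")
writeLines('[{"name":"Doe, John","age (y)":24,"score":null},
 {"name":"Doe, Jane","age (y)":30,"score":500}]', path)

df <- fromJSON(path)

# Column names containing spaces or parentheses need [[ ]] or backticks.
ages <- df[["age (y)"]]
print(ages)
print(is.na(df$score))
```

Since the column names keep their spaces and parentheses, `df$age` will not find the column; use `df[["age (y)"]]` or ``df$`age (y)` `` instead.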

Answer 3

Answered by lenoch

To convert null values to NA on import, use the nullValue parameter of RJSONIO's fromJSON:

json_data <- fromJSON(json_file, nullValue = NA)
asFrame <- do.call("rbind.fill", lapply(json_data, as.data.frame))

This way there won't be any unnecessary quotes in your output.

Answer 4

Answered by Ahsan Habib

library(rjson)
Lines <- readLines("yelp_academic_dataset_business.json") 
business <- as.data.frame(t(sapply(Lines, fromJSON)))

You may try this to load JSON data into R. It assumes the file holds one complete JSON object per line.
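
For files with one JSON object per line, which is what the readLines approach implies, jsonlite's stream_in is another option that returns a data frame directly. This swaps in jsonlite for rjson; the temporary file and its fields are made up for illustration.

```r
library(jsonlite)

# Write two line-delimited JSON records to a temporary file (made-up data).
path <- tempfile(fileext = ".json")
writeLines(c('{"name":"Doe, John","score":null}',
             '{"name":"Doe, Jane","score":500}'), path)

# stream_in reads the connection line by line and simplifies to a data frame.
business <- stream_in(file(path), verbose = FALSE)
str(business)
```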

Answer 5

Answered by YH Wu

dplyr::bind_rows(fromJSON(file_name))
