Source: Stack Overflow, http://stackoverflow.com/questions/38380795/. Content is licensed under CC BY-SA 4.0; if you use or share it, you must attribute it to the original authors.

pandas read_json: "If using all scalar values, you must pass an index"

python json pandas

Asked by Marco Fumagalli

I have some difficulty in importing a JSON file with pandas.

import pandas as pd
map_index_to_word = pd.read_json('people_wiki_map_index_to_word.json')

This is the error that I get:

ValueError: If using all scalar values, you must pass an index

The file structure is simplified like this:

{"biennials": 522004, "lb915": 116290, "shatzky": 127647, "woode": 174106, "damfunk": 133206, "nualart": 153444, "hatefillot": 164111, "missionborn": 261765, "yeardescribed": 161075, "theoryhe": 521685}

It is from the University of Washington's machine learning course on Coursera. You can find the file here.

Answered by ayhan

Try:

ser = pd.read_json('people_wiki_map_index_to_word.json', typ='series')

That file contains only key-value pairs whose values are scalars, so there is nothing to use as a row index for a DataFrame. Reading it with typ='series' makes the keys the index. You can then convert it to a DataFrame with ser.to_frame('count').
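
As a minimal sketch of the typ='series' approach, the snippet below reads a small in-memory sample instead of the real file. The sample string and the use of io.StringIO are assumptions made for illustration; with the actual file you would pass its path as in the line above.

```python
import io
import pandas as pd

# Assumption: a three-key sample stands in for the real JSON file.
sample = '{"biennials": 522004, "lb915": 116290, "shatzky": 127647}'

# typ='series' turns the JSON keys into the index and the values into the data.
ser = pd.read_json(io.StringIO(sample), typ='series')

# Convert the Series into a one-column DataFrame whose column is named 'count'.
df = ser.to_frame('count')
print(df)
```

Each word ends up as a row label, with its number in the 'count' column.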

You can also do something like this:

import json
with open('people_wiki_map_index_to_word.json', 'r') as f:
    data = json.load(f)

Now data is a dictionary. You can pass it to the DataFrame constructor like this:

df = pd.DataFrame({'count': data})
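
To see why the extra {'count': ...} wrapper matters, here is a small sketch that uses a three-key sample dict (an assumption standing in for the loaded file). Passing the flat dict directly reproduces the error from the question, while nesting it under a column name works.

```python
import pandas as pd

# Assumption: a small sample dict stands in for the loaded JSON data.
data = {"biennials": 522004, "lb915": 116290, "shatzky": 127647}

# A flat dict of scalars gives pandas no index to build rows from.
try:
    pd.DataFrame(data)
    raised = False
except ValueError as exc:
    raised = True
    print(exc)

# Nesting the dict under a column name makes its keys the row index.
df = pd.DataFrame({'count': data})
print(df.shape)
```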

Answered by Adonis H.

Method 1: you can do as @ayhan mentions, which gives you a column-based format.

Method 2: or you can enclose the object in [ ] (source) as shown below to get a row-based format, which is convenient if you are loading multiple values and plan to use a matrix for your machine learning models.

df = pd.DataFrame([data])
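
A short sketch of the row-based layout, again with a hypothetical three-key sample dict in place of the real data. Wrapping the dict in a list makes each key a column and produces a single row.

```python
import pandas as pd

# Assumption: a small sample dict stands in for the loaded JSON data.
data = {"biennials": 522004, "lb915": 116290, "shatzky": 127647}

# A list of dicts is read as a list of rows, so one dict yields one row.
df = pd.DataFrame([data])
print(df.shape)

# The underlying values are available as a 2-D NumPy array if a matrix is needed.
matrix = df.to_numpy()
```

Appending more dicts to the list would add more rows with the same columns.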

Answered by Anant Gupta

I think what is happening is that the data in

map_index_to_word = pd.read_json('people_wiki_map_index_to_word.json')

is being read as a string instead of as JSON. That is,

{"biennials": 522004, "lb915": 116290, "shatzky": 127647, "woode": 174106, "damfunk": 133206, "nualart": 153444, "hatefillot": 164111, "missionborn": 261765, "yeardescribed": 161075, "theoryhe": 521685}

is actually

'{"biennials": 522004, "lb915": 116290, "shatzky": 127647, "woode": 174106, "damfunk": 133206, "nualart": 153444, "hatefillot": 164111, "missionborn": 261765, "yeardescribed": 161075, "theoryhe": 521685}'

Since a string is a scalar, pandas cannot build a DataFrame from it directly. You have to convert it to a dict first, which is exactly what the other answer does.

The best way is to call json.loads on the string to convert it to a dict, then load that into pandas:

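import json
with open('people_wiki_map_index_to_word.json', 'r') as f:
    myfile = f.read()
jsonData = json.loads(myfile)  # parse the JSON text into a dict
# A dict of scalars still needs an index, so nest it under a column name.
df = pd.DataFrame({'count': jsonData})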
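
The error message itself hints at one more option: passing an index explicitly. The sketch below assumes a short sample string in place of the file contents, and the names text and json_data are made up for illustration. With index=[0] the result is a single row with one column per key.

```python
import json
import pandas as pd

# Assumption: a short JSON string stands in for the file contents.
text = '{"biennials": 522004, "lb915": 116290, "shatzky": 127647}'

# json.loads converts the JSON text into a plain dict.
json_data = json.loads(text)

# index=[0] supplies the index that the ValueError asks for.
df = pd.DataFrame(json_data, index=[0])
print(df.shape)
```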