pandas read_json: "If using all scalar values, you must pass an index"
Notice: this page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me): StackOverflow
Original: http://stackoverflow.com/questions/38380795/
Asked by Marco Fumagalli
I am having difficulty importing a JSON file with pandas.
import pandas as pd
map_index_to_word = pd.read_json('people_wiki_map_index_to_word.json')
This is the error that I get:
ValueError: If using all scalar values, you must pass an index
A simplified version of the file's structure looks like this:
{"biennials": 522004, "lb915": 116290, "shatzky": 127647, "woode": 174106, "damfunk": 133206, "nualart": 153444, "hatefillot": 164111, "missionborn": 261765, "yeardescribed": 161075, "theoryhe": 521685}
It is from the University of Washington machine learning course on Coursera. You can find the file here.
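The error does not depend on the file itself. As a minimal sketch, assuming a shortened copy of the mapping above, the same message can be triggered by handing a dict of plain numbers straight to the DataFrame constructor. The names sample and message are illustrative only.

```python
import pandas as pd

# Shortened, illustrative copy of the mapping shown above
sample = {"biennials": 522004, "lb915": 116290, "shatzky": 127647}

try:
    # Every value is a scalar and no index is given, so pandas
    # cannot tell how many rows the frame should have
    pd.DataFrame(sample)
    message = None
except ValueError as err:
    message = str(err)

print(message)
```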
Answered by ayhan
Try:
ser = pd.read_json('people_wiki_map_index_to_word.json', typ='series')
That file only contains key-value pairs whose values are scalars. You can convert the resulting Series to a DataFrame with ser.to_frame('count').
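A small sketch of the Series route, assuming a shortened in-memory copy of the file so that it runs without the download. The text variable and the StringIO wrapper are stand-ins for the real file path.

```python
from io import StringIO

import pandas as pd

# Shortened, illustrative stand-in for the contents of the file
text = '{"biennials": 522004, "lb915": 116290, "shatzky": 127647}'

# typ='series' makes the keys the index and the scalar values the data
ser = pd.read_json(StringIO(text), typ='series')

# Turn the Series into a one-column DataFrame named 'count'
df = ser.to_frame('count')
print(df)
```

The words end up as the index, so a single count can be looked up with df.loc['lb915', 'count'].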
You can also do something like this:
import json
with open('people_wiki_map_index_to_word.json', 'r') as f:
    data = json.load(f)
Now data is a dictionary. You can pass it to a dataframe constructor like this:
df = pd.DataFrame({'count': data})
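To see what the nested-dict form produces, here is a sketch that assumes a shortened JSON string in place of the file. The text variable is hypothetical, and json.loads stands in for json.load on an open file.

```python
import json

import pandas as pd

# Shortened, illustrative stand-in for the text stored in the file
text = '{"biennials": 522004, "lb915": 116290, "shatzky": 127647}'
data = json.loads(text)  # a plain Python dict of word -> number

# Nesting the dict under a column name: the outer key becomes the
# column and the inner keys become the index
df = pd.DataFrame({'count': data})
print(df)
```

Because the inner dict supplies the index, pandas no longer needs one passed explicitly.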
Answered by Adonis H.
You can do as @ayhan mentions, which gives you a column-based format.
Or you can enclose the object in [ ] (source), as shown below, to get a row format. This is convenient if you are loading multiple values and planning to use a matrix for your machine learning models.
df = pd.DataFrame([data])
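The difference between the two layouts can be sketched side by side, assuming a shortened copy of the dictionary. The names by_row and by_column are illustrative.

```python
import pandas as pd

# Shortened, illustrative copy of the dictionary
data = {"biennials": 522004, "lb915": 116290, "shatzky": 127647}

# List of one dict: a single row, with one column per key
by_row = pd.DataFrame([data])

# Dict nested under a column name: a single column, with one row per key
by_column = pd.DataFrame({'count': data})

print(by_row.shape, by_column.shape)
```

Appending further dicts to the list gives one row per record, which is the shape most matrix-based models expect.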
Answered by Anant Gupta
I think what is happening is that the data in
map_index_to_word = pd.read_json('people_wiki_map_index_to_word.json')
is being read as a string instead of as JSON.
{"biennials": 522004, "lb915": 116290, "shatzky": 127647, "woode": 174106, "damfunk": 133206, "nualart": 153444, "hatefillot": 164111, "missionborn": 261765, "yeardescribed": 161075, "theoryhe": 521685}
is actually
'{"biennials": 522004, "lb915": 116290, "shatzky": 127647, "woode": 174106, "damfunk": 133206, "nualart": 153444, "hatefillot": 164111, "missionborn": 261765, "yeardescribed": 161075, "theoryhe": 521685}'
Since a string is a scalar, pandas wants you to load it as JSON first: you have to convert it to a dict, which is exactly what the other answer is doing.
The best way is to call json.loads on the string to convert it to a dict, and then load that into pandas:
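import json
import pandas as pd
with open('people_wiki_map_index_to_word.json', 'r') as f:
    myfile = f.read()
jsonData = json.loads(myfile)
df = pd.DataFrame([jsonData])  # wrap the dict in a list; a bare dict of scalars raises the same ValueError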
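As a rough illustration of the string-versus-dict distinction, the sketch below assumes a shortened JSON string in place of the file contents. It also shows the literal remedy the error message asks for, an explicit index. The names raw, parsed and df_one_row are hypothetical.

```python
import json

import pandas as pd

# JSON text as it would come back from f.read(): a Python str
raw = '{"biennials": 522004, "lb915": 116290}'

# json.loads turns the text into a Python dict
parsed = json.loads(raw)
print(type(raw).__name__, type(parsed).__name__)

# The dict still holds only scalars, so give pandas an index explicitly;
# index=[0] yields a single row labelled 0
df_one_row = pd.DataFrame(parsed, index=[0])
print(df_one_row)
```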