Source: Stack Overflow, original question at http://stackoverflow.com/questions/34341974/. This content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors and keep the same license.

Nested Json to pandas DataFrame with specific format

Tags: python, json, pandas, nested, format

Asked by figgy

I need to load the contents of a JSON file into a pandas DataFrame in a specific layout, so that I can run pandasql to transform the data and then pass it through a scoring model.

file = C:\scoring_model\json.js (contents of 'file' are below)

{
  "response": {
    "version": "1.1",
    "token": "dsfgf",
    "body": {
      "customer": {
        "customer_id": "1234567",
        "verified": "true"
      },
      "contact": {
        "email": "[email protected]",
        "mobile_number": "0123456789"
      },
      "personal": {
        "gender": "m",
        "title": "Dr.",
        "last_name": "Muster",
        "first_name": "Max",
        "family_status": "single",
        "dob": "1985-12-23"
      }
    }
  }
}

I need the DataFrame to look like this (all values belong on a single row; the table is split in two here only to make it fit):

version | token | customer_id | verified | email             | mobile_number | gender |
1.1     | dsfgf | 1234567     | true     | [email protected] | 0123456789    | m      |

title | last_name | first_name | family_status | dob
Dr.   | Muster    | Max        | single        | 23.12.1985

I have looked at the other questions on this topic and have tried several ways of loading the JSON file into pandas:

with open(r'C:\scoring_model\json.js', 'r') as f:
    c = pd.read_json(f.read())

with open(r'C:\scoring_model\json.js', 'r') as f:
    c = f.readlines()

I also tried pd.Panel() as suggested in this solution: "Python Pandas: How to split a sorted dictionary in a column of a dataframe".

With the DataFrame built from [yo = f.readlines()], I thought about splitting the contents of each cell on ("") and finding a way to put the pieces into separate columns, but I have had no luck so far. Your expertise is greatly appreciated. Thank you in advance.

Accepted answer by Andy Hayden

If you load the entire JSON as a dict (or list), e.g. using json.load, you can use json_normalize:

In [11]: d = {"response": {"body": {"contact": {"email": "[email protected]", "mobile_number": "0123456789"}, "personal": {"last_name": "Muster", "gender": "m", "first_name": "Max", "dob": "1985-12-23", "family_status": "single", "title": "Dr."}, "customer": {"verified": "true", "customer_id": "1234567"}}, "token": "dsfgf", "version": "1.1"}}

In [12]: df = pd.io.json.json_normalize(d)

In [13]: df.columns = df.columns.map(lambda x: x.split(".")[-1])

In [14]: df
Out[14]:
        email mobile_number customer_id verified         dob family_status first_name gender last_name title  token version
0  [email protected]    0123456789     1234567     true  1985-12-23        single        Max      m    Muster   Dr.  dsfgf     1.1
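
To get from the raw JSON text to the exact layout asked for in the question (column order and dob as 23.12.1985), the same idea can be sketched end to end as below. This is only a minimal sketch, assuming pandas 1.0 or later, where json_normalize is available as pd.json_normalize. The e-mail value max@example.com is a made-up placeholder because the address in the question is obscured, and the names raw and wanted are illustrative, not part of the original answer.

```python
import json

import pandas as pd

# Sample payload mirroring the question's file.
# The e-mail address is a hypothetical placeholder.
raw = """
{
  "response": {
    "version": "1.1",
    "token": "dsfgf",
    "body": {
      "customer": {"customer_id": "1234567", "verified": "true"},
      "contact": {"email": "max@example.com", "mobile_number": "0123456789"},
      "personal": {
        "gender": "m",
        "title": "Dr.",
        "last_name": "Muster",
        "first_name": "Max",
        "family_status": "single",
        "dob": "1985-12-23"
      }
    }
  }
}
"""

# For a file on disk, use json.load(f) inside a `with open(...)` block instead.
d = json.loads(raw)

# Flatten the nested dict; columns come out as dotted paths
# such as "response.body.customer.customer_id".
df = pd.json_normalize(d)

# Keep only the last path component. This is safe here because
# all leaf names in this payload are unique.
df.columns = [col.split(".")[-1] for col in df.columns]

# Reformat the date of birth from YYYY-MM-DD to DD.MM.YYYY.
df["dob"] = pd.to_datetime(df["dob"]).dt.strftime("%d.%m.%Y")

# Column order requested in the question (illustrative list).
wanted = [
    "version", "token", "customer_id", "verified", "email",
    "mobile_number", "gender", "title", "last_name", "first_name",
    "family_status", "dob",
]
df = df[wanted]

print(df.to_string(index=False))
```

If two nested objects shared a leaf name (for example an "id" under both customer and contact), stripping the prefix would produce duplicate column names, so in that case it may be better to keep the dotted names or rename the columns explicitly.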