Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/46391291/
Date: 2020-09-14 04:31:36  Source: igfitidea

How to convert JSON data inside a pandas column into new columns

Tags: python, json, pandas

Asked by user8149657

I have this short version of ADS-B JSON data and would like to convert it into DataFrame columns such as Icao, Alt, Lat, Long, Spd, Cou, and so on.

After Alperen told me to do this:

df = pd.read_json('2016-06-20-2359Z.json', lines=True)

I can load it into a DataFrame. However, df.acList is:

[{'Id': 10537990, 'Rcvr': 1, 'HasSig': False, ...
Name: acList, dtype: object

How can I get the Icao, Alt, Lat, Long, Spd, Cou data?

{
   "src":1,
   "feeds":[  
      {  
         "id":1,
         "name":"ADSBexchange.com",
         "polarPlot":false
      }
   ],
   "srcFeed":1,
   "showSil":true,
   "showFlg":true,
   "showPic":true,
   "flgH":20,
   "flgW":85,
   "acList":[  
      {  
         "Id":11281748,
         "Rcvr":1,
         "HasSig":false,
         "Icao":"AC2554",
         "Bad":false,
         "Reg":"N882AS",
         "FSeen":"\/Date(1466467166951)\/",
         "TSecs":3,
         "CMsgs":1,
         "AltT":0,
         "Tisb":false,
         "TrkH":false,
         "Type":"CRJ2",
         "Mdl":"2001 \nBOMBARDIER INC \n CL-600-2B19",
         "Man":"Bombardier",
         "CNum":"7503",
         "Op":"EXPRESSJET AIRLINES INC - ATLANTA, GA",
         "OpIcao":"ASQ",
         "Sqk":"",
         "VsiT":0,
         "WTC":2,
         "Species":1,
         "Engines":"2",
         "EngType":3,
         "EngMount":1,
         "Mil":false,
         "Cou":"United States",
         "HasPic":false,
         "Interested":false,
         "FlightsCount":0,
         "Gnd":false,
         "SpdTyp":0,
         "CallSus":false,
         "TT":"a",
         "Trt":1,
         "Year":"2001"
      },
      {  
         "Id":11402205,
         "Rcvr":1,
         "HasSig":true,
         "Sig":110,
         "Icao":"ADFBDD",
         "Bad":false,
         "FSeen":"\/Date(1466391940977)\/",
         "TSecs":75229,
         "CMsgs":35445,
         "Alt":8025,
         "GAlt":8025,
         "AltT":0,
         "Call":"TEST1234",
         "Tisb":false,
         "TrkH":false,
         "Sqk":"0262",
         "Help":false,
         "VsiT":0,
         "WTC":0,
         "Species":0,
         "EngType":0,
         "EngMount":0,
         "Mil":true,
         "Cou":"United States",
         "HasPic":false,
         "Interested":false,
         "FlightsCount":0,
         "Gnd":true,
         "SpdTyp":0,
         "CallSus":false,
         "TT":"a",
         "Trt":1
      }
   ],
   "totalAc":4231,
   "lastDv":"636019887431643594",
   "shtTrlSec":61,
   "stm":1466467170029
}

Answered by Sergey Bushmanov

If you already have your data in the acList column of a pandas DataFrame, simply do:

import pandas as pd
# pandas >= 1.0 exposes this as pd.json_normalize; older versions use pd.io.json.json_normalize
pd.json_normalize(df.acList[0])

      Alt  AltT    Bad  CMsgs  CNum      Call  CallSus            Cou  EngMount  EngType  ...   Sqk  TSecs  TT   Tisb   TrkH  Trt  Type  VsiT  WTC  Year
0     NaN     0  False      1  7503       NaN    False  United States         1        3  ...            3   a  False  False    1  CRJ2     0    2  2001
1  8025.0     0  False  35445   NaN  TEST1234    False  United States         0        0  ...  0262  75229   a  False  False    1   NaN     0    0   NaN
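
The call above flattens only the list in row 0. If the file has several lines, something like the following might be closer to what the question asks for. This is a minimal sketch, not the answerer's code: the small DataFrame is a hypothetical stand-in for the loaded file (values borrowed from the question), and the names records, flat, wanted and result are illustrative.

```python
import pandas as pd

# Hypothetical stand-in for the loaded file: one row per JSON line,
# with acList holding a list of aircraft dicts (values from the question).
df = pd.DataFrame({
    "acList": [[
        {"Id": 11281748, "Icao": "AC2554", "Cou": "United States"},
        {"Id": 11402205, "Icao": "ADFBDD", "Alt": 8025, "Cou": "United States"},
    ]]
})

# Collect the dicts from every row's list, not just row 0.
records = [aircraft for ac_list in df["acList"] for aircraft in ac_list]
flat = pd.json_normalize(records)

# reindex keeps the requested column order and fills keys that never
# appeared in the data (here Lat, Long, Spd) with NaN instead of raising.
wanted = ["Icao", "Alt", "Lat", "Long", "Spd", "Cou"]
result = flat.reindex(columns=wanted)
print(result)
```

Using reindex rather than flat[wanted] is a deliberate choice here: not every aircraft record carries every key, so a plain column selection could raise a KeyError.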

Answered by ThinkBonobo

@Sergey's answer solved the issue for me, but I was running into problems because the JSON in my DataFrame column was stored as a string and not as an object. I had to add the extra step of mapping the column:

import json
import pandas as pd
# Parse each JSON string into a Python object before normalizing.
pd.json_normalize(df.acList.apply(json.loads).tolist())
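
To make the string case concrete, here is a small self-contained sketch. The DataFrame, its src column and the variable names are assumptions made up for illustration; it also shows one possible way to attach the new columns back to the original frame, which assumes the frame still has its default 0..n-1 index.

```python
import json
import pandas as pd

# Hypothetical frame whose acList column holds JSON text, not parsed objects.
df = pd.DataFrame({
    "src": [1, 1],
    "acList": [
        '{"Icao": "AC2554", "Cou": "United States"}',
        '{"Icao": "ADFBDD", "Alt": 8025, "Cou": "United States"}',
    ],
})

# Each value is a plain str at this point, so it must be parsed first.
parsed = df["acList"].apply(json.loads)

# One dict per row becomes one row of new columns.
flat = pd.json_normalize(parsed.tolist())

# Join on the index; this relies on df having the default RangeIndex,
# which is what json_normalize produces for flat.
combined = df.drop(columns="acList").join(flat)
print(combined)
```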

Answered by blockw

I can't comment yet on ThinkBonobo's answer, but in case the JSON in the column isn't exactly a dictionary, you can keep chaining .apply until it is. So in my case:

import json
import pandas as pd

# Parse each string, then unwrap the single-item list before normalizing.
pd.json_normalize(
    df
    .theColumnWithJson
    .apply(json.loads)
    .apply(lambda x: x[0])  # the inner JSON is a list with the dictionary as its only item
    .tolist()
)
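
If it is unclear whether every cell really wraps exactly one dictionary, the unwrapping step could be made a little more defensive. The sketch below is only one way to do it: first_dict is a hypothetical helper, not a pandas or json function, and the sample data is invented for illustration.

```python
import json
import pandas as pd

# Hypothetical column where each cell is JSON text for a one-item list.
df = pd.DataFrame({
    "theColumnWithJson": [
        '[{"Icao": "AC2554", "Cou": "United States"}]',
        '[{"Icao": "ADFBDD", "Alt": 8025}]',
    ]
})


def first_dict(text):
    """Parse JSON text and unwrap a single-item list (hypothetical helper)."""
    value = json.loads(text)
    if isinstance(value, list):
        # Fail loudly instead of silently dropping extra items.
        if len(value) != 1:
            raise ValueError(f"expected exactly one item, got {len(value)}")
        value = value[0]
    return value


flat = pd.json_normalize(df["theColumnWithJson"].apply(first_dict).tolist())
print(flat)
```

Raising on lists with zero or several items is a design choice; x[0] alone would discard extra entries without warning.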