Python TypeError: Object of type 'bytes' is not JSON serializable

Question

Asked by Zhibin

I just started programming in Python. I want to use Scrapy to create a bot, and when I run the project it shows TypeError: Object of type 'bytes' is not JSON serializable.

import json
import codecs

class W3SchoolPipeline(object):

  def __init__(self):
      self.file = codecs.open('w3school_data_utf8.json', 'wb', encoding='utf-8')

  def process_item(self, item, spider):
      line = json.dumps(dict(item)) + '\n'
      # print line

      self.file.write(line.decode("unicode_escape"))
      return item

from scrapy.spiders import Spider
from scrapy.selector import Selector
from w3school.items import W3schoolItem

class W3schoolSpider(Spider):

    name = "w3school"
    allowed_domains = ["w3school.com.cn"]

    start_urls = [
        "http://www.w3school.com.cn/xml/xml_syntax.asp"
    ]

    def parse(self, response):
        sel = Selector(response)
        sites = sel.xpath('//div[@id="navsecond"]/div[@id="course"]/ul[1]/li')

        items = []
        for site in sites:
            item = W3schoolItem()
            title = site.xpath('a/text()').extract()
            link = site.xpath('a/@href').extract()
            desc = site.xpath('a/@title').extract()

            item['title'] = [t.encode('utf-8') for t in title]
            item['link'] = [l.encode('utf-8') for l in link]
            item['desc'] = [d.encode('utf-8') for d in desc]
            items.append(item)
        return items

Traceback:

TypeError: Object of type 'bytes' is not JSON serializable
2017-06-23 01:41:15 [scrapy.core.scraper] ERROR: Error processing {'desc': [b'\xe4\xbd\xbf\xe7\x94\xa8 XSLT \xe6\x98\xbe\xe7\xa4\xba XML'],
 'link': [b'/xml/xml_xsl.asp'],
 'title': [b'XML XSLT']}

Traceback (most recent call last):
  File "c:\users\administrator\appdata\local\programs\python\python36\lib\site-packages\twisted\internet\defer.py", line 653, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "D:\LZZZZB\w3school\w3school\pipelines.py", line 19, in process_item
    line = json.dumps(dict(item)) + '\n'
  File "c:\users\administrator\appdata\local\programs\python\python36\lib\json\__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "c:\users\administrator\appdata\local\programs\python\python36\lib\json\encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "c:\users\administrator\appdata\local\programs\python\python36\lib\json\encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "c:\users\administrator\appdata\local\programs\python\python36\lib\json\encoder.py", line 180, in default
    o.__class__.__name__)
TypeError: Object of type 'bytes' is not JSON serializable

Answer 1

Answered by Martijn Pieters

You are creating those bytes objects yourself:

item['title'] = [t.encode('utf-8') for t in title]
item['link'] = [l.encode('utf-8') for l in link]
item['desc'] = [d.encode('utf-8') for d in desc]
items.append(item)

Each of those t.encode(), l.encode() and d.encode() calls creates a bytes string. Do not do this; leave it to the JSON format to serialise these.
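To make the failure concrete, here is a minimal sketch that reproduces the error outside Scrapy. The sample value is made up for illustration; only json.dumps() and str.encode() are real standard-library calls here.

```python
import json

# Hypothetical sample data standing in for the scraped values.
title = ["XML XSLT"]

# Encoding each string produces bytes, which json.dumps() rejects.
encoded = {"title": [t.encode("utf-8") for t in title]}
try:
    json.dumps(encoded)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Leaving the values as str works without any extra step.
plain = {"title": list(title)}
serialized = json.dumps(plain)

print(error_message)
print(serialized)
```

The exact wording of the error message differs slightly between Python versions, but it names the bytes type in each case.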

Next, you are making several other errors; you are encoding too much where there is no need to. Leave it to the json module and the standard file object returned by the open() call to handle encoding.

You also don't need to convert your items list to a dictionary; it'll already be an object that can be JSON encoded directly:

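import json

class W3SchoolPipeline(object):
    def __init__(self):
        # Text mode with an explicit encoding: the file object encodes str on write.
        self.file = open('w3school_data_utf8.json', 'w', encoding='utf-8')

    def process_item(self, item, spider):
        # Serialise the item as-is; no manual encode()/decode() calls.
        line = json.dumps(item) + '\n'
        self.file.write(line)
        return item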
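As a possible variation, the sketch below writes readable UTF-8 text instead of \uXXXX escapes and closes the file when the spider finishes. The class name JsonLinesPipeline and the path parameter are hypothetical names chosen for illustration; open_spider() and close_spider() are the optional hooks Scrapy calls on item pipelines. It keeps the dict(item) conversion from the question on the assumption that the item may be a mapping-like object rather than a plain dict, which json.dumps() may not accept directly.

```python
import json


class JsonLinesPipeline:
    """Write each item as one UTF-8 encoded JSON line (illustrative sketch)."""

    def __init__(self, path="w3school_data_utf8.json"):
        # `path` is a hypothetical parameter, added so the sketch can be
        # exercised outside a Scrapy project.
        self.path = path
        self.file = None

    def open_spider(self, spider):
        # Text mode: the file object encodes str to UTF-8 on write.
        self.file = open(self.path, "w", encoding="utf-8")

    def close_spider(self, spider):
        self.file.close()

    def process_item(self, item, spider):
        # dict(item) copies a plain dict and also covers mapping-like items.
        # ensure_ascii=False keeps non-ASCII text as-is in the output.
        line = json.dumps(dict(item), ensure_ascii=False) + "\n"
        self.file.write(line)
        return item
```

Because the values stay str throughout, the only place an encoding is named is the open() call.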

I'm guessing you followed a tutorial that assumed Python 2, while you are using Python 3. I strongly suggest you find a different tutorial: not only is it written for an outdated version of Python, but if it advocates line.decode('unicode_escape'), it is teaching some extremely bad habits that will lead to hard-to-track bugs. For a good, free book on learning Python 3, I can recommend Think Python, 2nd edition.

Answer 2

Answered by Jordan Donovan

I was dealing with this issue today, and I knew that I had something encoded as a bytes object that I was trying to serialize as JSON with json.dump(my_json_object, write_to_file.json). In this case my_json_object was a very large JSON object that I had created, so I had several dicts, lists, and strings to look through to find what was still in bytes format.

The way I ended up solving it: write_to_file.json will contain everything up to the bytes object that is causing the issue, so the end of the partial output shows where to look.

In my particular case this was a line obtained through

for line in text:
    json_object['line'] = line.strip()

I solved it by first locating the error with the help of write_to_file.json, then correcting the code to:

for line in text:
    json_object['line'] = line.strip().decode()
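Instead of reading the partial output file, the offending values can also be located programmatically. The following is a minimal sketch, not part of the original answer: find_bytes is a hypothetical helper, and the sample json_object is invented to mimic the situation described above.

```python
def find_bytes(obj, path="root"):
    """Return the paths of all bytes values nested inside obj."""
    found = []
    if isinstance(obj, (bytes, bytearray)):
        found.append(path)
    elif isinstance(obj, dict):
        for key, value in obj.items():
            # Keys can be bytes too, and json rejects those as well.
            if isinstance(key, (bytes, bytearray)):
                found.append(f"{path} (key {key!r})")
            found.extend(find_bytes(value, f"{path}[{key!r}]"))
    elif isinstance(obj, (list, tuple)):
        for index, value in enumerate(obj):
            found.extend(find_bytes(value, f"{path}[{index}]"))
    return found


# Hypothetical data: one bytes value at the top level, one nested in a list.
json_object = {"line": b"some text".strip(), "meta": {"tags": ["a", b"b"]}}
paths = find_bytes(json_object)
print(paths)
```

Each reported path points at a value that still needs a .decode() call (or should never have been encoded in the first place).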

Answer 3

Answered by Sreeja Nampoothiri

I guess the answer you need is referenced here: "Python sets are not json serializable".

Not all data types can be JSON serialized. I guess the pickle module will serve your purpose.
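If the output has to stay JSON, one alternative to pickle is the default= parameter of json.dumps(), which is called for any object the encoder cannot handle on its own. This is a sketch under assumptions rather than a recommendation from the answer: decode_bytes is a hypothetical helper, and it assumes the bytes hold UTF-8 text.

```python
import json


def decode_bytes(value):
    """Fallback for json.dumps(): turn bytes into str, reject anything else."""
    if isinstance(value, (bytes, bytearray)):
        # Assumes UTF-8 text; binary data would need another
        # representation, such as base64.
        return bytes(value).decode("utf-8")
    raise TypeError(
        f"Object of type {type(value).__name__} is not JSON serializable"
    )


# Sample data shaped like the item in the question's error log.
data = {"link": [b"/xml/xml_xsl.asp"], "title": ["XML XSLT"]}
text = json.dumps(data, default=decode_bytes)
print(text)
```

The hook only applies to values; bytes used as dictionary keys would still be rejected, so fixing the data at the source remains the cleaner option.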
