Python email body is sometimes a string and sometimes a list. Why?
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me): StackOverflow
Original: http://stackoverflow.com/questions/594545/
Email body is a string sometimes and a list sometimes. Why?
Asked by None-da
My application is written in Python. I run a script on each email received by postfix and do something with the email content. Procmail is responsible for running the script, passing the email as input. The problem started when I converted the input message (which may be plain text) to an email message object, because the latter comes in handy. I am using email.message_from_string (where email is the standard email module that ships with Python).
import email
message = email.message_from_string(original_mail_content)
message_body = message.get_payload()
This message_body is sometimes a list [email.message.Message instance, email.message.Message instance] and sometimes a string (the actual body content of the incoming email). Why is that? I also noticed one more thing. When I was browsing the email.message.Message.get_payload() docstring, I found this:
"""
The payload will either be a list object or a string. If you mutate
the list object, you modify the message's payload in place....."""
So how do I write a generic method to get the body of an email in Python? Please help me out.
Answered by Ali Afshar
Well, the other answers are correct: you should read the docs. But here is an example of a generic way:
def get_first_text_part(msg):
    maintype = msg.get_content_maintype()
    if maintype == 'multipart':
        # A multipart payload is a list of sub-messages.
        for part in msg.get_payload():
            if part.get_content_maintype() == 'text':
                return part.get_payload()
    elif maintype == 'text':
        # A single-part text payload is a plain string.
        return msg.get_payload()
This is prone to failure in some cases: the parts themselves might be multipart, in which case nothing is found, and it only ever returns the first text part, which may not be the one you want. But you can play with it.
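One way to cope with nested multiparts is to recurse into each container. The following is a minimal sketch, assuming Python 3; the name get_first_text_part_recursive and the sample message are illustrative and not part of the original answer. It also decodes the transfer encoding and charset, which the version above does not do.

```python
import email
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def get_first_text_part_recursive(msg):
    """Return the first text/* body found depth-first, or None."""
    if msg.is_multipart():
        # For a multipart message, get_payload() is a list of sub-messages.
        for part in msg.get_payload():
            found = get_first_text_part_recursive(part)
            if found is not None:
                return found
        return None
    if msg.get_content_maintype() == 'text':
        # decode=True undoes base64 / quoted-printable and returns bytes.
        raw = msg.get_payload(decode=True)
        # Assumption: fall back to UTF-8 when no charset is declared.
        charset = msg.get_content_charset() or 'utf-8'
        return raw.decode(charset, errors='replace')
    return None


# Illustrative message: multipart/mixed wrapping multipart/alternative.
inner = MIMEMultipart('alternative')
inner.attach(MIMEText('hello', 'plain'))
inner.attach(MIMEText('<p>hello</p>', 'html'))
outer = MIMEMultipart('mixed')
outer.attach(inner)

nested_msg = email.message_from_string(outer.as_string())
first_text = get_first_text_part_recursive(nested_msg)
print(first_text)
```

The non-recursive version would return None for this message, because the only direct child of the outer container is itself a multipart.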
Answered by timbo
Rather than simply looking for a sub-part, use walk() to iterate through the message contents:
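def walkMsg(msg):
    for part in msg.walk():
        # Multipart containers are only glue; they have no decodable payload.
        if part.get_content_maintype() == "multipart":
            continue
        # decode=True undoes base64 / quoted-printable transfer encoding.
        yield part.get_payload(decode=True)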
The walk() method returns an iterator that you can loop over (it is a generator). If the message is not a container of parts (i.e. it has no attachments or alternatives), walk() yields a single element: the message itself.
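To make that behaviour concrete, here is a small sketch, assuming Python 3. The two hand-written sample messages and the boundary string XYZ are made up for illustration.

```python
import email

# A simple, non-multipart message.
plain_msg = email.message_from_string("Subject: hi\n\nJust text.\n")

# walk() on a non-container yields exactly one element: the message itself.
plain_parts = list(plain_msg.walk())

multi_source = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/alternative; boundary="XYZ"\n'
    "\n"
    "--XYZ\n"
    "Content-Type: text/plain\n"
    "\n"
    "plain body\n"
    "--XYZ\n"
    "Content-Type: text/html\n"
    "\n"
    "<p>html body</p>\n"
    "--XYZ--\n"
)
multi_msg = email.message_from_string(multi_source)

# walk() yields the container first, then each sub-part depth-first.
walked_types = [part.get_content_type() for part in multi_msg.walk()]

print(len(plain_parts), plain_parts[0] is plain_msg)
print(walked_types)
```

Because the container itself is the first element yielded, code that consumes walk() usually needs a check for multipart parts.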
You want to skip any 'multipart' parts, as they are just glue.
The above method returns all readable parts. You may want to change it to return only the text parts, if those contain the information you are looking for.
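A text-only variant could look like the sketch below, assuming Python 3. The name iter_text_parts and the sample message are hypothetical; the UTF-8 fallback for parts that declare no charset is an assumption, not something the email module prescribes.

```python
import email
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def iter_text_parts(msg):
    """Yield (content_type, decoded_text) for every text/* part of msg."""
    for part in msg.walk():
        if part.get_content_maintype() != 'text':
            continue  # skips multipart containers and binary attachments
        raw = part.get_payload(decode=True)  # bytes after transfer decoding
        if raw is None:
            continue
        # Assumption: treat parts without a declared charset as UTF-8.
        charset = part.get_content_charset() or 'utf-8'
        yield part.get_content_type(), raw.decode(charset, errors='replace')


# Illustrative message with a plain-text and an HTML alternative.
demo = MIMEMultipart('alternative')
demo.attach(MIMEText('caf\u00e9', 'plain', 'utf-8'))
demo.attach(MIMEText('<p>caf\u00e9</p>', 'html', 'utf-8'))
demo_msg = email.message_from_string(demo.as_string())

text_parts = list(iter_text_parts(demo_msg))
for content_type, text in text_parts:
    print(content_type, text)
```

Yielding the content type alongside the text lets the caller prefer text/plain over text/html, or the reverse, without walking the message again.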
Note that as of Python 2.5, the methods get_type(), get_main_type(), and get_subtype() have been removed; use get_content_type(), get_content_maintype(), and get_content_subtype() instead -> http://docs.python.org/library/email.message.html#email.message.Message.walk
Answered by unwind
As crazy as it might seem, the reason for the sometimes-string, sometimes-list semantics is given in the documentation. Basically, multipart messages are returned as lists.
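A quick way to see this is to compare is_multipart() with the payload type. This is a minimal sketch, assuming Python 3; both sample messages and the boundary string B are made up for illustration.

```python
import email

single = email.message_from_string("Subject: one part\n\nhello\n")
multi = email.message_from_string(
    'Content-Type: multipart/mixed; boundary="B"\n'
    "\n"
    "--B\n"
    "Content-Type: text/plain\n"
    "\n"
    "first\n"
    "--B\n"
    "Content-Type: text/plain\n"
    "\n"
    "second\n"
    "--B--\n"
)

for msg in (single, multi):
    payload = msg.get_payload()
    # is_multipart() is True exactly when the payload is a list of sub-messages.
    print(msg.is_multipart(), type(payload).__name__)
```

Checking is_multipart() before calling get_payload() tells you up front which of the two shapes to expect.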