Note: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me). Original Stack Overflow URL: http://stackoverflow.com/questions/14029768/

Time: 2020-08-18 10:13:25  Source: igfitidea

Python imaplib fetch body emails gmail

Tags: python, email, request, imap, imaplib

Asked by kiriloff

I have already read this and wrote this script to fetch the body of emails in a mailbox whose subject begins with '$' and which were sent by a particular sender.

# Python 2 script (raw_input, print statement)
import email, getpass, imaplib, os

detach_dir = r"F:\PYTHONPROJECTS"  # where you will save attachments
user = raw_input("Enter your GMail username --> ")
pwd = getpass.getpass("Enter your password --> ")

# connect to the Gmail IMAP server
m = imaplib.IMAP4_SSL("imap.gmail.com")
m.login(user, pwd)

m.select("PETROLEUM")  # here you can choose a mailbox like INBOX instead
# use m.list() to get all the mailboxes

resp, items = m.search(None, '(FROM "[email protected]")')
items = items[0].split()  # getting the mail ids

my_msg = []  # store relevant messages here
msg_cnt = 0
break_ = False
for emailid in items[::-1]:  # newest first
    if break_:
        break  # stop before fetching once five messages were collected
    resp, data = m.fetch(emailid, "(RFC822)")
    for response_part in data:
        if isinstance(response_part, tuple):
            msg = email.message_from_string(response_part[1])
            varSubject = msg['subject']
            # guard against a missing or empty Subject header
            if varSubject and varSubject[0] == '$':
                msg_cnt += 1
                my_msg.append(msg)
                print msg_cnt
                print email.message_from_string(response_part[1])
                if msg_cnt == 5:
                    break_ = True

If I print email.message_from_string(response_part[1]), I can see that it contains the header information first (from, to, date, ...), then the full text body. But I cannot fetch the body by itself. email.message_from_string(response_part[0]) prints the mail IDs, and email.message_from_string(response_part[2]) is out of range. email.message_from_string(response_part[1][0]) does not do it either.

Thanks and regards.

UPDATE

Now I can almost get the body text. However, it is still spoiled by an informational line that comes first. The result I get is:

From nobody Tue Dec 25 11:42:58 2012

US=3D.030

EastCst=3D.036

NewEng=3D.205

CenAtl=3D.149

LwrAtl=3D.921

Midwst=3D.984

GulfCst=3D.945

RkyMt=3D.195

WCst=3D.187

CA=3D.268

I would like to get rid of the line From nobody Tue Dec 25 11:42:58 2012, which is only informational. I know I could parse the text and look for the first relevant line... I know.

The code that produces this (to be plugged into my first sample) is:

  if varSubject[0] == '$':
      r, d = m.fetch(emailid, "(UID BODY[TEXT])")
      msg_cnt += 1
      my_msg.append(msg)
      print email.message_from_string(d[0][1])

Do you have a better way (one without the info string)? Also, what is the command to fetch the date? I know that I can do varDate = msg['date'] where suitable above, but how do I fetch just the day-month-year? Thanks.

Accepted answer by damzam

You can get the contents of the body by doing any of the following:

msg.as_string()
str(msg)
repr(msg)

http://docs.python.org/2.7/library/email.message.html#email.message.Message
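
As a supplementary sketch, here is one way to get only the body text, without the headers or the "From nobody ..." line. As far as I can tell, that line appears because printing a Message object in Python 2 calls as_string(unixfrom=True), which also emits the headers. The sketch assumes Python 3, and the RAW message, its addresses, and the extract_text_body helper are all made up for illustration; in the question's script the bytes would come from m.fetch(emailid, "(RFC822)").

```python
import email

# Hypothetical raw message standing in for the bytes returned by
# m.fetch(emailid, "(RFC822)"); addresses and values are invented.
RAW = (
    b"From: sender@example.com\r\n"
    b"To: me@example.com\r\n"
    b"Subject: $ prices\r\n"
    b"Date: Tue, 25 Dec 2012 11:42:58 +0000\r\n"
    b"Content-Type: text/plain; charset=\"utf-8\"\r\n"
    b"Content-Transfer-Encoding: quoted-printable\r\n"
    b"\r\n"
    b"US=3D3.030\r\n"
    b"EastCst=3D3.036\r\n"
)


def extract_text_body(raw_bytes):
    """Return the decoded text/plain content of a raw RFC 822 message."""
    msg = email.message_from_bytes(raw_bytes)
    chunks = []
    for part in msg.walk():
        # Skip multipart containers and non-text parts
        if part.get_content_type() != "text/plain":
            continue
        # decode=True undoes quoted-printable or base64 transfer encoding
        payload = part.get_payload(decode=True)
        charset = part.get_content_charset() or "utf-8"
        chunks.append(payload.decode(charset, errors="replace"))
    return "".join(chunks)


body = extract_text_body(RAW)
print(body)
```

Because get_payload(decode=True) reverses the quoted-printable encoding, sequences such as "=3D" in the raw body come back as a plain "=". Note that this sketch also collects text/plain attachments, which may or may not be what you want.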

Answer by Edward Chapman

I've managed to get this working with Gmail. It extracts the useful bits and writes them to text files:

import datetime
import email
import email.header
import email.utils
import imaplib


EMAIL_ACCOUNT = "[email protected]"
PASSWORD = "your password"


def decode_header_value(value):
    # Decode RFC 2047 encoded words; a missing header becomes an empty string
    return str(email.header.make_header(email.header.decode_header(value or "")))


mail = imaplib.IMAP4_SSL('imap.gmail.com')
mail.login(EMAIL_ACCOUNT, PASSWORD)
mail.select('inbox')
result, data = mail.uid('search', None, "UNSEEN")  # (ALL/UNSEEN)
uids = data[0].split()

for x, email_uid in enumerate(uids):
    # Fetching RFC822 implicitly marks the message as seen on the server
    result, email_data = mail.uid('fetch', email_uid, '(RFC822)')
    raw_email = email_data[0][1]
    # Parse the raw bytes directly instead of assuming they are valid UTF-8
    email_message = email.message_from_bytes(raw_email)

    # Header details
    local_message_date = ""  # stays empty if the Date header cannot be parsed
    date_tuple = email.utils.parsedate_tz(email_message['Date'])
    if date_tuple:
        local_date = datetime.datetime.fromtimestamp(email.utils.mktime_tz(date_tuple))
        local_message_date = local_date.strftime("%a, %d %b %Y %H:%M:%S")
    email_from = decode_header_value(email_message['From'])
    email_to = decode_header_value(email_message['To'])
    subject = decode_header_value(email_message['Subject'])

    # Body details: write each text/plain part to email_<index>.txt
    for part in email_message.walk():
        if part.get_content_type() == "text/plain":
            body = part.get_payload(decode=True)
            # Use the charset declared by the part, falling back to UTF-8
            charset = part.get_content_charset() or 'utf-8'
            file_name = "email_" + str(x) + ".txt"
            with open(file_name, 'w', encoding='utf-8') as output_file:
                output_file.write(
                    "From: %s\nTo: %s\nDate: %s\nSubject: %s\n\nBody: \n\n%s"
                    % (email_from, email_to, local_message_date, subject,
                       body.decode(charset, errors='replace')))
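
The question also asks how to get just the day, month, and year. A minimal sketch, assuming Python 3.3 or later (where email.utils.parsedate_to_datetime is available): the day_month_year helper and the sample header value are hypothetical, and in practice the string would come from msg['date'].

```python
from email.utils import parsedate_to_datetime


def day_month_year(date_header):
    """Format an RFC 2822 Date header value as DD-MM-YYYY."""
    # The date is taken as written in the header (the sender's time zone)
    parsed = parsedate_to_datetime(date_header)
    return parsed.strftime("%d-%m-%Y")


print(day_month_year("Tue, 25 Dec 2012 11:42:58 +0100"))
```

If the date should be expressed in local time instead, the parsed datetime can be converted with astimezone() before formatting. An unparseable header value raises an exception, so real code would likely want to handle that case.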