C#: Japanese email subject encoding
Original source: http://stackoverflow.com/questions/419977/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
StackOverflow
Japanese email subject encoding
Asked by danijels
Apparently, encoding Japanese emails is somewhat challenging, which I am slowly discovering myself. In case there are any experts (even those with limited experience will do), can I please have some guidelines as to how to do it, how to test it and how to verify it?
Bear in mind that I've never set foot anywhere near Japan, it is simply that the product I'm developing is used there, among other places.
What (I think) I know so far is the following:
- Japanese emails should be encoded in ISO-2022-JP (Japanese JIS, code page 50220) or possibly SHIFT_JIS (code page 932)
- Email transfer encoding should be set to Base64 for plain text and 7Bit for HTML
- Email subject should be encoded separately to start with "=?ISO-2022-JP?B?" (don't know what this is supposed to mean). I've tried encoding the subject with
"=?ISO-2022-JP?B?" + Convert.ToBase64String(Encoding.Unicode.GetBytes(subject))
which basically gives the encoded string as expected, but it is not displayed as Japanese text in an email program
- I've tested in Outlook 2003, Outlook Express and Gmail
Any help would be greatly appreciated
OK, so to post a short update: thanks to the two helpful answers, I've managed to get the right format and encoding. Now, Outlook gives something that resembles the correct subject: =?iso-2022-jp?B?6 Japanese test に各々の視点で語ってもらった。 6相当の防水?=
However, the exact same email in Outlook Express gives a subject like this: =?iso-2022-jp?B?6 Japanese test 縺?蜷???隕也せ縺?隱槭▲縺?繧ゅi縺?縺溘? 6逶?蠖薙?髦?豌??=
Furthermore, when viewed in the Inbox view in Outlook Express, the email subject is even weirder, like this: =?iso-2022-jp?B?6 Japanese test ??????????????? 6???????=
Gmail seems to be working in a similar fashion to Outlook, which looks correct.
I just can't get my head around this one.
Accepted answer by 保田ジェフリー
I've been dealing with Japanese encodings for almost 20 years and so I can sympathize with your difficulties. Websites that I've worked on send hundreds of emails daily to Japanese customers so I can share with you what's worked for us.
First of all, do not use Shift-JIS. I personally receive tons of Japanese emails and they are almost never encoded using Shift-JIS. I think an old (circa Win 98?) version of Outlook Express encoded outgoing mail using Shift-JIS, but nowadays you just don't see it.
As you've figured out, you need to use ISO-2022-JP as your encoding for at least anything that goes in the mail header. This includes the Subject, To line, and CC line. UTF-8 will also work in most cases, but it will not work on Yahoo Japan mail, and as you can imagine, many Japanese users use Yahoo Japan mail.
You can use UTF-8 in the body of the email, but it is recommended that you base64 encode the UTF-8 encoded Japanese text and put that in the body instead of raw UTF-8 text. However, in practice, I believe that raw UTF-8 text will work fine these days for the body of the email.
As I alluded to above, you need to at least test on Outlook (Exchange), Outlook Express (IMAP/POP3), and Yahoo Japan web mail. Yahoo Japan is the trickiest because I believe they use EUC for the encoding of their web pages, and so you need to follow the correct standards for your emails or they won't work (ISO-2022-JP is the standard for sending Japanese emails).
Also, your subject line should not exceed 75 characters per line. That is, 75 characters after you've encoded in ISO-2022-JP and base64, not 75 characters before conversion. If you exceed 75 characters, you need to break your encoded subject into multiple lines, each starting with "=?iso-2022-jp?B?" and ending with "?=". If you don't do this, your subject might get truncated (depending on the email reader, and also the content of your subject text). According to RFC 2047:
"An 'encoded-word' may not be more than 75 characters long, including 'charset', 'encoding', 'encoded-text', and delimiters. If it is desirable to encode more text than will fit in an 'encoded-word' of 75 characters, multiple 'encoded-word's (separated by CRLF SPACE) may be used."
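The 75-character rule above can be sketched in Python roughly as follows. This is only an illustration under some assumptions: the function names `encode_word` and `encode_subject` are made up for this example, and the splitting is a simple greedy loop rather than anything a particular mail library does. Each chunk is encoded on its own so that every encoded-word is a complete ISO-2022-JP string.

```python
import base64

PREFIX = "=?iso-2022-jp?B?"
SUFFIX = "?="
MAX_ENCODED_WORD = 75  # RFC 2047 limit, delimiters included


def encode_word(text):
    """Encode text as a single ISO-2022-JP base64 encoded-word (hypothetical helper)."""
    raw = text.encode("iso2022_jp")
    return PREFIX + base64.b64encode(raw).decode("ascii") + SUFFIX


def encode_subject(subject):
    """Split subject into encoded-words of at most 75 characters each."""
    words = []
    chunk = ""
    for ch in subject:
        # Each chunk is encoded separately, so every encoded-word
        # switches back to ASCII before it ends.
        if chunk and len(encode_word(chunk + ch)) > MAX_ENCODED_WORD:
            words.append(encode_word(chunk))
            chunk = ch
        else:
            chunk += ch
    if chunk:
        words.append(encode_word(chunk))
    # Continuation lines of a folded header start with a space (CRLF SPACE).
    return "\r\n ".join(words)


if __name__ == "__main__":
    print(encode_subject("に各々の視点で語ってもらった。" * 3))
```

Characters that have no ISO-2022-JP representation will raise a UnicodeEncodeError in this sketch, so real code would need a fallback for them.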
- Here's some sample PHP code to encode the subject:
// Convert Japanese subject to ISO-2022-JP (JIS is essentially ISO-2022-JP)
$subject = mb_convert_encoding ($subject, "JIS", "SJIS");
// Now, base64 encode the subject
$subject = base64_encode ($subject);
// Add the encoding markers to the subject
$subject = "=?iso-2022-jp?B?" . $subject . "?=";
// Now, $subject can be placed as-is into the raw mail header.
- See RFC 2047 for a complete description of how to encode your email header.
Answer by Bombe
Check http://en.wikipedia.org/wiki/MIME#Encoded-Word for a description of how to encode header fields in MIME-compliant messages. You seem to be missing a “?=” at the end of your subject.
Answer by dmajkic
=?ISO-2022-JP?B?TEXTTEXT...
ISO-2022-JP means that the string is encoded in the ISO-2022-JP code page (i.e. not Unicode). B means that the string is base64 encoded.
In your example, you should just supply your string in ISO-2022-JP instead of Unicode.
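For the "how to verify it" part of the question, one option is to decode the header value again without any mail client. Below is a minimal Python sketch using the standard `email.header` module. The sample subject is just an assumption for illustration.

```python
import base64
from email.header import decode_header, make_header

subject = "日本語のテスト"  # sample text, chosen only for this example

# Encode: ISO-2022-JP bytes, then base64, then the RFC 2047 markers.
raw = subject.encode("iso2022_jp")
encoded = "=?ISO-2022-JP?B?" + base64.b64encode(raw).decode("ascii") + "?="

# Verify: decode the header value the way a MIME-aware reader would.
round_trip = str(make_header(decode_header(encoded)))

print(encoded)
print(round_trip == subject)
```

If the round trip does not give back the original text, the bytes and the declared charset probably do not match.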
Answer by liggett78
First of all you should be using:
Encoding.GetEncoding("ISO-2022-JP")
to convert your subject line into bytes that will be processed by Convert.ToBase64String().
=?ISO-2022-JP?B?TEXTTEXT...?= tells the receiving mail client which encoding was used on the sender's side to convert Japanese "letters" into a byte stream.
Currently you're using UTF-16 to encode, but specifying ISO-2022-JP to decode. These are two different encodings, just as ISO-8859-1 is different from UTF-16 (most extended Western European characters are represented by one byte in ISO-8859-1, but two bytes in UTF-16).
I'm not sure what you mean about UTF-8 being a second-class citizen. As long as the receiving mail client understands UTF-8 and is able to convert it to the current Japanese locale, everything is fine.
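The mismatch described in this answer can be reproduced in a few lines of Python. This is only a sketch: `utf-16-le` stands in for .NET's `Encoding.Unicode`, and `decode_payload` is a hypothetical helper that does what a mail client would do with the declared charset.

```python
import base64

MARKER = "=?ISO-2022-JP?B?"
subject = "テスト"  # sample text, chosen only for this example

utf16_bytes = subject.encode("utf-16-le")  # what Encoding.Unicode.GetBytes produces
jis_bytes = subject.encode("iso2022_jp")   # what the header label promises

# Wrong: UTF-16 bytes behind an ISO-2022-JP label.
wrong = MARKER + base64.b64encode(utf16_bytes).decode("ascii") + "?="
# Right: the bytes match the declared charset.
right = MARKER + base64.b64encode(jis_bytes).decode("ascii") + "?="


def decode_payload(word):
    """Decode the base64 payload as ISO-2022-JP (hypothetical helper)."""
    payload = base64.b64decode(word[len(MARKER):-2])
    return payload.decode("iso2022_jp", errors="replace")


print(decode_payload(right) == subject)
print(decode_payload(wrong) == subject)
```

The first comparison succeeds and the second does not, which matches the garbled subjects reported in the question.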
Answer by liggett78
I have some experience composing and sending email in Japanese... Normally you have to be aware of which encoding the operating system uses and how you store your Japanese strings! My mail objects are normally encoded as follows:
string s = "V?μ?¢?wK–@?ì?2'???"; // Our Japanese strings are Shift-JIS encoded, so they appear garbled here
MailMessage message = new MailMessage();
message.BodyEncoding = Encoding.GetEncoding("iso-2022-jp");
message.SubjectEncoding = Encoding.GetEncoding("iso-2022-jp");
message.Subject = s.ToEncoding(Encoding.GetEncoding("Shift-Jis")); // Change the encoding to whatever your source is
message.Body = s.ToEncoding(Encoding.GetEncoding("Shift-Jis")); // Change the encoding to whatever your source is
Then I have an extension method which does the conversion for me:
public static string ToEncoding(this string s, Encoding targetEncoding)
{
    return s == null ? null : targetEncoding.GetString(Encoding.GetEncoding(1252).GetBytes(s)); // 1252 is the Windows OS code page
}
Answer by si28719e
Something like this should get the job done in Python:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import smtplib
import sys
from email.mime.text import MIMEText
from email.header import Header
from email.utils import formatdate


def send_from_gmail(from_addr, to_addr, subject, body, password, encoding="iso-2022-jp"):
    # MIMEText and Header encode the text using the given charset
    msg = MIMEText(body, 'plain', encoding)
    msg['Subject'] = Header(subject, encoding)
    msg['From'] = from_addr
    msg['To'] = to_addr
    msg['Date'] = formatdate()
    s = smtplib.SMTP('smtp.gmail.com', 587)
    s.ehlo(); s.starttls(); s.ehlo()
    s.login(from_addr, password)
    s.sendmail(from_addr, to_addr, msg.as_string())
    s.close()
    return "Sent mail to: %s" % to_addr


if __name__ == "__main__":
    if len(sys.argv) == 6:
        print(send_from_gmail(*sys.argv[1:6]))
    elif len(sys.argv) == 7:
        print(send_from_gmail(*sys.argv[1:6], encoding=sys.argv[6]))
    else:
        raise SystemExit("SYNTAX: %s <from_addr> <to_addr> <subject> <body> <password> [encoding]" % sys.argv[0])
**blatantly stolen/adapted from:
Answer by avanish
<?php
function sendMail($to, $subject, $body, $from_email, $from_name)
{
    $headers = "MIME-Version: 1.0 \n";
    $headers .= "From: " .
        "" . mb_encode_mimeheader(mb_convert_encoding($from_name, "ISO-2022-JP", "AUTO")) . "" .
        "<" . $from_email . "> \n";
    $headers .= "Reply-To: " .
        "" . mb_encode_mimeheader(mb_convert_encoding($from_name, "ISO-2022-JP", "AUTO")) . "" .
        "<" . $from_email . "> \n";
    $headers .= "Content-Type: text/plain;charset=ISO-2022-JP \n";

    /* Convert body to same encoding as stated
       in Content-Type header above */
    $body = mb_convert_encoding($body, "ISO-2022-JP", "AUTO");

    /* Mail, optional parameters. */
    $sendmail_params = "-f$from_email";

    mb_language("ja");
    $subject = mb_convert_encoding($subject, "ISO-2022-JP", "AUTO");
    $subject = mb_encode_mimeheader($subject);

    $result = mail($to, $subject, $body, $headers, $sendmail_params);
    return $result;
}
Answer by kmugitani
Japanese encoding was introduced to e-mail on JUNET (a UUCP-based nationwide network) in the early '90s.
At that time, RFC 1468 was defined. If you follow RFC 1468 in plain-text mail, there should be no problem.
If you want to handle HTML mail, RFC 1468 is useless except for the header parts.
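As a small illustration of what RFC 1468 specifies: ISO-2022-JP is a 7-bit encoding that switches between ASCII and the Japanese character sets with escape sequences. The Python sketch below uses an arbitrary sample string (an assumption for this example) and checks those two properties with Python's built-in `iso2022_jp` codec.

```python
text = "abc日本語"  # sample text, chosen only for this example
data = text.encode("iso2022_jp")

ESC_TO_JIS = b"\x1b$B"    # switch to JIS X 0208
ESC_TO_ASCII = b"\x1b(B"  # switch back to ASCII

print(data)
print(all(b < 0x80 for b in data))   # every byte fits in 7 bits
print(data.startswith(b"abc" + ESC_TO_JIS))
print(data.endswith(ESC_TO_ASCII))
```

Because every byte stays below 0x80, such text can travel with a 7bit transfer encoding, which fits the 7Bit setting mentioned in the question.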
Answer by Jahmic
Here's what I use to send Japanese emails. The subject line looks fine in Outlook 2010, Gmail and on iPhone.
Encoding encoding = Encoding.GetEncoding("iso-2022-jp");
byte[] bytes = encoding.GetBytes(subject);
string base64Encoded = Convert.ToBase64String(bytes);
subject = "=?iso-2022-jp?B?" + base64Encoded + "?=";
// not sure this is actually necessary...
mailMessage.SubjectEncoding = Encoding.GetEncoding("iso-2022-jp");