Original question: http://stackoverflow.com/questions/5601904/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Encoding a string as UTF-8 with BOM in PHP
Asked by Jeano
How can I force PHP to add the BOM when using utf8_encode?
Here's what I am trying to do:
$zip->addFromString($filename, utf8_encode($xml));
Unfortunately (for me), the result will not have the BOM mark at the beginning.
Answered by Charles
Have you tried adding one yourself?
The UTF-8 BOM is the byte sequence 0xEF 0xBB 0xBF, so you can prepend it to your string after conversion to UTF-8.
$utf8_with_bom = chr(239) . chr(187) . chr(191) . $utf8_string;
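As a minimal sketch of the same idea, the concatenation can be wrapped in a small helper that skips strings which already start with a BOM. The constant UTF8_BOM and the function with_utf8_bom are hypothetical names introduced here for illustration; they are not part of the original answer or of PHP itself.

```php
<?php
// Hypothetical helper (not from the original answer): prepend the UTF-8 BOM
// unless the string already starts with it.
const UTF8_BOM = "\xEF\xBB\xBF"; // same bytes as chr(239) . chr(187) . chr(191)

function with_utf8_bom(string $utf8_string): string
{
    // strncmp compares raw bytes, which is what a BOM check needs.
    if (strncmp($utf8_string, UTF8_BOM, strlen(UTF8_BOM)) === 0) {
        return $utf8_string; // already has a BOM, do not add a second one
    }
    return UTF8_BOM . $utf8_string;
}

$with_bom = with_utf8_bom('<root/>');
echo bin2hex(substr($with_bom, 0, 3)), "\n"; // prints efbbbf
```

With a helper like this, the call from the question could become $zip->addFromString($filename, with_utf8_bom($xml)), provided $xml is already UTF-8.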
Watch out, though. utf8_encode wants an ISO-8859-1 string. If you're working with XML, make sure that the XML isn't already UTF-8 encoded. The comments on the documentation suggest that the function is broken in a variety of fun ways, so you shouldn't throw it around unless you know that you need it.
Remember, PHP strings are simply dumb, unknowing bytes. They don't have a character set attached to them, so if the data in the string is already UTF-8, you don't need to run the conversion.
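One way to avoid converting data that is already UTF-8 is to check the bytes first. The sketch below assumes the mbstring extension is loaded and that any input that is not valid UTF-8 is ISO-8859-1; the function name to_utf8 is hypothetical. It uses mb_convert_encoding rather than utf8_encode because utf8_encode is deprecated as of PHP 8.2.

```php
<?php
// Hypothetical helper (not from the original answer). Assumes the mbstring
// extension is loaded and that any non-UTF-8 input is ISO-8859-1.
function to_utf8(string $bytes): string
{
    // If the bytes already form valid UTF-8, converting again would
    // double-encode every non-ASCII character, so return them untouched.
    if (mb_check_encoding($bytes, 'UTF-8')) {
        return $bytes;
    }
    // Otherwise treat the bytes as ISO-8859-1, the same input that
    // utf8_encode expects.
    return mb_convert_encoding($bytes, 'UTF-8', 'ISO-8859-1');
}

$latin1 = "caf\xE9";     // "café" in ISO-8859-1
$utf8   = "caf\xC3\xA9"; // "café" in UTF-8

echo bin2hex(to_utf8($latin1)), "\n"; // prints 636166c3a9
echo bin2hex(to_utf8($utf8)), "\n";   // prints 636166c3a9
```

The validity check is only a heuristic: an ISO-8859-1 string whose bytes happen to form valid UTF-8 would be passed through unchanged, so knowing the real source encoding is still the safer option.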
Also, the linked Wikipedia article says this:
While Unicode standard allows BOM in UTF-8, it does not require or recommend it. Byte order has no meaning in UTF-8 so a BOM only serves to identify a text stream or file as UTF-8 or that it was converted from another format that has a BOM.
You probably don't need to bother with the BOM tapdance to begin with.