Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/5601904/

Time: 2020-08-25 21:56:52  Source: igfitidea

Encoding a string as UTF-8 with BOM in PHP

php utf-8 byte-order-mark

Asked by Jeano

How can I force PHP to add the BOM when using utf8_encode?

Here's what I am trying to do:

$zip->addFromString($filename, utf8_encode($xml));

Unfortunately (for me), the result will not have the BOM at the beginning.

Answered by Charles

Have you tried adding one yourself?

The UTF-8 BOM is the byte sequence 0xEF 0xBB 0xBF, so you can prepend it to your string after conversion to UTF-8.

$utf8_with_bom = chr(239) . chr(187) . chr(191) . $utf8_string;
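
As a minimal sketch of the line above, the BOM can be wrapped in a small helper that prepends it only when it is missing. The names `UTF8_BOM` and `addUtf8Bom` are hypothetical, chosen for illustration, and the input is assumed to be UTF-8 already:

```php
<?php
// Hypothetical constant for this sketch: the three bytes of the UTF-8 BOM.
const UTF8_BOM = "\xEF\xBB\xBF";

/**
 * Prepend the UTF-8 BOM unless the string already starts with it.
 * Assumes $utf8String is already UTF-8 encoded; no conversion happens here.
 */
function addUtf8Bom(string $utf8String): string
{
    if (strncmp($utf8String, UTF8_BOM, 3) === 0) {
        return $utf8String; // BOM already present, avoid doubling it
    }
    return UTF8_BOM . $utf8String;
}

$withBom = addUtf8Bom('<root/>');
echo bin2hex(substr($withBom, 0, 3)), "\n"; // prints efbbbf
```

With a helper like this, the call from the question could become `$zip->addFromString($filename, addUtf8Bom($xml));`, provided `$xml` is already UTF-8.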

Watch out, though. utf8_encode wants an ISO-8859-1 string. If you're working with XML, make sure that the XML isn't already UTF-8 encoded. The comments on the documentation suggest that the function is broken in a variety of fun ways, so you shouldn't throw it around unless you know that you need it.

Remember, PHP strings are simply dumb, unknowing bytes. They don't have a character set attached to them, so if the data in the string is already UTF-8, you don't need to run the conversion.
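
One way to act on this, sketched under the assumption that the mbstring extension is available: check whether the bytes are already valid UTF-8, and convert from ISO-8859-1 only when they are not. `toUtf8` is a hypothetical helper name. `mb_convert_encoding` stands in for `utf8_encode` here, which is deprecated as of PHP 8.2. The validity check is only a heuristic, since some ISO-8859-1 byte sequences also happen to be valid UTF-8.

```php
<?php
/**
 * Return $bytes as UTF-8, converting only when needed.
 * Assumption for this sketch: anything that is not valid UTF-8 is ISO-8859-1.
 */
function toUtf8(string $bytes): string
{
    if (mb_check_encoding($bytes, 'UTF-8')) {
        return $bytes; // already valid UTF-8, converting again would corrupt it
    }
    return mb_convert_encoding($bytes, 'UTF-8', 'ISO-8859-1');
}

$latin1 = "caf\xE9";     // "café" in ISO-8859-1 (one byte for é)
$utf8   = "caf\xC3\xA9"; // "café" in UTF-8 (two bytes for é)

var_dump(toUtf8($latin1) === $utf8); // bool(true)
var_dump(toUtf8($utf8) === $utf8);   // bool(true)
```

If the source encoding is known up front, passing it explicitly and skipping the check is likely the more reliable choice.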

Also, the linked Wikipedia article says this:

While Unicode standard allows BOM in UTF-8, it does not require or recommend it. Byte order has no meaning in UTF-8, so a BOM only serves to identify a text stream or file as UTF-8 or that it was converted from another format that has a BOM.

You probably don't need to bother with the BOM tap dance to begin with.
