Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/3255993/

Time: 2020-08-25 09:08:54  Source: igfitidea

How do I remove ï»¿ from the beginning of a file?

php utf-8 character-encoding byte-order-mark mojibake

Asked by Matt

I have a CSS file that looks fine when I open it using gedit, but when it's read by PHP (to merge all the CSS files into one), this CSS has the following characters prepended to it: ï»¿

PHP removes all whitespace, so a random ï»¿ in the middle of the code messes up the entire thing. As I mentioned, I can't actually see these characters when I open the file in gedit, so I can't remove them very easily.

I googled the problem, and there is clearly something wrong with the file encoding, which makes sense given that I've been moving the files between different Linux/Windows servers via ftp and rsync, with a range of text editors. I don't really know much about character encoding though, so help would be appreciated.

If it helps, the file is being saved in UTF-8 format, and gedit won't let me save it in ISO-8859-15 format (the document contains one or more characters that cannot be encoded using the specified character encoding). I tried saving it with Windows and Linux line endings, but neither helped.

Answered by Vinko Vrsalovic

Three words for you:

Byte Order Mark (BOM)

That's the representation for the UTF-8 BOM in ISO-8859-1. You have to tell your editor to not use BOMs or use a different editor to strip them out.
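
To see why the BOM shows up as ï»¿, here is a minimal sketch. It assumes the mbstring extension is loaded and PHP 7.2 or later (for mb_ord()); the variable names are only for illustration. It decodes the same three bytes once as UTF-8 and once as ISO-8859-1.

```php
<?php
// The three bytes of the UTF-8 byte order mark.
$bom = "\xEF\xBB\xBF";

// Decoded as UTF-8, the three bytes form one code point.
$codePoint = mb_ord($bom, 'UTF-8');

// Decoded (wrongly) as ISO-8859-1, each byte becomes its own character.
// The result is re-encoded as UTF-8 so it can be printed on a UTF-8 terminal.
$misread = mb_convert_encoding($bom, 'UTF-8', 'ISO-8859-1');

printf("U+%04X\n", $codePoint);
echo $misread, "\n";
```

On a UTF-8 terminal the first line should read U+FEFF and the second should show the three characters from the question.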

To automate the BOM's removal, you can use awk as shown in this question.

As another answer says, the best option would be for PHP to actually interpret the BOM correctly; for that you can use mb_internal_encoding(), like this:

 <?php
   //Storing the previous encoding in case you have some other piece 
   //of code sensitive to encoding and counting on the default value.      
   $previous_encoding = mb_internal_encoding();

   //Set the internal encoding to UTF-8. Note: this only affects the mbstring
   //functions; plain file reads such as file_get_contents() still return the BOM bytes.
   mb_internal_encoding('UTF-8');

   //Process the CSS files...

   //Finally, return to the previous encoding
   mb_internal_encoding($previous_encoding);

   //Rest of the code...
  ?>
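
As far as I can tell, mb_internal_encoding() only sets the default encoding for the mbstring functions, so an explicit strip may still be needed when the CSS files are read with file_get_contents(). A minimal sketch of that approach follows. The helper names strip_utf8_bom() and merge_css_files() are made up for this example and are not part of the original answer.

```php
<?php
// Hypothetical helper: drop a leading UTF-8 BOM, if present.
function strip_utf8_bom(string $text): string
{
    $bom = "\xEF\xBB\xBF";
    return strncmp($text, $bom, 3) === 0 ? substr($text, 3) : $text;
}

// Hypothetical helper: read each CSS file, strip its BOM, and concatenate.
function merge_css_files(array $paths): string
{
    $merged = '';
    foreach ($paths as $path) {
        $css = file_get_contents($path);
        if ($css === false) {
            throw new RuntimeException("Cannot read $path");
        }
        $merged .= strip_utf8_bom($css) . "\n";
    }
    return $merged;
}
```

Stripping before any whitespace removal keeps the BOM from ending up in the middle of the merged output.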

Answered by Michael Schreiber

In PHP, you can do the following to remove all control and non-ASCII bytes, including the characters in question. Note that this also strips line breaks, tabs, and every legitimate non-ASCII character.

$response = preg_replace('/[\x00-\x1F\x80-\xFF]/', '', $response);
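
Because that pattern matches every byte from 0x80 upward, it also removes legitimate UTF-8 characters. Here is a small comparison with a narrower pattern that only drops a BOM at the very start of the string. The sample CSS and the variable names $broad and $narrow are invented for the example.

```php
<?php
// Sample input: a BOM followed by CSS containing "é" (bytes C3 A9 in UTF-8).
$response = "\xEF\xBB\xBF" . "a::before { content: \"\xC3\xA9\"; }";

// Broad pattern from the answer: drops control bytes and every byte >= 0x80.
$broad = preg_replace('/[\x00-\x1F\x80-\xFF]/', '', $response);

// Narrower pattern: drops only a BOM at the very start of the string.
$narrow = preg_replace('/^\xEF\xBB\xBF/', '', $response);

echo $broad, "\n";   // the "é" is gone along with the BOM
echo $narrow, "\n";  // only the BOM is gone
```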

Answered by V.Rohan

Open your file in Notepad++. From the Encoding menu, select Convert to UTF-8 without BOM, save the file, and replace the old file with this new file. And it will work, damn sure.

Answered by Diego Palomar

For those with shell access, here is a little command to find all files with the BOM set in the public_html directory. Be sure to change the path to the correct one on your server.

Code:

grep -rl $'\xEF\xBB\xBF' /home/username/public_html
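
For those without shell access, roughly the same search can be sketched in PHP. This is only an illustration: find_files_with_bom() is a made-up name, and unlike the grep command, which matches the three bytes anywhere in a file, this version only checks the first three bytes of each file.

```php
<?php
// Hypothetical helper: list files under $root that start with a UTF-8 BOM.
function find_files_with_bom(string $root): array
{
    $found = [];
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file) {
        if (!$file->isFile()) {
            continue;
        }
        // Read only the first three bytes rather than the whole file.
        $head = file_get_contents($file->getPathname(), false, null, 0, 3);
        if ($head === "\xEF\xBB\xBF") {
            $found[] = $file->getPathname();
        }
    }
    sort($found);
    return $found;
}
```

A call such as find_files_with_bom('/home/username/public_html') would return the matching paths as an array.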

And if you are comfortable with the vi editor, open the file in vi:

vi /path-to-file-name/file.php

And enter the command to remove the BOM:

:set nobomb

Save the file:

:wq

Answered by Eugene Yokota

The BOM is just a sequence of bytes (0xEF 0xBB 0xBF for UTF-8), so just remove it using a script or configure the editor so it's not added.

From Removing BOM from UTF-8:

#!/usr/bin/perl
@file=<>;
$file[0] =~ s/^\xEF\xBB\xBF//;
print(@file);

I am sure it translates to PHP easily.
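
One possible PHP translation of the Perl filter, as a sketch rather than a definitive version. The function name copy_without_bom() is made up, and it takes stream handles as arguments instead of reading the files named on the command line the way Perl's <> does.

```php
<?php
// Hypothetical helper mirroring the Perl filter: copy $in to $out,
// dropping a UTF-8 BOM at the start of the first line only.
function copy_without_bom($in, $out): void
{
    $first = true;
    while (($line = fgets($in)) !== false) {
        if ($first) {
            $line = preg_replace('/^\xEF\xBB\xBF/', '', $line);
            $first = false;
        }
        fwrite($out, $line);
    }
}
```

To use it as a command-line filter, call copy_without_bom(STDIN, STDOUT); at the end of the script and run it with input and output redirection.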

Answered by NickWebman

For me, this worked:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

If I remove this meta tag, the ï»¿ appears again. Hope this helps someone...

Answered by Jeffrey L Whitledge

I don't know PHP, so I don't know if this is possible, but the best solution would be to read the file as UTF-8 rather than some other encoding. The BOM is actually a ZERO WIDTH NO-BREAK SPACE (U+FEFF). This is whitespace, so if the file were being read in the correct encoding (UTF-8), then the BOM would be interpreted as whitespace and it would be ignored in the resulting CSS file.

Another advantage of reading the file in the correct encoding is that you don't have to worry about characters being misinterpreted. Your editor is telling you that the code page you want to save it in can't represent all the characters that you need. If PHP is then reading the file in the incorrect encoding, then it is very likely that other characters besides the BOM are being silently misinterpreted. Use UTF-8 everywhere, and these problems disappear.

Answered by till

You can use

vim -e -c 'argdo set fileencoding=utf-8|set encoding=utf-8| set nobomb| wq'

Replacing with awk seems to work, but it does not edit the file in place.

Answered by Toby

I had the same problem with the BOM appearing in some of my PHP files (ï»¿).

If you use PhpStorm, you can set a hotkey to remove it in Settings -> IDE Settings -> Keymap -> Main Menu -> File -> Remove BOM.

Answered by Simone

grep -rl $'\xEF\xBB\xBF' * | xargs vim -e -c 'argdo set fileencoding=utf-8|set encoding=utf-8| set nobomb| wq'
