Original question: http://stackoverflow.com/questions/241015/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Question mark characters displaying within text, why is this?
Asked by Brad
I have a backup server that automatically backs up my live site, both files and database.
On the live site, the text looks fine, but when you view the mirrored version of it, it displays '?' within some of the text. This text is stored within the news database table.
Here is a screenshot of it on the live server and of it on the mirrored server.
What could happen within the process of backing it up to the mirrored server?
Accepted answer by IAdapter
The following articles will be useful:
http://dev.mysql.com/doc/refman/5.0/en/charset-syntax.html
http://dev.mysql.com/doc/refman/5.0/en/charset-connection.html
After you connect to the database issue the following command:
SET NAMES 'utf8';
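One plausible way literal '?' characters end up in the output is a lossy charset conversion: when text is converted to a character set that cannot represent a character, a '?' is substituted. The sketch below only imitates that behaviour in JavaScript for illustration; `toLatin1Lossy` is a hypothetical helper, not something MySQL or any driver provides.

```javascript
// Hypothetical helper: imitates a lossy conversion to Latin-1, where any
// character outside the target charset is replaced with "?".
function toLatin1Lossy(text) {
  let out = "";
  for (const ch of text) {
    // Latin-1 covers code points 0x00 to 0xFF only.
    out += ch.codePointAt(0) <= 0xff ? ch : "?";
  }
  return out;
}

// Curly quotes and an en dash (typical of text pasted from Word), plus "café".
const stored = "\u201CSmart quotes\u201D \u2013 caf\u00E9";
console.log(toLatin1Lossy(stored)); // ?Smart quotes? ? café
```

The accented letter survives because it exists in Latin-1, while the curly quotes and the dash do not, which matches the pattern of only some characters turning into question marks.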
Ensure that your web page also uses the UTF-8 encoding:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
PHP also offers several functions that will be useful for conversions:
http://us3.php.net/manual/en/function.iconv.php
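For comparison, here is a rough sketch of the same kind of conversion in JavaScript (Node.js), re-encoding Latin-1 bytes as UTF-8. It is not PHP's `iconv`; `latin1ToUtf8` is a name made up for this example, and it assumes the input really is ISO-8859-1.

```javascript
// Re-encode bytes that are known to be ISO-8859-1 (Latin-1) as UTF-8.
// Assumption: the input really is Latin-1. Node's "latin1" codec maps each
// byte to the code point of the same value, so Windows-1252-only characters
// (such as curly quotes in the 0x80-0x9F range) are NOT handled here.
function latin1ToUtf8(latin1Bytes) {
  const text = latin1Bytes.toString("latin1"); // bytes -> string
  return Buffer.from(text, "utf8");            // string -> UTF-8 bytes
}

const latin1 = Buffer.from([0x63, 0x61, 0x66, 0xe9]); // "café" in Latin-1
const utf8 = latin1ToUtf8(latin1);
console.log(utf8.length); // 5, because é becomes the two bytes 0xC3 0xA9
```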
Answer by Dave Burton
Edit your Apache configuration file on the "mirror" server (the server with the problem), and comment out the following line:
AddDefaultCharset UTF-8
Then restart Apache:
service httpd restart
The problem is that the "AddDefaultCharset UTF-8" line overrides the Content-Type specified in the .html files; e.g.:
<meta http-equiv=Content-Type content="text/html; charset=windows-1252">
The most common symptom is that character codes above 127 display as black diamonds with question marks on them (in Chrome, Safari or Firefox), or as little boxes (in IE and Opera). HTML files generated by Microsoft Word usually have many such characters. The most common one is character code 160 = 0xA0, which is equivalent to "&nbsp;" (a non-breaking space) in the Windows-1252 encoding and is often found between span tags, like this:
<span style="mso-spacerun: yes">ááá </span>
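The effect described above can be reproduced in a few lines of Node.js. This is only a sketch of what a browser does when it is told the page is UTF-8 but the bytes are single-byte encoded; the sample bytes are invented for the example.

```javascript
// The single byte 0xA0 is a non-breaking space in Windows-1252 / Latin-1,
// but on its own it is not a valid UTF-8 sequence.
const wordBytes = Buffer.from([0x61, 0xa0, 0x62]); // "a", 0xA0, "b"

// Decoded with the single-byte charset the file was written in:
// the non-breaking space survives.
const asLatin1 = wordBytes.toString("latin1");

// Decoded as UTF-8 (what a forced charset=UTF-8 asks the browser to do):
// the invalid byte becomes U+FFFD, the replacement character that browsers
// draw as a black diamond with a question mark.
const asUtf8 = wordBytes.toString("utf8");

console.log(asLatin1 === "a\u00A0b"); // true
console.log(asUtf8 === "a\uFFFDb");   // true
```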
Answer by Leniel Maccaferri
I got here looking for a solution to the same symptom in JavaScript output displayed in the browser, although my case is not directly related to a database.
In my case I copied and pasted some text I found on the internet into a JavaScript file and saved it with Windows Notepad.
When the page that used that JavaScript file output the strings, there were question marks (like the ones shown in the question) instead of special characters such as accented letters.
I opened the file using Notepad++. Right after opening it, I saw that the character encoding was set to ANSI (Notepad++ shows the encoding in the footer of its window).
To solve the issue, click the Encoding menu in Notepad++ and select Encode in UTF-8. You should be good to go. :)
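If you would rather check a file from a script than open it in an editor, something like the following Node.js sketch can tell you whether its bytes are valid UTF-8. `looksLikeUtf8` is a hypothetical helper name, and the byte arrays stand in for file contents you would normally read with `fs.readFileSync`.

```javascript
// Returns true when the bytes form valid UTF-8. Invalid sequences decode to
// U+FFFD, which re-encodes to different bytes, so the round trip fails.
function looksLikeUtf8(bytes) {
  const roundTripped = Buffer.from(bytes.toString("utf8"), "utf8");
  return roundTripped.equals(bytes);
}

// "café" as an ANSI (single-byte) editor would save it, and as UTF-8.
const savedAsAnsi = Buffer.from([0x63, 0x61, 0x66, 0xe9]);
const savedAsUtf8 = Buffer.from([0x63, 0x61, 0x66, 0xc3, 0xa9]);

console.log(looksLikeUtf8(savedAsAnsi)); // false
console.log(looksLikeUtf8(savedAsUtf8)); // true
```

Note that a pure-ASCII file passes this check too, since ASCII is a subset of UTF-8.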
Answer by Benjamin Lee
Unicode or other character set characters falling through?
I have often seen similar "strange" characters show up on sites I have worked on when the text was copied from an email or some other document format (e.g. Word) into a text editor. The editor can display the non-ASCII characters but the browser can't. For the website, I would suggest looking up the HTML entity code for the character and inserting that instead, or switching to a more standard character.
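Rather than looking up entity codes by hand, the replacement can be automated. The sketch below converts every non-ASCII character to a decimal numeric character reference; `toNumericEntities` is a name invented for this example.

```javascript
// Replace every non-ASCII character with a decimal numeric character
// reference, so the markup is pure ASCII whatever charset the page declares.
function toNumericEntities(text) {
  // The "u" flag makes the class match whole code points, not UTF-16 halves.
  return text.replace(/[^\x00-\x7F]/gu, (ch) => "&#" + ch.codePointAt(0) + ";");
}

console.log(toNumericEntities("caf\u00E9 \u2013 \u201Chi\u201D"));
// caf&#233; &#8211; &#8220;hi&#8221;
```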
Answer by JamShady
Your browser hasn't interpreted the encoding of the page correctly (either because you've forced it to a particular setting, or the page is set incorrectly), and thus cannot display some of the characters.
Answer by toolkit
This is going to be something to do with character encodings.
Are you sure the mirrored site has the same properties with regards to character encodings as your main server?
Depending on what sort of server you have, this may be a property of the server process itself, or it could be an environment variable.
For example, if this is a UNIX environment, perhaps try comparing LANG or LC_ALL?
See also here
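The LANG / LC_ALL comparison suggested above could be scripted roughly like this and run on both servers. `localeCharset` is a hypothetical helper, and the sketch assumes the usual `language_TERRITORY.charset@modifier` shape of POSIX locale names.

```javascript
// Extract the charset part of a POSIX locale string such as "en_US.UTF-8"
// or "de_DE.ISO-8859-15@euro". Returns null when no charset is present.
function localeCharset(locale) {
  const match = /\.([^@]+)/.exec(locale || "");
  return match ? match[1] : null;
}

// LC_ALL takes precedence over LANG when both are set.
const effectiveLocale = process.env.LC_ALL || process.env.LANG || "";
console.log("charset on this machine:", localeCharset(effectiveLocale));
```

If the two servers report different values, that is a hint worth following up, though it does not prove the web server or database uses that locale.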
Answer by ola.rogula
I had this issue, so I just took all my content, copied and pasted it into Notepad, made a new PHP file, pasted it back in, re-saved and overwrote, and... that worked! It really was some relic of Microsoft Word editing...
Answer by John Rudy
Check the character set being emitted by your mirrored server. There appears to be a difference from that to the main server -- the live site appears to be outputting Unicode, where the mirror is not. Also, it's usually a good idea to scrub Unicode characters in your incoming content and replace them with their appropriate HTML entities.
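One way to compare what the two servers emit is to look at the charset parameter of each Content-Type response header. The sketch below only parses header values; `charsetOf` is a hypothetical helper and the two header strings are made-up examples, not values taken from the asker's servers.

```javascript
// Pull the charset parameter out of a Content-Type header value.
// Returns it lower-cased, or null when the header declares no charset.
function charsetOf(contentType) {
  const match = /;\s*charset\s*=\s*"?([^";\s]+)"?/i.exec(contentType || "");
  return match ? match[1].toLowerCase() : null;
}

// Hypothetical header values, as might be copied from each server's response.
const liveHeader = "text/html; charset=UTF-8";
const mirrorHeader = "text/html";

console.log(charsetOf(liveHeader));   // utf-8
console.log(charsetOf(mirrorHeader)); // null
```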
Your specific issue involves "smart quotes," "em dashes" and "en dashes." I know you can replace em dashes with &mdash; and en dashes with &ndash; (which should be done on the input side of your database); I don't know what the correct replacement for the smart quotes would be. (I usually just replace all curly single quotes with ' and all curly double quotes with " ... Typography geeks may feel free to shoot me on sight.)
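For what it's worth, HTML does define named entities for the curly quotes as well (&lsquo;, &rsquo;, &ldquo;, &rdquo;), so they can be handled the same way as the dashes. A minimal sketch of such a replacement follows; `escapeTypography` and the table name are invented for this example.

```javascript
// Typographic characters mapped to their named HTML entities.
const TYPOGRAPHIC_ENTITIES = {
  "\u2018": "&lsquo;", // left single quote
  "\u2019": "&rsquo;", // right single quote / apostrophe
  "\u201C": "&ldquo;", // left double quote
  "\u201D": "&rdquo;", // right double quote
  "\u2013": "&ndash;", // en dash
  "\u2014": "&mdash;", // em dash
};

// Replace each typographic character with its named entity.
function escapeTypography(text) {
  return text.replace(/[\u2018\u2019\u201C\u201D\u2013\u2014]/g,
    (ch) => TYPOGRAPHIC_ENTITIES[ch]);
}

console.log(escapeTypography("\u201CIt\u2019s 9\u20135\u201D \u2014 done"));
// &ldquo;It&rsquo;s 9&ndash;5&rdquo; &mdash; done
```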
I should note that some browsers are more forgiving than others with this issue -- Internet Explorer on Windows tends to auto-magically detect and "fix" this; Firefox and most other browsers display the question marks.
Answer by Nick Van Brunt
I usually curse MS Word and then run the following WScript script.
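// Replace with the path to a file that needs cleaning.
var PATH = "test.html";

var go = WScript.CreateObject("Scripting.FileSystemObject");
var content = go.GetFile(PATH).OpenAsTextStream().ReadAll();
var out = go.CreateTextFile("clean-" + PATH, true);

// Symbols. Patterns use \u escapes so the script does not depend on the
// encoding it is saved in.
content = content.replace(/\u201C/g, '"');        // left curly double quote
content = content.replace(/\u201D/g, '"');        // right curly double quote
content = content.replace(/\u2019/g, "'");        // curly apostrophe
content = content.replace(/\u2013/g, "-");        // en dash
content = content.replace(/\u00A9/g, "&copy;");   // copyright sign
content = content.replace(/\u00AE/g, "&reg;");    // registered sign
content = content.replace(/\u00B0/g, "&deg;");    // degree sign
content = content.replace(/\u00B6/g, "<p>");      // assumption: pilcrow; the pattern is unreadable in this copy
content = content.replace(/\u00BF/g, "&iquest;"); // inverted question mark
content = content.replace(/\u00A1/g, "&iexcl;");  // inverted exclamation mark
content = content.replace(/\u00A2/g, "&cent;");   // cent sign
content = content.replace(/\u00A3/g, "&pound;");  // pound sign
content = content.replace(/\u00A5/g, "&yen;");    // yen sign

out.Write(content);
out.Close();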