PHP: How to write a file in UTF-8 format?

Notice: This page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/4839402/

Time: 2020-08-25 14:31:58  Source: igfitidea

How to write file in UTF-8 format?

Tags: php, encoding, utf-8, iconv, mbstring

Asked by Starmaster

I have a bunch of files that are not in UTF-8 encoding, and I'm converting a site to UTF-8.

I'm using a simple script for the files that I want to save in UTF-8, but the files are still saved in the old encoding:

header('Content-type: text/html; charset=utf-8');
mb_internal_encoding('UTF-8');
$fpath = "folder";
$d = dir($fpath);
while (false !== ($a = $d->read())) {
    if ($a != '.' and $a != '..') {
        $npath = $fpath . '/' . $a;
        $data = file_get_contents($npath);
        file_put_contents('tempfolder/' . $a, $data);
    }
}

How can I save files in UTF-8 encoding?

Answered by user956584

Add a UTF-8 BOM (byte order mark):

file_put_contents($myFile, "\xEF\xBB\xBF".  $content); 
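
Note that the BOM only marks the file as UTF-8; it does not convert $content, which must already be UTF-8. If the content might already begin with a BOM, prepending another one leaves two in the file. A minimal sketch of a guard against that, where writeWithBomOnce() is a hypothetical helper name and not part of the answer:

```php
<?php
// Hypothetical helper, not part of the original answer.
// Writes $content to $path so that the file starts with exactly one UTF-8 BOM.
function writeWithBomOnce(string $path, string $content)
{
    $bom = "\xEF\xBB\xBF";
    // Only prepend the BOM when the content does not already begin with it.
    if (strncmp($content, $bom, 3) !== 0) {
        $content = $bom . $content;
    }
    // Returns the number of bytes written, or false on failure.
    return file_put_contents($path, $content);
}
```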

Answered by Arnaud Le Blanc

file_get_contents() / file_put_contents() will not magically convert encoding.

You have to convert the string explicitly, for example with iconv() or mb_convert_encoding().

Try this:

$data = file_get_contents($npath);
$data = mb_convert_encoding($data, 'UTF-8', 'OLD-ENCODING');
file_put_contents('tempfolder/'.$a, $data);

Alternatively, use PHP's stream filters:

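$fd = fopen($file, 'r');
// The filter name is convert.iconv.<from>/<to>: data read from $fd is converted from the old encoding to UTF-8.
stream_filter_append($fd, 'convert.iconv.OLD-ENCODING/UTF-8');
stream_copy_to_stream($fd, fopen($output, 'w'));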
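
Applied to the directory loop from the question, the explicit conversion could look roughly like the sketch below. The function name convertDirToUtf8() and its parameters are illustrative, and it assumes every file in the source directory really is in $fromEncoding and that the mbstring extension is loaded:

```php
<?php
// Sketch: copy every regular file from $srcDir to $dstDir, converting its
// contents from $fromEncoding to UTF-8. The function name and parameters are
// illustrative; $fromEncoding must be the real encoding of the source files.
function convertDirToUtf8(string $srcDir, string $dstDir, string $fromEncoding): int
{
    $converted = 0;
    foreach (scandir($srcDir) as $name) {
        $src = $srcDir . '/' . $name;
        if (!is_file($src)) {
            continue; // skips '.', '..' and subdirectories
        }
        $data = file_get_contents($src);
        // The explicit conversion step that the script in the question was missing.
        $data = mb_convert_encoding($data, 'UTF-8', $fromEncoding);
        file_put_contents($dstDir . '/' . $name, $data);
        $converted++;
    }
    return $converted; // number of files written
}
```

For example, convertDirToUtf8('folder', 'tempfolder', 'ISO-8859-1') would mirror the question's folders, assuming the old encoding is ISO-8859-1.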

Answered by Alaa

<?php
function writeUTF8File($filename,$content) { 
        $f=fopen($filename,"w"); 
        # Now UTF-8 - Add byte order mark 
        fwrite($f, pack("CCC",0xef,0xbb,0xbf)); 
        fwrite($f,$content); 
        fclose($f); 
} 
?>

Answered by Dennis Kreminsky

iconv to the rescue.

Answered by mario

On Unix/Linux, a simple shell command could alternatively be used to convert all files in a given directory:

 recode L1..UTF8 dir/*

It could also be started via PHP's exec().
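
A rough sketch of the exec() route, assuming the recode program is installed and exec() is not disabled on the server. buildRecodeCommand() and recodeFilesToUtf8() are illustrative names; the paths are quoted with escapeshellarg(), so the file list is expanded in PHP (for example with glob()) rather than by the shell:

```php
<?php
// Sketch only: assumes the `recode` program is installed and exec() is enabled.
// buildRecodeCommand() and recodeFilesToUtf8() are illustrative names.
function buildRecodeCommand(array $files): string
{
    // Quote each path so spaces and shell metacharacters are passed literally.
    return 'recode L1..UTF8 ' . implode(' ', array_map('escapeshellarg', $files));
}

function recodeFilesToUtf8(array $files): bool
{
    // recode rewrites the listed files in place; exit code 0 means success.
    exec(buildRecodeCommand($files), $output, $exitCode);
    return $exitCode === 0;
}
```

A call such as recodeFilesToUtf8(glob('dir/*')) would then correspond to the shell command above.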

Answered by Du Peng

//add BOM to fix UTF-8 in Excel
fputs($fp, $bom =( chr(0xEF) . chr(0xBB) . chr(0xBF) ));

I got this line from Cool


Answered by Atul.Bajare

This works for me. :)

$f=fopen($filename,"w"); 
# Now UTF-8 - Add byte order mark 
fwrite($f, pack("CCC",0xef,0xbb,0xbf)); 
fwrite($f,$content); 
fclose($f); 

Answered by Aitor

If you want to use recode recursively and filter by file type, try this:

find . -name "*.html" -exec recode L1..UTF8 {} \;
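
Where recode is not available, a similar recursive walk can be sketched in PHP itself. This is only an approximation of the command above: it assumes the files are ISO-8859-1 (which recode calls L1), requires the mbstring extension, and convertTreeToUtf8() is an illustrative name:

```php
<?php
// Sketch: a PHP counterpart to the find/recode command, assuming the files
// are ISO-8859-1 (what recode calls L1). Files are rewritten in place.
function convertTreeToUtf8(string $root, string $extension = 'html', string $from = 'ISO-8859-1'): array
{
    $changed = [];
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file) {
        // $file is an SplFileInfo; keep only regular files with the wanted extension.
        if (!$file->isFile() || strtolower($file->getExtension()) !== $extension) {
            continue;
        }
        $path = $file->getPathname();
        $data = file_get_contents($path);
        file_put_contents($path, mb_convert_encoding($data, 'UTF-8', $from));
        $changed[] = $path;
    }
    return $changed; // paths of the files that were rewritten
}
```

Like the shell command, this converts unconditionally, so running it twice over the same tree would double-encode non-ASCII characters.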

Answered by jacouh

This is quite a useful question. I think my solution on Windows 10 with PHP 7 may help people who still have trouble with UTF-8 conversion.

Here are my steps. The PHP script calling the following function, here named utfsave.php, must itself be saved in UTF-8 encoding; this can easily be done by converting it in UltraEdit.

In utfsave.php, we define a function that calls PHP's fopen($filename, "wb"), i.e., the file is opened in w (write) mode and, importantly, with b (binary) mode.

<?php
//
//  UTF-8 encoding:
//
// fnc001: save string as a file in UTF-8:
// The resulting file is UTF-8 only if $strContent is,
// with French accents, chinese ideograms, etc..
//
function entSaveAsUtf8($strContent, $filename) {
  $fp = fopen($filename, "wb"); 
  fwrite($fp, $strContent);
  fclose($fp);
  return True;
}

//
// 0. write a UTF-8 string on the fly into a UTF-8 file:
//
$strContent = "My string contains UTF-8 chars ie 鱼肉酒菜 for un été en France";

$filename = "utf8text.txt";

entSaveAsUtf8($strContent, $filename);


//
// 2. convert CP936 ANSI/OEM - chinese simplified GBK file into UTF-8 file:
//
$strContent = file_get_contents("cp936gbktext.txt");
$strContent = mb_convert_encoding($strContent, "UTF-8", "CP936");


$filename = "utf8text2.txt";

entSaveAsUtf8($strContent, $filename);

?>

Content of the source file cp936gbktext.txt:

>>Get-Content cp936gbktext.txt
My string contains UTF-8 chars ie 鱼肉酒菜 for un été en France 936 (ANSI/OEM - chinois simplifié GBK)

When utfsave.php is run on Windows 10 with PHP, the files utf8text.txt and utf8text2.txt that it creates are automatically saved in UTF-8 format.

With this method, a BOM is not required. The BOM solution is bad because it causes trouble, for example, when sourcing an SQL file into MySQL.

It's worth noting that I could not get file_put_contents($filename, utf8_encode($mystring)); to work for this purpose.
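
One plausible reason for the utf8_encode() failure (an assumption, since the answer does not show the input): utf8_encode() always treats its input as ISO-8859-1, so a string that is already UTF-8, or one in CP936, comes out garbled. The function is also deprecated as of PHP 8.2. The sketch below reproduces the double-encoding effect with mb_convert_encoding() and shows a simple guard; toUtf8Once() is an illustrative name:

```php
<?php
// Sketch with an illustrative helper name; requires the mbstring extension.
// Converts $text from $from to UTF-8 unless it is already valid UTF-8.
function toUtf8Once(string $text, string $from = 'ISO-8859-1'): string
{
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text; // already valid UTF-8: converting again would double-encode
    }
    return mb_convert_encoding($text, 'UTF-8', $from);
}

$utf8 = "\xC3\xA9"; // "é" encoded as UTF-8

// Treating UTF-8 bytes as ISO-8859-1 (what utf8_encode() does) yields mojibake.
$doubleEncoded = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');
var_dump(bin2hex($doubleEncoded));      // string(8) "c383c2a9"
var_dump(bin2hex(toUtf8Once($utf8)));   // string(4) "c3a9"
```

The guard is a heuristic: a legacy-encoded string that happens to be valid UTF-8 (plain ASCII, for instance) is returned unchanged.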

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


If you don't know the encoding of the source file, you can list the encodings that PHP supports:

print_r(mb_list_encodings());

This gives a list like this:

Array
(
  [0] => pass
  [1] => wchar
  [2] => byte2be
  [3] => byte2le
  [4] => byte4be
  [5] => byte4le
  [6] => BASE64
  [7] => UUENCODE
  [8] => HTML-ENTITIES
  [9] => Quoted-Printable
  [10] => 7bit
  [11] => 8bit
  [12] => UCS-4
  [13] => UCS-4BE
  [14] => UCS-4LE
  [15] => UCS-2
  [16] => UCS-2BE
  [17] => UCS-2LE
  [18] => UTF-32
  [19] => UTF-32BE
  [20] => UTF-32LE
  [21] => UTF-16
  [22] => UTF-16BE
  [23] => UTF-16LE
  [24] => UTF-8
  [25] => UTF-7
  [26] => UTF7-IMAP
  [27] => ASCII
  [28] => EUC-JP
  [29] => SJIS
  [30] => eucJP-win
  [31] => EUC-JP-2004
  [32] => SJIS-win
  [33] => SJIS-Mobile#DOCOMO
  [34] => SJIS-Mobile#KDDI
  [35] => SJIS-Mobile#SOFTBANK
  [36] => SJIS-mac
  [37] => SJIS-2004
  [38] => UTF-8-Mobile#DOCOMO
  [39] => UTF-8-Mobile#KDDI-A
  [40] => UTF-8-Mobile#KDDI-B
  [41] => UTF-8-Mobile#SOFTBANK
  [42] => CP932
  [43] => CP51932
  [44] => JIS
  [45] => ISO-2022-JP
  [46] => ISO-2022-JP-MS
  [47] => GB18030
  [48] => Windows-1252
  [49] => Windows-1254
  [50] => ISO-8859-1
  [51] => ISO-8859-2
  [52] => ISO-8859-3
  [53] => ISO-8859-4
  [54] => ISO-8859-5
  [55] => ISO-8859-6
  [56] => ISO-8859-7
  [57] => ISO-8859-8
  [58] => ISO-8859-9
  [59] => ISO-8859-10
  [60] => ISO-8859-13
  [61] => ISO-8859-14
  [62] => ISO-8859-15
  [63] => ISO-8859-16
  [64] => EUC-CN
  [65] => CP936
  [66] => HZ
  [67] => EUC-TW
  [68] => BIG-5
  [69] => CP950
  [70] => EUC-KR
  [71] => UHC
  [72] => ISO-2022-KR
  [73] => Windows-1251
  [74] => CP866
  [75] => KOI8-R
  [76] => KOI8-U
  [77] => ArmSCII-8
  [78] => CP850
  [79] => JIS-ms
  [80] => ISO-2022-JP-2004
  [81] => ISO-2022-JP-MOBILE#KDDI
  [82] => CP50220
  [83] => CP50220raw
  [84] => CP50221
  [85] => CP50222
)

If you cannot guess, try them one by one, as mb_detect_encoding() cannot do the job easily.
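
Trying encodings one by one can be partly automated by decoding the same bytes under each candidate and reading the results. A minimal sketch, assuming the mbstring extension; previewDecodings() and its parameters are illustrative names:

```php
<?php
// Sketch: decode the same bytes under several candidate encodings so a human
// can pick the one that reads correctly. Names are illustrative.
function previewDecodings(string $bytes, array $candidates, int $maxChars = 60): array
{
    $supported = array_map('strtolower', mb_list_encodings());
    $previews = [];
    foreach ($candidates as $encoding) {
        if (!in_array(strtolower($encoding), $supported, true)) {
            continue; // skip names that are not in mb_list_encodings()
        }
        $utf8 = mb_convert_encoding($bytes, 'UTF-8', $encoding);
        // Keep only the first $maxChars characters as a preview.
        $previews[$encoding] = mb_substr($utf8, 0, $maxChars, 'UTF-8');
    }
    return $previews; // map of encoding name => UTF-8 preview text
}
```

For the file from this answer, one might call previewDecodings(file_get_contents('cp936gbktext.txt'), ['CP936', 'BIG-5', 'ISO-8859-1']) and print each preview to see which one is readable.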

Answered by Le Inc

I put it all together and got an easy way to convert ANSI text files to "UTF-8 No Mark" (UTF-8 without a BOM):

function filesToUTF8($searchdir,$convdir,$filetypes) {
  $get_files = glob($searchdir.'*{'.$filetypes.'}', GLOB_BRACE);
  foreach($get_files as $file) {
    $expl_path = explode('/',$file);
    $filename = end($expl_path);
    $get_file_content = file_get_contents($file);
    $new_file_content = iconv(mb_detect_encoding($get_file_content, mb_detect_order(), true), "UTF-8", $get_file_content);
    $put_new_file = file_put_contents($convdir.$filename,$new_file_content);
  }
}

Usage: filesToUTF8('C:/Temp/','C:/Temp/conv_files/','php,txt');

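
One caveat with the function above: mb_detect_order() often contains only ASCII and UTF-8 by default, so for ANSI text with accented characters the strict mb_detect_encoding() call can return false, and iconv() then has no usable source encoding. A hedged sketch of a variant with an explicit candidate list and fallback follows; the function names, the Windows-1252 candidate, and the fallback are all assumptions for illustration:

```php
<?php
// Sketch: pick a source encoding for $bytes, with an explicit fallback.
// The function names, candidate list and fallback are assumptions for illustration.
function guessSourceEncoding(string $bytes, array $legacyCandidates = ['Windows-1252'], string $fallback = 'Windows-1252'): string
{
    // Valid UTF-8 (which includes plain ASCII) needs no conversion.
    if (mb_check_encoding($bytes, 'UTF-8')) {
        return 'UTF-8';
    }
    // Strict detection returns false when no candidate matches.
    $detected = mb_detect_encoding($bytes, $legacyCandidates, true);
    return $detected !== false ? $detected : $fallback;
}

function fileToUtf8(string $src, string $dst): bool
{
    $bytes = file_get_contents($src);
    if ($bytes === false) {
        return false; // source file could not be read
    }
    $from = guessSourceEncoding($bytes);
    $utf8 = $from === 'UTF-8' ? $bytes : mb_convert_encoding($bytes, 'UTF-8', $from);
    return file_put_contents($dst, $utf8) !== false;
}
```

Checking for valid UTF-8 first also keeps files that were already converted from being converted a second time.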