How do I write a UTF-8 encoded string to a file in Windows, in C++

Note: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/3973582/

Date: 2020-09-15 15:26:32  Source: igfitidea

Tags: c++ windows unicode file-io utf-8

Asked by NSA

I have a string that may or may not contain Unicode characters, and I am trying to write it to a file on Windows. Below is a sample bit of code. My problem is that when I fopen the file and read the values back, they are all interpreted as UTF-16 characters.

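const char* x = "Fool";
// ccs=UTF-8 opens the stream in Unicode mode.
FILE* outFile = fopen( "Serialize.pef", "w+,ccs=UTF-8");
fwrite(x,strlen(x),1,outFile);  // narrow bytes written to a Unicode-mode stream
fclose(outFile);

char buffer[12] = {0};  // zero-filled so the data read back stays terminated
outFile = fopen( "Serialize.pef", "r,ccs=UTF-8");
fread(buffer,1,11,outFile);  // read at most 11 bytes, keeping the terminator
fclose(outFile);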

The characters are also interpreted as UTF-16 if I open the file in WordPad, etc. What am I doing wrong?

Answered by Hans Passant

Yes, when you specify that the text file should be encoded in UTF-8, the CRT implicitly assumes that you'll be writing Unicode text to the file. Not doing so wouldn't make sense, since otherwise you wouldn't need UTF-8. This will work properly:

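const wchar_t* x = L"Fool";  // wide (UTF-16) text, as the Unicode-mode stream expects
FILE* outFile = fopen( "Serialize.txt", "w+,ccs=UTF-8");
fwrite(x, wcslen(x) * sizeof(wchar_t), 1, outFile);  // size is in bytes, not characters
fclose(outFile);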

Or:

const char* x = "Fool";
FILE* outFile = fopen( "Serialize.txt", "w+,ccs=UTF-8");
fwprintf(outFile, L"%hs", x);  // %hs tells the wide printf that x is a narrow string
fclose(outFile);
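
To illustrate why the narrow fwrite in the question came back as unexpected characters, here is a minimal portable sketch. It assumes the Unicode-mode stream treats the buffer as little-endian UTF-16 code units and converts each one to UTF-8. It does not call the CRT. The helper names `encode_bmp_utf8` and `misread_as_utf16` are hypothetical, and the encoder only covers code units outside the surrogate range.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Encode a single UTF-16 code unit (Basic Multilingual Plane, no surrogates)
// as UTF-8. Hypothetical helper for illustration only.
std::string encode_bmp_utf8(std::uint16_t unit) {
    std::string out;
    if (unit < 0x80) {
        out += static_cast<char>(unit);
    } else if (unit < 0x800) {
        out += static_cast<char>(0xC0 | (unit >> 6));
        out += static_cast<char>(0x80 | (unit & 0x3F));
    } else {
        out += static_cast<char>(0xE0 | (unit >> 12));
        out += static_cast<char>(0x80 | ((unit >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (unit & 0x3F));
    }
    return out;
}

// Pair up narrow bytes as little-endian 16-bit units, the way a char buffer
// would look if it were read as 16-bit wchar_t data, then encode each unit
// as UTF-8. A trailing odd byte is ignored in this sketch.
std::string misread_as_utf16(const std::string& bytes) {
    std::string out;
    for (std::size_t i = 0; i + 1 < bytes.size(); i += 2) {
        const std::uint16_t lo = static_cast<unsigned char>(bytes[i]);
        const std::uint16_t hi = static_cast<unsigned char>(bytes[i + 1]);
        out += encode_bmp_utf8(static_cast<std::uint16_t>(lo | (hi << 8)));
    }
    return out;
}
```

Under these assumptions, `misread_as_utf16("Fool")` yields six bytes (two three-byte sequences, for U+6F46 and U+6C6F) rather than the four ASCII bytes that were passed in.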

Answered by Yarkov Anton

It is easy if you use the C++11 standard (because there are a lot of additional includes like "utf8" which solve this problem forever).
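
As a sketch of the C++11 route, and assuming the "utf8" includes mentioned above refer to the standard `<codecvt>` facets (the answer does not name them), `std::wstring_convert` with `std::codecvt_utf8` can turn a wide string into UTF-8 bytes, which are then written in binary mode so that no further translation occurs. Both were deprecated in C++17, and with a 16-bit `wchar_t` (as on Windows) `std::codecvt_utf8<wchar_t>` only covers the Basic Multilingual Plane. The function names `to_utf8` and `write_utf8_file` are hypothetical.

```cpp
#include <codecvt>
#include <fstream>
#include <locale>
#include <string>

// Convert a wide string to UTF-8 bytes using the C++11 <codecvt> facets.
std::string to_utf8(const std::wstring& text) {
    std::wstring_convert<std::codecvt_utf8<wchar_t>> converter;
    return converter.to_bytes(text);
}

// Write the UTF-8 bytes unchanged; binary mode avoids newline translation.
bool write_utf8_file(const std::string& path, const std::wstring& text) {
    std::ofstream out(path, std::ios::out | std::ios::binary);
    if (!out) {
        return false;
    }
    const std::string bytes = to_utf8(text);
    out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
    return static_cast<bool>(out);
}
```

A call such as `write_utf8_file("Serialize.txt", L"Fool")` would then produce a file without a BOM; whether a BOM is wanted depends on the program that reads the file.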

But if you want to use multi-platform code with older standards, you can use this method to write with streams:

  1. Read the article about the UTF converter for streams
  2. Add stxutif.h to your project from the sources above
  3. Open the file in ANSI mode and add the BOM to the start of the file, like this:

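    std::ofstream fs;
    fs.open(filepath, std::ios::out|std::ios::binary);
    
    // UTF-8 byte order mark
    const unsigned char smarker[3] = { 0xEF, 0xBB, 0xBF };
    
    // The array is not null-terminated, so write exactly three bytes
    // instead of streaming it with operator<<.
    fs.write(reinterpret_cast<const char*>(smarker), sizeof(smarker));
    fs.close();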
    
  4. Then open the file as UTF and write your content there:

    std::wofstream fs;
    fs.open(filepath, std::ios::out|std::ios::app);
    
    std::locale utf8_locale(std::locale(), new utf8cvt<false>);
    fs.imbue(utf8_locale); 
    
    fs << .. // Write anything you want...
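
The two file-writing steps above can also be sketched without the external header, assuming the content is already UTF-8 encoded in a `std::string`: open the file once in binary mode, write the three BOM bytes, then write the payload unchanged. This is a simplified alternative, not the `utf8cvt` approach from the answer, and the names `write_utf8_with_bom` and `starts_with_utf8_bom` are hypothetical.

```cpp
#include <fstream>
#include <string>

// Write the UTF-8 byte order mark followed by already-encoded UTF-8 text.
bool write_utf8_with_bom(const std::string& path, const std::string& utf8) {
    std::ofstream out(path, std::ios::out | std::ios::binary);
    if (!out) {
        return false;
    }
    const char bom[3] = {'\xEF', '\xBB', '\xBF'};
    out.write(bom, sizeof(bom));
    out.write(utf8.data(), static_cast<std::streamsize>(utf8.size()));
    return static_cast<bool>(out);
}

// Check whether a file begins with the UTF-8 byte order mark.
bool starts_with_utf8_bom(const std::string& path) {
    std::ifstream in(path, std::ios::in | std::ios::binary);
    char head[3] = {0, 0, 0};
    in.read(head, sizeof(head));
    return in.gcount() == 3 && head[0] == '\xEF' && head[1] == '\xBB' &&
           head[2] == '\xBF';
}
```

Writing the BOM and the content through the same binary stream avoids reopening the file in append mode, at the cost of having to encode the text as UTF-8 beforehand.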
    