UnicodeString to char* (UTF-8)

Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Stack Overflow original: http://stackoverflow.com/questions/3150581/

Time: 2020-09-15 14:44:30  Source: igfitidea

c++ windows utf-8 internationalization icu

Asked by zfedsa

I am using the ICU library in C++ on OS X. All of my strings are UnicodeStrings, but I need to use system calls like fopen, fread and so forth. These functions take const char* or char* as arguments. I have read that OS X supports UTF-8 internally, so that all I need to do is convert my UnicodeString to UTF-8, but I don't know how to do that.

UnicodeString has a toUTF8() member function, but it returns a ByteSink. I've also found these examples: http://source.icu-project.org/repos/icu/icu/trunk/source/samples/ucnv/convsamp.cpp and read about using a converter, but I'm still confused. Any help would be much appreciated.

Accepted answer by Steven R. Loomis

Call UnicodeString::extract(...) to extract into a char*. Pass NULL for the converter to get the default converter (which uses the charset your OS will be using).

Answer by Map X

The ICU User Guide > UTF-8 page provides methods and descriptions for doing that.

The simplest way to use UTF-8 strings in UTF-16 APIs is via the C++ icu::UnicodeString methods fromUTF8(const StringPiece &utf8) and toUTF8String(StringClass &result). There is also toUTF8(ByteSink &sink).

And extract() is no longer the preferred approach.

Note: icu::UnicodeString has constructors, setTo() and extract() methods which take either a converter object or a charset name. These can be used for UTF-8, but are not as efficient or convenient as the fromUTF8()/toUTF8()/toUTF8String() methods mentioned above.

Answer by gsf

This will work:

std::string utf8;
uStr.toUTF8String(utf8);