C++: convert Unicode to char
Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me): Stack Overflow
Original: http://stackoverflow.com/questions/11040703/
Asked by JJunior
How can I convert a Unicode string to a char* or char* const in Embarcadero C++?
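Answered by Tuomas
String text = "Hello world";
AnsiString ansi(text);        // named object, so the buffer outlives this statement
char *txt = ansi.c_str();     // valid for as long as ansi is alive and unmodified
The older text.t_str() is now written as AnsiString(text).c_str(). Calling c_str() directly on a temporary, as in char *txt = AnsiString(text).c_str();, leaves txt dangling once the statement ends, so keep the AnsiString in a variable.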
Answered by bames53
"Unicode string" really isn't specific enough to know what your source data is, but you probably mean 'UTF-16 string stored as wchar_t array' since that's what most people who don't know the correct terminology use.
"char*" also isn't enough to know what you want to target, although maybe "embarcadero" has some convention. I'll just assume you want UTF-8 data unless you mention otherwise.
Also, I'll limit my example to what works in VS2010.
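// #include <codecvt>, <locale>, <string>
// Note: std::wstring_convert and <codecvt> are deprecated since C++17.

// your "Unicode" string
wchar_t const * utf16_string = L"Hello, World!";

// converts between UTF-16 in wchar_t and UTF-8 in char
std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>, wchar_t> convert;
std::string utf8_string = convert.to_bytes(utf16_string);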
This assumes that wchar_t strings are UTF-16, as is the case on Windows, but otherwise is portable code.
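If you would rather not depend on wchar_t being 16 bits wide, or on std::wstring_convert (deprecated since C++17), the conversion can also be written by hand. The following is only a minimal sketch, not part of the original answer: the function name utf16_to_utf8 is made up for illustration, it takes a std::u16string, and it assumes that throwing on an unpaired surrogate is acceptable error handling.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper (not from the answer): converts UTF-16 to UTF-8.
// Throws std::invalid_argument on an unpaired surrogate.
std::string utf16_to_utf8(const std::u16string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: must be followed by a low surrogate.
            if (i + 1 >= in.size() || in[i + 1] < 0xDC00 || in[i + 1] > 0xDFFF)
                throw std::invalid_argument("unpaired high surrogate");
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i; // the low surrogate has been consumed
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            throw std::invalid_argument("unpaired low surrogate");
        }
        // Encode the code point as 1 to 4 UTF-8 bytes.
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

Usage would look like std::string s = utf16_to_utf8(u"Hello, World!"); followed by s.c_str() wherever a char const* is needed. The pointer is only valid while s is alive and unmodified.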
Answered by Kerrek SB
You can legally reinterpret any array as an array of char. So if your Unicode data comes in 4-byte code units like
char32_t data[100];
then you can access it as a char array:
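// #include <cstddef>, <cstdio>
unsigned char const * p = reinterpret_cast<unsigned char const*>(data);
for (std::size_t i = 0; i != sizeof data; ++i)
{
    // unsigned char avoids sign extension for bytes >= 0x80
    std::printf("Byte %03zu is 0x%02X.\n", i, static_cast<unsigned>(p[i]));
}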
That way, you can examine the individual bytes of your Unicode data one by one.
(Of course, that has nothing to do with converting the encoding of your text. For that, use a library like iconv or ICU.)
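To make the byte-by-byte view easier to reuse, the same idea can be wrapped in a small function that returns the dump as a string. This is just a sketch under my own assumptions: the name hex_bytes is hypothetical, and the order of the bytes in the output depends on the endianness of the machine it runs on.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: formats the object representation of a char32_t
// buffer as space-separated hex bytes, in memory order.
std::string hex_bytes(const char32_t* data, std::size_t count)
{
    // unsigned char avoids sign extension when a byte is >= 0x80.
    const unsigned char* p = reinterpret_cast<const unsigned char*>(data);
    std::string out;
    for (std::size_t i = 0; i != count * sizeof(char32_t); ++i) {
        char buf[4];
        std::snprintf(buf, sizeof buf, "%02X", static_cast<unsigned>(p[i]));
        if (i != 0) out += ' ';
        out += buf;
    }
    return out;
}
```

On a little-endian machine the code unit U+20AC would show its low byte first, on a big-endian machine last. Either way these are the raw bytes of the UTF-32 code units, not a UTF-8 encoding of the text.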
Answered by AWalkmen
If you work with Windows:
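// #include <windows.h>, <iostream>, <string>
std::u16string utext = u"объява";
char text[0x100];
// On Windows wchar_t is 16 bits wide, so UTF-16 char16_t data can be passed as wchar_t.
// dwFlags is 0, and the output size is the buffer size in bytes (not -1).
WideCharToMultiByte(CP_UTF8, 0, reinterpret_cast<const wchar_t*>(utext.c_str()), -1,
                    text, sizeof text, NULL, NULL);
std::cout << text;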
We can't use std::wstring_convert here because it is not available in MinGW 4.9.2.