Output Unicode strings in a Windows console app (C++)
Source: http://stackoverflow.com/questions/2492077/
Note: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not me) on Stack Overflow.
Asked by Andrew
Hi, I was trying to output a Unicode string to a console with iostreams and failed.
I found this: "Using unicode font in c++ console app", and this snippet works.
SetConsoleOutputCP(CP_UTF8);
wchar_t s[] = L"èéøÞǽлљΣæča";
int bufferSize = WideCharToMultiByte(CP_UTF8, 0, s, -1, NULL, 0, NULL, NULL);
char* m = new char[bufferSize];
WideCharToMultiByte(CP_UTF8, 0, s, -1, m, bufferSize, NULL, NULL);
wprintf(L"%S", m);
However, I did not find any way to output Unicode correctly with iostreams. Any suggestions?
This does not work:
SetConsoleOutputCP(CP_UTF8);
utf8_locale = locale(old_locale,new boost::program_options::detail::utf8_codecvt_facet());
wcout.imbue(utf8_locale);
wcout << L"¡Hola!" << endl;
EDIT: I could not find any solution other than wrapping this snippet in a stream operator. I hope somebody has a better idea.
//Unicode output for a Windows console
ostream &operator-(ostream &stream, const wchar_t *s)
{
int bufSize = WideCharToMultiByte(CP_UTF8, 0, s, -1, NULL, 0, NULL, NULL);
char *buf = new char[bufSize];
WideCharToMultiByte(CP_UTF8, 0, s, -1, buf, bufSize, NULL, NULL);
wprintf(L"%S", buf);
delete[] buf;
return stream;
}
ostream &operator-(ostream &stream, const wstring &s)
{
stream - s.c_str();
return stream;
}
Answered by DuckMaestro
I have verified a solution using Visual Studio 2010, based on an MSDN article and an MSDN blog post. The trick is an obscure call to _setmode(..., _O_U16TEXT).
Solution:
#include <iostream>
#include <io.h>
#include <fcntl.h>
int wmain(int argc, wchar_t* argv[])
{
_setmode(_fileno(stdout), _O_U16TEXT);
std::wcout << L"Testing unicode -- English -- Ελληνικά -- Español." << std::endl;
}
Answered by David
Unicode Hello World in Chinese
Here is a Hello World in Chinese. Actually it is just "Hello" (你好). I tested this on Windows 10, but I think it should work on Windows Vista and later. Before Windows Vista it will be hard if you want a programmatic solution rather than configuring the console, the registry, etc. If you really need to do this on Windows 7, maybe have a look at "Change console Font Windows 7".
I don't want to claim this is the only solution, but this is what worked for me.
Outline
- Unicode project setup
- Set the console codepage to Unicode
- Find and use a font that supports the characters you want to display
- Use the locale of the language you want to display
- Use wide character output, i.e. std::wcout
1. Project Setup
I am using Visual Studio 2017 CE. I created a blank console app. The default settings are all right, but if you experience problems or use a different IDE, you might want to check these:
In your project properties, find Configuration Properties -> General -> Project Defaults -> Character Set. It should be "Use Unicode Character Set", not "Multi-Byte". This will define the _UNICODE and UNICODE preprocessor macros for you.
int wmain(int argc, wchar_t* argv[])
Also, I think we should use the wmain function instead of main. They both work, but in a Unicode environment wmain may be more convenient.
Also, my source files are UTF-16-LE encoded, which seems to be the default in Visual Studio 2017.
2. Console Codepage
This is quite obvious: we need a Unicode codepage in the console. If you want to check your default codepage, just open a console and type chcp without any arguments. We have to change it to 65001, which is the UTF-8 codepage (see "Windows Codepage Identifiers"). There is a preprocessor macro for that codepage: CP_UTF8.
I needed to set both the input and the output codepage. When I omitted either one, the output was incorrect.
SetConsoleOutputCP(CP_UTF8);
SetConsoleCP(CP_UTF8);
You might also want to check the boolean return values of those functions.
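A hedged sketch of checking those return values, which also puts the previous code pages back when the scope ends. The class name ConsoleUtf8Scope is made up for this example; GetConsoleOutputCP and GetConsoleCP are the Win32 getters that match the two setters. Outside Windows the class compiles to a no-op so the snippet stays portable.

```cpp
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper (not from the answer): switches the console to UTF-8
// for the lifetime of the object and restores the old code pages afterwards.
class ConsoleUtf8Scope {
public:
    ConsoleUtf8Scope()
    {
#ifdef _WIN32
        old_output_cp_ = GetConsoleOutputCP(); // 0 means "no console / failure"
        old_input_cp_ = GetConsoleCP();
        // Both setters return a nonzero BOOL on success.
        active_ = SetConsoleOutputCP(CP_UTF8) != 0 && SetConsoleCP(CP_UTF8) != 0;
#endif
    }

    ~ConsoleUtf8Scope()
    {
#ifdef _WIN32
        if (old_output_cp_ != 0) SetConsoleOutputCP(old_output_cp_);
        if (old_input_cp_ != 0) SetConsoleCP(old_input_cp_);
#endif
    }

    ConsoleUtf8Scope(const ConsoleUtf8Scope&) = delete;
    ConsoleUtf8Scope& operator=(const ConsoleUtf8Scope&) = delete;

    // True only if both code pages were switched to UTF-8.
    bool active() const { return active_; }

private:
    unsigned int old_output_cp_ = 0;
    unsigned int old_input_cp_ = 0;
    bool active_ = false;
};
```

One instance created at the top of wmain is enough; if active() returns false, the console did not accept UTF-8 and the output may be garbled.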
3. Choose a Font
So far I haven't found a console font that supports every character, so I had to choose one. If some of the characters you want to output are only available in one font and others only in another, then I believe there is no solution, unless there is a font out there that supports every character. I also didn't look into how to install a font.
I think it is not possible to use two different fonts in the same console window at the same time.
How to find a compatible font? Open your console and go to the properties of the console window by clicking on the icon in the upper left of the window. Go to the Font tab, choose a font, and click OK. Then try to enter your characters in the console window. Repeat this until you find a font you can work with, then note down the name of the font.
You can also change the size of the font in the properties window. Once you have found a size you are happy with, note down the size values displayed in the "Selected Font" section of the properties window. It shows the width and height in pixels.
To actually set the font programmatically, use:
CONSOLE_FONT_INFOEX fontInfo;
// ... configure fontInfo
SetCurrentConsoleFontEx(hConsole, false, &fontInfo);
See my example at the end of this answer for details, or look it up in the fine manual: SetCurrentConsoleFontEx. This function has only existed since Windows Vista.
4. Set the locale
You will need to set the locale to that of the language whose characters you want to print.
char* a = setlocale(LC_ALL, "chinese");
The return value is interesting. It contains a string describing exactly which locale was chosen. Just give it a try :-) I tested with chinese and german.
More info: setlocale
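As a minimal sketch of inspecting that return value: setlocale returns a null pointer when the requested locale is not available, so it is worth checking before relying on it. The helper name try_set_locale is made up for this illustration. Names such as "chinese" are specific to the Microsoft CRT, so the usage note below sticks to the portable "C" locale.

```cpp
#include <cassert>
#include <clocale>
#include <string>

// Hypothetical helper (not from the answer): tries to switch the C runtime
// locale and returns the name the runtime actually selected, or an empty
// string if the requested locale is not available.
std::string try_set_locale(const char* name)
{
    assert(name != nullptr);
    const char* chosen = std::setlocale(LC_ALL, name);
    return chosen ? std::string(chosen) : std::string();
}
```

Calling try_set_locale("C") gives back "C". For a name like "chinese" the returned text is whatever full locale name the runtime resolved, so it depends on the system and CRT version.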
5. Use wide character output
Not much to say here. If you want to output wide characters, use this for example:
std::wcout << L"你好" << std::endl;
Oh, and don't forget the L prefix for wide character literals!
And if you type literal Unicode characters like this in the source file, the source file must be saved in a Unicode encoding, such as UTF-16-LE, which seems to be the default in Visual Studio. Or maybe use Notepad++ and set the encoding to UCS-2 LE BOM.
Example
Finally, I put it all together as an example:
#include <Windows.h>
#include <iostream>
#include <io.h>
#include <fcntl.h>
#include <locale.h>
#include <wincon.h>
int wmain(int argc, wchar_t* argv[])
{
SetConsoleTitle(L"My Console Window - 你好");
HANDLE hConsole = GetStdHandle(STD_OUTPUT_HANDLE);
char* a = setlocale(LC_ALL, "chinese");
SetConsoleOutputCP(CP_UTF8);
SetConsoleCP(CP_UTF8);
CONSOLE_FONT_INFOEX fontInfo;
fontInfo.cbSize = sizeof(fontInfo);
fontInfo.FontFamily = 54;
fontInfo.FontWeight = 400;
fontInfo.nFont = 0;
const wchar_t myFont[] = L"KaiTi";
fontInfo.dwFontSize = { 18, 41 };
std::copy(myFont, myFont + (sizeof(myFont) / sizeof(wchar_t)), fontInfo.FaceName);
SetCurrentConsoleFontEx(hConsole, false, &fontInfo);
std::wcout << L"Hello World!" << std::endl;
std::wcout << L"你好!" << std::endl;
return 0;
}
Cheers!
Answered by Puppy
The wcout must have the locale set differently to the CRT. Here's how it can be fixed:
int _tmain(int argc, _TCHAR* argv[])
{
char* locale = setlocale(LC_ALL, "English"); // Get the CRT's current locale.
std::locale lollocale(locale);
setlocale(LC_ALL, locale); // Restore the CRT.
std::wcout.imbue(lollocale); // Now set the std::wcout to have the locale that we got from the CRT.
std::wcout << L"¡Hola!";
std::cin.get();
return 0;
}
I just tested it, and it displays the string here absolutely fine.
Answered by Henrik Haftmann
SetConsoleCP() and chcp do not do the same thing!
Take this program snippet:
SetConsoleCP(65001); // 65001 = UTF-8
static const char s[]="tränenüberströmt™\n";
DWORD slen=lstrlen(s);
WriteConsoleA(GetStdHandle(STD_OUTPUT_HANDLE),s,slen,&slen,NULL);
The source code must be saved as UTF-8 without BOM (Byte Order Mark; Signature). Then the Microsoft compiler cl.exe takes the UTF-8 strings as-is.
If this code is saved with BOM, cl.exe transcodes the string to ANSI (i.e. CP1252), which doesn't match CP65001 (= UTF-8).
Change the display font to Lucida Console; otherwise, UTF-8 output will not work at all.
- Type: chcp
- Answer: 850
- Type: test.exe
- Answer: tr├ñnen├╝berstr├ÂmtÔäó
- Type: chcp
- Answer: 65001
- This setting has been changed by SetConsoleCP(), but with no useful effect.
- Type: chcp 65001
- Type: test.exe
- Answer: tränenüberströmt™
- All OK now.
Tested with: German Windows XP SP3
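The garbled first run can be reproduced on paper: the program emits UTF-8 bytes, and a console still on code page 850 shows each byte as a separate CP850 character. The sketch below is only an illustration with a hand-written, deliberately partial CP850 table covering the non-ASCII bytes of ä, ö and ü. The function name cp850_view is an assumption, not part of any API.

```cpp
#include <map>
#include <string>

// Illustration only: shows how a console using code page 850 would display
// a UTF-8 byte sequence. The table is deliberately partial and covers just
// the bytes needed for ä, ö and ü; any other non-ASCII byte becomes '?'.
std::string cp850_view(const std::string& utf8_bytes)
{
    // CP850 byte -> the glyph it displays, written as UTF-8 escapes.
    static const std::map<unsigned char, std::string> cp850 = {
        {0xC3, "\xE2\x94\x9C"}, // box-drawing U+251C; lead byte of ä, ö, ü
        {0xA4, "\xC3\xB1"},     // n with tilde U+00F1; second byte of ä
        {0xB6, "\xC3\x82"},     // A with circumflex U+00C2; second byte of ö
        {0xBC, "\xE2\x95\x9D"}, // box-drawing U+255D; second byte of ü
    };

    std::string shown;
    for (unsigned char byte : utf8_bytes) {
        if (byte < 0x80) {
            shown += static_cast<char>(byte); // ASCII is the same in both
        } else {
            auto it = cp850.find(byte);
            shown += (it != cp850.end()) ? it->second : std::string("?");
        }
    }
    return shown;
}
```

For ü, which is the two bytes 0xC3 0xBC in UTF-8, cp850_view("\xC3\xBC") yields the two box-drawing characters ├╝ that appear in the first test.exe run above.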
Answered by call me Steve
I don't think there is an easy answer. Looking at "Console Code Pages" and "SetConsoleCP Function", it seems that you will need to set up an appropriate codepage for the character set you're going to output.
Answered by newtover
Recently I wanted to stream Unicode from Python to the Windows console, and here is the minimum I needed to do:
- You should set the console font to one that covers Unicode symbols. There is not a wide choice: Console properties > Font > Lucida Console
- You should change the current console codepage: run chcp 65001 in the console or use the corresponding method in the C++ code
- Write to the console using WriteConsoleW
Look through an interesting article about Java Unicode on the Windows console.
Besides, in Python you cannot write to the default sys.stdout in this case; you will need to substitute it with something that uses os.write(1, binarystring) or a direct call to a wrapper around WriteConsoleW. It seems that in C++ you will need to do the same.
Answered by Afriza N. Arief
First, sorry, I probably don't have the required fonts, so I cannot test it yet.
Something looks a bit fishy here:
// the following is said to be working
SetConsoleOutputCP(CP_UTF8); // output is in UTF8
wchar_t s[] = L"èéøÞǽлљΣæča";
int bufferSize = WideCharToMultiByte(CP_UTF8, 0, s, -1, NULL, 0, NULL, NULL);
char* m = new char[bufferSize];
WideCharToMultiByte(CP_UTF8, 0, s, -1, m, bufferSize, NULL, NULL);
wprintf(L"%S", m); // <-- upper case %S in wprintf() is used for MultiByte/utf-8
// lower case %s in wprintf() is used for WideChar
printf("%s", m); // <-- does this work as well? try it to verify my assumption
while
// the following is said to have problem
SetConsoleOutputCP(CP_UTF8);
utf8_locale = locale(old_locale,
new boost::program_options::detail::utf8_codecvt_facet());
wcout.imbue(utf8_locale);
wcout << L"¡Hola!" << endl; // <-- you are passing wide char.
// have you tried passing the multibyte equivalent by converting to utf8 first?
int bufferSize = WideCharToMultiByte(CP_UTF8, 0, s, -1, NULL, 0, NULL, NULL);
char* m = new char[bufferSize];
WideCharToMultiByte(CP_UTF8, 0, s, -1, m, bufferSize, NULL, NULL);
cout << m << endl;
What about:
// without setting locale to UTF8, you pass WideChars
wcout << L"¡Hola!" << endl;
// set locale to UTF8 and use cout
SetConsoleOutputCP(CP_UTF8);
cout << utf8_encoded_by_converting_using_WideCharToMultiByte << endl;
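The "convert to UTF-8 first, then use cout" idea can also be sketched without WideCharToMultiByte. The functions below are an illustration under stated assumptions, not this answer's code: they hand-encode a std::wstring as UTF-8, treating wchar_t as UTF-16 where it is 2 bytes (Windows) and as UTF-32 where it is 4 bytes (typical Linux), and they replace unpaired surrogates with U+FFFD. The names append_utf8 and wide_to_utf8 are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Appends one Unicode code point to `out` as UTF-8 (1 to 4 bytes).
void append_utf8(std::string& out, std::uint32_t cp)
{
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Converts a wide string to UTF-8. Surrogate pairs are combined, which
// matters where wchar_t is 16-bit; with 32-bit wchar_t they normally
// never appear and each element is already a full code point.
std::string wide_to_utf8(const std::wstring& wide)
{
    std::string out;
    for (std::size_t i = 0; i < wide.size(); ++i) {
        std::uint32_t cp = static_cast<std::uint32_t>(wide[i]);
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < wide.size()) {
            const std::uint32_t low = static_cast<std::uint32_t>(wide[i + 1]);
            if (low >= 0xDC00 && low <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
                ++i; // the low surrogate has been consumed
            }
        }
        if ((cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF) {
            cp = 0xFFFD; // unpaired surrogate or out-of-range value
        }
        append_utf8(out, cp);
    }
    return out;
}
```

With this in place, the last suggestion becomes cout << wide_to_utf8(L"¡Hola!") << endl; after SetConsoleOutputCP(CP_UTF8). Whether the console then renders it correctly still depends on the font and the CRT, as the other answers point out.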
Answered by Joma
Default encoding on:
- Windows: UTF-16.
- Linux: UTF-8.
- macOS: UTF-8.
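To make "Windows: UTF-16" concrete: a code point above U+FFFF needs two 16-bit code units (a surrogate pair), which is why one visible character can occupy two wchar_t elements on Windows. A small sketch follows, using char16_t so that it behaves the same on every platform; the function name to_utf16 is an assumption for illustration.

```cpp
#include <cassert>
#include <string>

// Encodes a single Unicode code point as UTF-16 code units.
// Code points up to U+FFFF take one unit, higher ones a surrogate pair.
std::u16string to_utf16(char32_t cp)
{
    // Precondition: a valid scalar value (in range and not itself a surrogate).
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));

    std::u16string units;
    if (cp < 0x10000) {
        units += static_cast<char16_t>(cp);
    } else {
        const char32_t v = cp - 0x10000;                       // 20 bits remain
        units += static_cast<char16_t>(0xD800 + (v >> 10));    // high surrogate
        units += static_cast<char16_t>(0xDC00 + (v & 0x3FF));  // low surrogate
    }
    return units;
}
```

For example, U+4F60 (你) fits in one unit, while U+1F600 becomes the pair 0xD83D 0xDE00.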
My solution steps, which handle embedded null characters (\0) so that strings are not truncated, without using functions from the windows.h header:
- Add macros to detect the platform.
#if defined (_WIN32)
#define WINDOWSLIB 1
#elif defined (__ANDROID__) || defined(ANDROID)//Android
#define ANDROIDLIB 1
#elif defined (__APPLE__)//iOS, Mac OS
#define MACOSLIB 1
#elif defined (__LINUX__) || defined(__gnu_linux__) || defined(__linux__)//_Ubuntu - Fedora - Centos - RedHat
#define LINUXLIB 1
#endif
- Create conversion functions from std::wstring to std::string and vice versa.
#include <locale>
#include <iostream>
#include <string>
#ifdef WINDOWSLIB
#include <Windows.h>
#endif
using namespace std::literals::string_literals;
// Convert std::wstring to std::string
std::string WidestringToString(const std::wstring& wstr, const std::string& locale)
{
if (wstr.empty())
{
return std::string();
}
size_t pos;
size_t begin = 0;
std::string ret;
size_t size;
#ifdef WINDOWSLIB
_locale_t lc = _create_locale(LC_ALL, locale.c_str());
pos = wstr.find(static_cast<wchar_t>(0), begin);
while (pos != std::wstring::npos && begin < wstr.length())
{
std::wstring segment = std::wstring(&wstr[begin], pos - begin);
_wcstombs_s_l(&size, nullptr, 0, &segment[0], _TRUNCATE, lc);
std::string converted = std::string(size, 0);
_wcstombs_s_l(&size, &converted[0], size, &segment[0], _TRUNCATE, lc);
ret.append(converted);
begin = pos + 1;
pos = wstr.find(static_cast<wchar_t>(0), begin);
}
if (begin <= wstr.length()) {
std::wstring segment = std::wstring(&wstr[begin], wstr.length() - begin);
_wcstombs_s_l(&size, nullptr, 0, &segment[0], _TRUNCATE, lc);
std::string converted = std::string(size, 0);
_wcstombs_s_l(&size, &converted[0], size, &segment[0], _TRUNCATE, lc);
converted.resize(size - 1);
ret.append(converted);
}
_free_locale(lc);
#elif defined LINUXLIB
std::string currentLocale = setlocale(LC_ALL, nullptr);
setlocale(LC_ALL, locale.c_str());
pos = wstr.find(static_cast<wchar_t>(0), begin);
while (pos != std::wstring::npos && begin < wstr.length())
{
std::wstring segment = std::wstring(&wstr[begin], pos - begin);
size = wcstombs(nullptr, segment.c_str(), 0);
std::string converted = std::string(size, 0);
wcstombs(&converted[0], segment.c_str(), converted.size());
ret.append(converted);
ret.append({ 0 });
begin = pos + 1;
pos = wstr.find(static_cast<wchar_t>(0), begin);
}
if (begin <= wstr.length()) {
std::wstring segment = std::wstring(&wstr[begin], wstr.length() - begin);
size = wcstombs(nullptr, segment.c_str(), 0);
std::string converted = std::string(size, 0);
wcstombs(&converted[0], segment.c_str(), converted.size());
ret.append(converted);
}
setlocale(LC_ALL, currentLocale.c_str());
#elif defined MACOSLIB
#endif
return ret;
}
// Convert std::string to std::wstring
std::wstring StringToWideString(const std::string& str, const std::string& locale)
{
if (str.empty())
{
return std::wstring();
}
size_t pos;
size_t begin = 0;
std::wstring ret;
size_t size;
#ifdef WINDOWSLIB
_locale_t lc = _create_locale(LC_ALL, locale.c_str());
pos = str.find(static_cast<char>(0), begin);
while (pos != std::string::npos) {
std::string segment = std::string(&str[begin], pos - begin);
std::wstring converted = std::wstring(segment.size() + 1, 0);
_mbstowcs_s_l(&size, &converted[0], converted.size(), &segment[0], _TRUNCATE, lc);
converted.resize(size - 1);
ret.append(converted);
ret.append({ 0 });
begin = pos + 1;
pos = str.find(static_cast<char>(0), begin);
}
if (begin < str.length()) {
std::string segment = std::string(&str[begin], str.length() - begin);
std::wstring converted = std::wstring(segment.size() + 1, 0);
_mbstowcs_s_l(&size, &converted[0], converted.size(), &segment[0], _TRUNCATE, lc);
converted.resize(size - 1);
ret.append(converted);
}
_free_locale(lc);
#elif defined LINUXLIB
std::string currentLocale = setlocale(LC_ALL, nullptr);
setlocale(LC_ALL, locale.c_str());
pos = str.find(static_cast<char>(0), begin);
while (pos != std::string::npos) {
std::string segment = std::string(&str[begin], pos - begin);
std::wstring converted = std::wstring(segment.size(), 0);
size = mbstowcs(&converted[0], &segment[0], converted.size());
converted.resize(size);
ret.append(converted);
ret.append({ 0 });
begin = pos + 1;
pos = str.find(static_cast<char>(0), begin);
}
if (begin < str.length()) {
std::string segment = std::string(&str[begin], str.length() - begin);
std::wstring converted = std::wstring(segment.size(), 0);
size = mbstowcs(&converted[0], &segment[0], converted.size());
converted.resize(size);
ret.append(converted);
}
setlocale(LC_ALL, currentLocale.c_str());
#elif defined MACOSLIB
#endif
return ret;
}
- Print std::string. Check "RawString Suffix".
Linux code. Print a std::string directly using std::cout.
If you have a std::wstring:
1. Convert it to std::string.
2. Print it with std::cout.
std::wstring x = L"\0\001日本ABC\0DE\0F\0G\0"s;
std::string result = WidestringToString(x, "en_US.UTF-8");
std::cout << "RESULT=" << result << std::endl;
std::cout << "RESULT_SIZE=" << result.size() << std::endl;
On Windows, if you need to print Unicode, use WriteConsole to print the Unicode characters of a std::wstring or std::string.
// Defined first because WriteUnicodeLine calls it.
void WriteUnicode(const std::string& s)
{
#ifdef WINDOWSLIB
    // Locale name taken from the usage examples; adjust it for your system.
    std::wstring unicode = StringToWideString(s, "en_US.UTF-8");
    WriteConsoleW(GetStdHandle(STD_OUTPUT_HANDLE), unicode.c_str(), static_cast<DWORD>(unicode.length()), nullptr, nullptr);
#elif defined LINUXLIB
    std::cout << s;
#elif defined MACOSLIB
#endif
}
void WriteUnicodeLine(const std::string& s)
{
#ifdef WINDOWSLIB
    WriteUnicode(s);
    std::cout << std::endl;
#elif defined LINUXLIB
    std::cout << s << std::endl;
#elif defined MACOSLIB
#endif
}
void WriteUnicodeW(const std::wstring& ws)
{
#ifdef WINDOWSLIB
    WriteConsoleW(GetStdHandle(STD_OUTPUT_HANDLE), ws.c_str(), static_cast<DWORD>(ws.length()), nullptr, nullptr);
#elif defined LINUXLIB
    std::cout << WidestringToString(ws, "en_US.UTF-8");
#elif defined MACOSLIB
#endif
}
void WriteUnicodeLineW(const std::wstring& ws)
{
#ifdef WINDOWSLIB
    WriteUnicodeW(ws);
    std::cout << std::endl;
#elif defined LINUXLIB
    std::cout << WidestringToString(ws, "en_US.UTF-8") << std::endl;
#elif defined MACOSLIB
#endif
}
Windows code. Use the WriteUnicodeLine or WriteUnicode function. The same code can be used on Linux.
std::wstring x = L"\0\001日本ABC\0DE\0F\0G\0"s;
std::string result = WidestringToString(x, "en_US.UTF-8");
WriteUnicodeLine(u8"RESULT" + result);
WriteUnicodeLine(u8"RESULT_SIZE" + std::to_string(result.size()));
Finally, on Windows you need powerful and complete support for Unicode characters in the console. I recommend ConEmu, set as the default terminal on Windows.
Tested on Microsoft Visual Studio and JetBrains CLion.
- Tested on Microsoft Visual Studio 2017 with VC++; std=c++17. (Windows Project)
- Tested on Microsoft Visual Studio 2017 with g++; std=c++17. (Linux Project)
- Tested on JetBrains CLion 2018.3 with g++; std=c++17. (Linux Toolchain / Remote)
QA
Q. Why do you not use the <codecvt> header functions and classes?
A. They are deprecated (see "Removed or deprecated features"): impossible to build on VC++, but no problems on g++. I prefer 0 warnings and headaches.
Q. Is std::wstring cross-platform?
A. No. std::wstring uses wchar_t elements. On Windows, wchar_t is 2 bytes and each character is stored in UTF-16 code units; a character above U+FFFF is represented by two UTF-16 code units (2 wchar_t elements) called a surrogate pair. On Linux, wchar_t is 4 bytes and each character is stored in one wchar_t element, so no surrogate pairs are needed. Check "Standard data types on UNIX, Linux, and Windows".
Q. Is std::string cross-platform?
A. Yes. std::string uses char elements. The char type is guaranteed to have the same size in all compilers: 1 byte. Check "Standard data types on UNIX, Linux, and Windows".
Answered by Victor Gubin
There are a few issues with msvcrt and iostreams.
- The _setmode(_fileno(stdout), _O_U16TEXT); trick works only with MS VC++, not MinGW-GCC. Moreover, it sometimes leads to crashes, depending on the Windows configuration.
- SetConsoleCP(65001) for UTF-8 may fail in many multibyte character scenarios, but it is always OK for UTF-16LE.
- You need to restore the previous console codepage on application exit.
The Windows console supports Unicode through the ReadConsole and WriteConsole functions in UTF-16LE mode. The side effect is that piping will not work in this case, i.e. myapp.exe >> ret.log produces a 0-byte ret.log file. If you are OK with this, you can try my library as follows.
const char* umessage = "Hello!\nПривет!\nПривіт!\nΧαιρετίσματα!\nHelló!\nHallå!\n";
...
#include <console.hpp>
#include <ios>
...
std::ostream& cout = io::console::out_stream();
cout << umessage
<< 1234567890ull << '\n'
<< 123456.78e+09 << '\n'
<< 12356.789e+10L << '\n'
<< std::hex << 0xCAFEBABE
<< std::endl;
The library will auto-convert your UTF-8 into UTF-16LE and write it to the console using WriteConsole. There are also error and input streams. Another benefit of the library is colors.
Link to the example app: https://github.com/incoder1/IO/tree/master/examples/iostreams
The library homepage: https://github.com/incoder1/IO
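The piping limitation comes from WriteConsole only accepting real console handles. One common workaround, sketched below under the assumption that the program holds its text as UTF-8, is to ask GetConsoleMode whether stdout is a console and to fall back to writing the raw UTF-8 bytes when it is not. The function name write_utf8 is hypothetical, and this is not a description of how the IO library above is implemented.

```cpp
#include <cstdio>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: writes UTF-8 text to stdout. On a real Windows
// console it converts to UTF-16 and uses WriteConsoleW; when stdout is
// redirected (file or pipe), or on other platforms, it writes the bytes
// unchanged. Returns true if everything was written.
bool write_utf8(const std::string& text)
{
#ifdef _WIN32
    HANDLE out = GetStdHandle(STD_OUTPUT_HANDLE);
    DWORD mode = 0;
    // GetConsoleMode succeeds only for console handles, not files or pipes.
    if (out != nullptr && out != INVALID_HANDLE_VALUE && GetConsoleMode(out, &mode)) {
        if (text.empty()) {
            return true;
        }
        std::fflush(stdout); // keep ordering with earlier stdio output
        const int wide_len = MultiByteToWideChar(
            CP_UTF8, 0, text.data(), static_cast<int>(text.size()), nullptr, 0);
        if (wide_len <= 0) {
            return false;
        }
        std::wstring wide(static_cast<std::size_t>(wide_len), L'\0');
        MultiByteToWideChar(
            CP_UTF8, 0, text.data(), static_cast<int>(text.size()), &wide[0], wide_len);
        DWORD written = 0;
        return WriteConsoleW(out, wide.data(), static_cast<DWORD>(wide.size()),
                             &written, nullptr) != 0;
    }
#endif
    // Redirected output or non-Windows platform: pass the bytes through.
    return std::fwrite(text.data(), 1, text.size(), stdout) == text.size();
}
```

With this split, myapp.exe >> ret.log receives the UTF-8 bytes instead of an empty file, while interactive runs still go through WriteConsoleW.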
Answered by mr calendar
I had a similar problem. "Output Unicode to console Using C++, in Windows" contains the gem that you need to run chcp 65001 in the console before running your program.
There may be some way of doing this programmatically, but I don't know what it is.