How do I print UTF-8 from a C++ console application on Windows?
Original question: http://stackoverflow.com/questions/1371012/
This page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
Asked by Paul Dixon
For a C++ console application compiled with Visual Studio 2008 on English Windows (XP, Vista or 7), is it possible to print to the console and correctly display UTF-8 encoded Japanese using cout or wcout?
Answered by dtb
By default, the Windows console uses the OEM code page to display output.
To change the code page to UTF-8, enter chcp 65001 in the console, or try to change the code page programmatically with SetConsoleOutputCP.
Note that you will probably have to change the console font to one that has glyphs in the Unicode range you need.
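The programmatic route through SetConsoleOutputCP could look roughly like the sketch below. The names Utf8ConsoleScope, japanese_sample and print_sample are hypothetical, made up for this illustration, and the code assumes a C++11 compiler (newer than the Visual Studio 2008 mentioned in the question). It switches the output code page to UTF-8, remembers the previous one with GetConsoleOutputCP, and restores it on scope exit. Whether the glyphs actually render still depends on the console font.

```cpp
#include <cstdio>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: switches the console output code page to UTF-8 for
// its lifetime and restores the previous one afterwards. On non-Windows
// builds it does nothing.
class Utf8ConsoleScope {
public:
    Utf8ConsoleScope() {
#ifdef _WIN32
        previous_ = GetConsoleOutputCP();
        changed_ = SetConsoleOutputCP(CP_UTF8) != 0;  // CP_UTF8 is 65001
#endif
    }
    ~Utf8ConsoleScope() {
#ifdef _WIN32
        if (changed_) SetConsoleOutputCP(previous_);
#endif
    }
    Utf8ConsoleScope(const Utf8ConsoleScope&) = delete;
    Utf8ConsoleScope& operator=(const Utf8ConsoleScope&) = delete;

    // True only if the code page was actually switched.
    bool changed() const { return changed_; }

private:
    unsigned int previous_ = 0;
    bool changed_ = false;
};

// The word "日本語" spelled as explicit UTF-8 bytes, so the result does not
// depend on how the source file itself is encoded.
inline std::string japanese_sample() {
    return "\xE6\x97\xA5\xE6\x9C\xAC\xE8\xAA\x9E";
}

// Prints the sample while the UTF-8 output code page is active.
inline void print_sample() {
    Utf8ConsoleScope scope;
    std::fputs((japanese_sample() + "\n").c_str(), stdout);
}
```

Calling print_sample() from main is enough to try it. Restoring the previous code page is a design choice for this sketch: the setting belongs to the console rather than the process, so leaving it changed can affect programs run later in the same window.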
Answered by sbi
Here's an article from MVP Michael Kaplan on how to correctly output UTF-16 through the console. You could convert your UTF-8 to UTF-16 and output that.
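To make the UTF-8 to UTF-16 step concrete, here is a small portable decoder. It is only a sketch: utf8_to_utf16 is a hypothetical helper name, and the decoder is deliberately simplified (malformed sequences become U+FFFD, and overlong encodings are not rejected). On Windows, MultiByteToWideChar with CP_UTF8 performs this conversion for you, and the resulting UTF-16 can be written to the console with WriteConsoleW.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: decodes UTF-8 bytes into UTF-16 code units.
// Simplified on purpose: malformed input yields U+FFFD, and overlong
// encodings are not detected.
inline std::u16string utf8_to_utf16(const std::string& in) {
    const char16_t kReplacement = 0xFFFD;
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else {
            // Stray continuation byte or invalid lead byte.
            out.push_back(kReplacement);
            ++i;
            continue;
        }
        ++i;

        bool ok = true;
        for (std::size_t k = 0; k < extra; ++k) {
            if (i >= in.size() ||
                (static_cast<unsigned char>(in[i]) & 0xC0) != 0x80) {
                ok = false;  // truncated or broken sequence
                break;
            }
            cp = (cp << 6) | (static_cast<unsigned char>(in[i]) & 0x3F);
            ++i;
        }
        if (!ok || cp > 0x10FFFF) {
            out.push_back(kReplacement);
            continue;
        }

        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above the BMP need a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

Because wchar_t is 16 bits wide on Windows, the resulting code units can be copied into a std::wstring there before being handed to a wide-character output function.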
Answered by Slav
This should work:
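#include <cstdio>
#include <windows.h>
// MSVC-specific: encode narrow string literals as UTF-8 in the compiled program
#pragma execution_character_set( "utf-8" )
int main()
{
    SetConsoleOutputCP( 65001 ); // switch the console output code page to UTF-8
    printf( "Testing unicode -- English -- Ελληνικά -- Español -- Русский. aäbcdefghijklmnoöpqrsßtuüvwxyz\n" );
}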
I don't know if it affects anything, but the source file is saved as Unicode (UTF-8 with signature) - Codepage 65001 at FILE -> Advanced Save Options ....
Project -> Properties -> Configuration Properties -> General -> Character Set is set to Use Unicode Character Set.
Some say you need to change the console font to Lucida Console, but on my side the text displays correctly with both Consolas and Lucida Console.
Answered by ijprest
I've never actually tried setting the console code page to UTF-8 (not sure why it wouldn't work... the console can handle other multi-byte code pages just fine), but there are a couple of functions to look up: SetConsoleCP and SetConsoleOutputCP.
You'll probably also need to make sure you're using a console font that is capable of displaying your characters. There's the SetCurrentConsoleFontEx function, but it's only available on Vista and above.
Hope that helps.
Answered by Cédric Françoys
Just for additional information:
'ANSI' refers to windows-125x, used for win32 applications, while 'OEM' refers to the code page used by console/MS-DOS applications.
The currently active code pages can be retrieved with the functions GetOEMCP() and GetACP().
In order to output something correctly to the console, you should:
1. ensure the current OEM code page supports the characters you want to output (if necessary, use SetConsoleOutputCP to set it properly)
2. convert the string from the current ANSI code page (win32) to the console OEM code page
Here are some utilities for doing so:
#include <windows.h>
#include <string.h> // strlen
#include <wchar.h>  // wcslen

// Both helpers return a buffer allocated with LocalAlloc (or NULL on failure).
// The caller must release it with LocalFree.
LPWSTR CHARtoWCHAR(LPCSTR str, UINT codePage);
LPSTR WCHARtoCHAR(LPCWSTR wstr, UINT codePage);

// Convert a UTF-16 string (16-bit) to an OEM string (8-bit)
#define UNICODEtoOEM(str) WCHARtoCHAR(str, CP_OEMCP)
// Convert an OEM string (8-bit) to a UTF-16 string (16-bit)
#define OEMtoUNICODE(str) CHARtoWCHAR(str, CP_OEMCP)
// Convert an ANSI string (8-bit) to a UTF-16 string (16-bit)
#define ANSItoUNICODE(str) CHARtoWCHAR(str, CP_ACP)
// Convert a UTF-16 string (16-bit) to an ANSI string (8-bit)
#define UNICODEtoANSI(str) WCHARtoCHAR(str, CP_ACP)

/* Convert a single/multi-byte string to a UTF-16 string (16-bit).
   MultiByteToWideChar lets us specify the code page of the input string.
*/
LPWSTR CHARtoWCHAR(LPCSTR str, UINT codePage) {
    int len = (int) strlen(str) + 1; // include the terminating NUL
    int size_needed = MultiByteToWideChar(codePage, 0, str, len, NULL, 0);
    if (size_needed == 0) return NULL; // conversion failed
    LPWSTR wstr = (LPWSTR) LocalAlloc(LPTR, sizeof(WCHAR) * size_needed);
    if (wstr != NULL) MultiByteToWideChar(codePage, 0, str, len, wstr, size_needed);
    return wstr;
}

/* Convert a UTF-16 string (16-bit) to a single/multi-byte string.
   WideCharToMultiByte lets us specify the code page of the output string.
*/
LPSTR WCHARtoCHAR(LPCWSTR wstr, UINT codePage) {
    int len = (int) wcslen(wstr) + 1; // include the terminating NUL
    int size_needed = WideCharToMultiByte(codePage, 0, wstr, len, NULL, 0, NULL, NULL);
    if (size_needed == 0) return NULL; // conversion failed
    LPSTR str = (LPSTR) LocalAlloc(LPTR, sizeof(CHAR) * size_needed);
    if (str != NULL) WideCharToMultiByte(codePage, 0, wstr, len, str, size_needed, NULL, NULL);
    return str;
}
Answered by adspx5
When the app starts, the console is set to the default OEM 437 code page. I was trying to output Unicode text to stdout, with the stream switched to UTF-8 translation via _setmode(_fileno(stdout), _O_U8TEXT); but I still had no luck on screen, even with the Lucida TrueType font. When the output was redirected to a file, a correct UTF-8 file was created.
Finally I got lucky. I added the single line info.FontFamily = FF_DONTCARE; and it is working now. Hope this helps.
#include <windows.h>
#include <string.h> // memset
#include <wchar.h>  // wcscpy_s

void SetLucidaFont()
{
    HANDLE StdOut = GetStdHandle(STD_OUTPUT_HANDLE);
    CONSOLE_FONT_INFOEX info;
    memset(&info, 0, sizeof(CONSOLE_FONT_INFOEX));
    info.cbSize = sizeof(CONSOLE_FONT_INFOEX); // prevents err=87 below
    if (GetCurrentConsoleFontEx(StdOut, FALSE, &info))
    {
        info.FontFamily = FF_DONTCARE;
        info.dwFontSize.X = 0; // leave X as zero
        info.dwFontSize.Y = 14;
        info.FontWeight = 400;
        // FaceName is always a WCHAR array, so use the wide-character copy
        wcscpy_s(info.FaceName, L"Lucida Console");
        SetCurrentConsoleFontEx(StdOut, FALSE, &info); // result ignored, as in the original
    }
}
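The _setmode call mentioned in this answer is not shown in its code, so here is a minimal sketch of how it might be used. The helper names enable_u8text_stdout and print_wide_sample are hypothetical. One thing worth knowing: once stdout is in _O_U8TEXT mode, the Microsoft CRT expects wide-character output functions (wprintf, wcout) on that stream, and narrow calls such as printf are rejected. On non-Windows builds the helpers simply report failure.

```cpp
#include <cstdio>
#include <cwchar>
#ifdef _WIN32
#include <fcntl.h>
#include <io.h>
#endif

// Hypothetical helper: puts stdout into the CRT's UTF-8 text translation
// mode. Returns false if the switch failed or the platform is not Windows.
inline bool enable_u8text_stdout() {
#ifdef _WIN32
    return _setmode(_fileno(stdout), _O_U8TEXT) != -1;
#else
    return false;
#endif
}

// Writes a wide-character sample ("日本語") only when the mode switch
// succeeded, because the stream then accepts wide output only.
inline bool print_wide_sample() {
    if (!enable_u8text_stdout()) return false;
    std::wprintf(L"%ls\n", L"\u65E5\u672C\u8A9E");
    return true;
}
```

A console font that contains the required glyphs is still needed for the text to show up on screen, which is what SetLucidaFont above takes care of.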
Answered by Alan Haggai Alavi
In the console, enter chcp 65001 to change the code page to UTF-8.