Original question: http://stackoverflow.com/questions/234365/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Is TCHAR still relevant?
Asked by Fábio
I'm new to Windows programming and after reading the Petzold book I wonder:
Is it still good practice to use the TCHAR type and the _T() macro to declare strings, or should I just use wchar_t and L"" strings in new code?
I will target only Windows 2000 and up, and my code will be i18n from the start.
Accepted answer by Nick
I would still use the TCHAR syntax if I was doing a new project today. There's not much practical difference between using it and the WCHAR syntax, and I prefer code which is explicit in what the character type is. Since most API functions and helper objects take/use TCHAR types (e.g.: CString), it just makes sense to use it. Plus it gives you flexibility if you decide to use the code in an ASCII app at some point, or if Windows ever evolves to Unicode32, etc.
If you decide to go the WCHAR route, I would be explicit about it. That is, use CStringW instead of CString, and use the conversion macros when converting to TCHAR (e.g. CW2CT).
That's my opinion, anyway.
Answer by Sascha
The short answer: NO.
As the others have already written, a lot of programmers still use TCHARs and the corresponding functions. In my humble opinion the whole concept was a bad idea. UTF-16 string processing is a lot different from simple ASCII/MBCS string processing. If you use the same algorithms/functions with both of them (this is what the TCHAR idea is based on!), you get very bad performance on the UTF-16 version if you are doing a little bit more than simple string concatenation (like parsing etc.). The main reason is surrogates.
With the sole exception of when you really have to compile your application for a system which doesn't support Unicode, I see no reason to use this baggage from the past in a new application.
Answer by dan04
I have to agree with Sascha. The underlying premise of TCHAR / _T() / etc. is that you can write an "ANSI"-based application and then magically give it Unicode support by defining a macro. But this is based on several bad assumptions:
That you actively build both MBCS and Unicode versions of your software
Otherwise, you will slip up and use ordinary char* strings in many places.
That you don't use non-ASCII backslash escapes in _T("...") literals
Unless your "ANSI" encoding happens to be ISO-8859-1, the resulting char* and wchar_t* literals won't represent the same characters.
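To make the escape problem concrete, here is a small sketch. It assumes the "ANSI" code page is Windows-1251 (Cyrillic), picked purely for illustration. `cp1251_letter_to_code_point`, `ansi_build_meaning` and `unicode_build_meaning` are hypothetical helpers, and the first one covers only the 0xC0..0xFF letter range rather than the whole code page.

```cpp
#include <cassert>

// Hypothetical helper: maps a Windows-1251 byte in the range 0xC0..0xFF
// (the Cyrillic letters) to its Unicode code point.
// Assumption: only this range is handled; this is not a full decoder.
inline char32_t cp1251_letter_to_code_point(unsigned char byte) {
    assert(byte >= 0xC0);
    return static_cast<char32_t>(0x0410 + (byte - 0xC0));
}

// What _T("\xE9") would mean in an "ANSI" build: the single byte 0xE9,
// interpreted in the active code page (assumed to be Windows-1251 here).
inline char32_t ansi_build_meaning() {
    const char* s = "\xE9";
    return cp1251_letter_to_code_point(static_cast<unsigned char>(s[0]));
}

// What _T("\xE9") would mean in a Unicode build: the wide character
// whose value is 0xE9, that is U+00E9.
inline char32_t unicode_build_meaning() {
    const wchar_t* s = L"\xE9";
    return static_cast<char32_t>(s[0]);
}
```

Under these assumptions the same source literal stands for U+0439 (a Cyrillic letter) in the "ANSI" build and for U+00E9 (e with acute accent) in the Unicode build.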
That UTF-16 strings are used just like "ANSI" strings
They're not. Unicode introduces several concepts that don't exist in most legacy character encodings. Surrogates. Combining characters. Normalization. Conditional and language-sensitive casing rules.
And perhaps most importantly, the fact that UTF-16 is rarely saved on disk or sent over the Internet: UTF-8 tends to be preferred for external representation.
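As a rough illustration of the surrogate point, the sketch below counts code points rather than code units in a UTF-16 string. It assumes well-formed input (no unpaired surrogates), and `count_code_points` is a name made up for this example.

```cpp
#include <cstddef>
#include <string>

// Counts Unicode code points in a UTF-16 string.
// Assumption: the input is well-formed UTF-16 (no unpaired surrogates).
inline std::size_t count_code_points(const std::u16string& text) {
    std::size_t count = 0;
    for (char16_t unit : text) {
        // A low (trailing) surrogate, 0xDC00..0xDFFF, completes a pair whose
        // high surrogate was already counted, so it is skipped.
        const bool is_low_surrogate = (unit >= 0xDC00 && unit <= 0xDFFF);
        if (!is_low_surrogate) {
            ++count;
        }
    }
    return count;
}
```

For a string such as u"a\U0001F600b", size() reports four code units while this function reports three code points, which is exactly the kind of mismatch that code written for one-byte-per-character "ANSI" strings tends to miss. Combining characters and normalization are not handled here at all.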
That your application doesn't use the Internet
(Now, this may be a valid assumption for your software, but...)
The web runs on UTF-8 and a plethora of rarer encodings. The TCHAR concept only recognizes two: "ANSI" (which can't be UTF-8) and "Unicode" (UTF-16). It may be useful for making your Windows API calls Unicode-aware, but it's damned useless for making your web and e-mail apps Unicode-aware.
That you use no non-Microsoft libraries
Nobody else uses TCHAR. Poco uses std::string and UTF-8. SQLite has UTF-8 and UTF-16 versions of its API, but no TCHAR. TCHAR isn't even in the standard library, so no std::tcout unless you want to define it yourself.
What I recommend instead of TCHAR
Forget that "ANSI" encodings exist, except for when you need to read a file that isn't valid UTF-8. Forget about TCHAR too. Always call the "W" version of Windows API functions. #define _UNICODE just to make sure you don't accidentally call an "A" function.
Always use UTF encodings for strings: UTF-8 for char strings and UTF-16 (on Windows) or UTF-32 (on Unix-like systems) for wchar_t strings. typedef UTF16 and UTF32 character types to avoid platform differences.
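One possible way to write those typedefs, assuming a C++11 or later compiler, is to build them on char16_t and char32_t, whose meaning does not change between platforms the way wchar_t does. The names `UTF16String`, `UTF32String`, `greeting_utf16` and `greeting_utf32` are invented for this sketch.

```cpp
#include <string>

// Assumption: C++11 or later, where char16_t and char32_t are distinct
// built-in types intended for UTF-16 and UTF-32 code units.
typedef char16_t UTF16;  // one UTF-16 code unit
typedef char32_t UTF32;  // one UTF-32 code unit (a whole code point)

// Hypothetical string aliases built on the typedefs above.
typedef std::basic_string<UTF16> UTF16String;
typedef std::basic_string<UTF32> UTF32String;

// The u"" and U"" prefixes produce literals of the matching types.
inline UTF16String greeting_utf16() { return u"h\u00E9llo"; }
inline UTF32String greeting_utf32() { return U"h\u00E9llo"; }
```

On Windows, passing UTF16 data to a "W" function would still need a cast or copy to wchar_t, which this sketch leaves out.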
Answer by Aardvark
If you're wondering if it's still in practice, then yes - it is still used quite a bit. No one will look at your code funny if it uses TCHAR and _T(""). The project I'm working on now is converting from ANSI to Unicode - and we're going the portable (TCHAR) route.
However...
My vote would be to forget all the ANSI/UNICODE portable macros (TCHAR, _T(""), and all the _tXXXXXX calls, etc...) and just assume Unicode everywhere. I really don't see the point of being portable if you'll never need an ANSI version. I would use all the wide character functions and types directly. Prepend all string literals with an L.
Answer by Steven
The Introduction to Windows Programming article on MSDN says:
New applications should always call the Unicode versions (of the API).
The TEXT and TCHAR macros are less useful today, because all applications should use Unicode.
I would stick to wchar_t and L"".
Answer by Pavel Radzivilovsky
I would like to suggest a different approach (neither of the two).
To summarize, use char* and std::string, assuming UTF-8 encoding, and do the conversions to UTF-16 only when wrapping API functions.
More information and justification for this approach in Windows programs can be found at http://www.utf8everywhere.org.
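A minimal sketch of the conversion step at the API boundary might look like the following. On Windows you would more commonly call MultiByteToWideChar with CP_UTF8; the hand-written decoder here is only meant to keep the example portable. It assumes the input is valid UTF-8 (malformed input is not detected), and `utf8_to_utf16` is a hypothetical name.

```cpp
#include <cstddef>
#include <string>

// Converts UTF-8 text to UTF-16.
// Assumption: the input is valid UTF-8; malformed input is not detected.
inline std::u16string utf8_to_utf16(const std::string& utf8) {
    std::u16string out;
    std::size_t i = 0;
    while (i < utf8.size()) {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        char32_t cp = 0;
        std::size_t len = 0;
        // The lead byte tells us how many bytes the sequence has.
        if (lead < 0x80)      { cp = lead;        len = 1; }
        else if (lead < 0xE0) { cp = lead & 0x1F; len = 2; }
        else if (lead < 0xF0) { cp = lead & 0x0F; len = 3; }
        else                  { cp = lead & 0x07; len = 4; }
        // Each continuation byte contributes its low six bits.
        for (std::size_t k = 1; k < len && i + k < utf8.size(); ++k) {
            cp = (cp << 6) | (static_cast<unsigned char>(utf8[i + k]) & 0x3F);
        }
        i += len;
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above U+FFFF become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

The rest of the program would keep std::string everywhere and call something like this only inside thin wrappers around the "W" functions.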
Answer by LeOpArD
TCHAR/WCHAR might be enough for some legacy projects. But for new applications, I would say NO.
All this TCHAR/WCHAR stuff is there for historical reasons. TCHAR provides a seemingly neat way (a disguise) to switch between ANSI text encoding (MBCS) and Unicode text encoding (UTF-16). In the past, people did not appreciate how many characters all the world's languages have. They assumed 2 bytes were enough to represent all characters, and so adopted a fixed-length character encoding scheme using WCHAR. However, this has no longer been true since the release of Unicode 2.0 in 1996.
That is to say: no matter which of CHAR/WCHAR/TCHAR you use, the text processing part of your program should be able to handle variable-length characters for internationalization.
So you actually need to do more than choose one of CHAR/WCHAR/TCHAR for programming on Windows:
- If your application is small and does not involve text processing (i.e. it just passes text strings around as arguments), then stick with WCHAR, since that makes it easier to work with the Unicode-aware WinAPI.
- Otherwise, I would suggest using UTF-8 as the internal encoding and storing text in char strings or std::string, converting to UTF-16 when calling WinAPI. UTF-8 is now the dominant encoding and there are lots of handy libraries and tools to process UTF-8 strings.
Check out this wonderful website for more in-depth reading: http://utf8everywhere.org/
Answer by Nik Reiman
Yes, absolutely; at least for the _T macro. I'm not so sure about the wide-character stuff, though.
The reason is to better support WinCE or other non-standard Windows platforms. If you're 100% certain that your code will remain on NT, then you can probably just use regular C-string declarations. However, it's best to tend towards the more flexible approach, as it's much easier to #define that macro away on a non-Windows platform than to go through thousands of lines of code and add it everywhere in case you need to port some library to Windows Mobile.
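For what it's worth, "defining the macro away" could look roughly like this. The non-Windows branch is a hypothetical fallback that maps the generic-text names onto plain char; note that names such as _T are formally reserved for the implementation, so defining them yourself is a pragmatic shortcut, not something the standard blesses. `label_length` is an illustrative function, not part of any library.

```cpp
#include <cstddef>
#include <cstring>

#if defined(_WIN32)
#include <tchar.h>  // the real TCHAR, _T and _tcslen
#else
// Hypothetical fallback for non-Windows builds: map the generic-text
// names onto plain char. Assumption: narrow strings are acceptable here.
typedef char TCHAR;
#define _T(x) x
#define _tcslen std::strlen
#endif

// Code written against the generic-text names compiles in both cases.
inline std::size_t label_length() {
    const TCHAR* label = _T("hello");
    return _tcslen(label);
}
```

The library code itself stays untouched; only the small compatibility block at the top differs per platform.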
Answer by snemarch
IMHO, if there are TCHARs in your code, you're working at the wrong level of abstraction.
Use whatever string type is most convenient for you when dealing with text processing - this will hopefully be something supporting Unicode, but that's up to you. Do conversion at OS API boundaries as necessary.
When dealing with file paths, whip up your own custom type instead of using strings. This will give you OS-independent path separators and an easier interface to code against than manual string concatenation and splitting, and it will be a lot easier to adapt to different OSes (ansi, ucs-2, utf-8, whatever).
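A very small sketch of such a custom path type is shown below. The class name `Path` and its members are invented for this example, and it assumes that components are UTF-8 strings containing no separator characters. Since C++17, std::filesystem::path is a standard alternative to rolling your own.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical minimal path type: it stores components, not separators.
// Assumption: components are UTF-8 and contain no separator characters.
class Path {
public:
    Path() = default;
    explicit Path(const std::string& component) {
        components_.push_back(component);
    }

    // Appends one component and returns the extended path.
    Path operator/(const std::string& component) const {
        Path result(*this);
        result.components_.push_back(component);
        return result;
    }

    // Renders the path with whatever separator the target OS expects.
    std::string to_string(char separator) const {
        std::string out;
        for (std::size_t i = 0; i < components_.size(); ++i) {
            if (i != 0) {
                out.push_back(separator);
            }
            out += components_[i];
        }
        return out;
    }

private:
    std::vector<std::string> components_;
};
```

With this, Path("usr") / "local" / "lib" can be rendered with '/' on one system and '\\' on another, and the conversion to the OS string type happens only where the rendered path is handed to an API.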
Answer by Trololol
The only reasons I see to use anything other than the explicit WCHAR are portability and efficiency.
If you want to make your final executable as small as possible use char.
If you don't care about RAM usage and want internationalization to be as easy as simple translation, use WCHAR.
If you want to make your code flexible, use TCHAR.
If you only plan on using the Latin characters, you might as well use the ASCII/MBCS strings so that your user does not need as much RAM.
For people who are "i18n from the start up", save yourself the source code space and simply use all of the Unicode functions.