Original question: http://stackoverflow.com/questions/1234582/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Purpose of Trigraph sequences in C++?
Asked by Kirill V. Lyadvinsky
According to C++'03 Standard 2.3/1:
Before any other processing takes place, each occurrence of one of the following sequences of three characters (“trigraph sequences”) is replaced by the single character indicated in Table 1.
----------------------------------------------------------------------------
| trigraph | replacement | trigraph | replacement | trigraph | replacement |
----------------------------------------------------------------------------
|   ??=    |      #      |   ??(    |      [      |   ??<    |      {      |
|   ??/    |      \      |   ??)    |      ]      |   ??>    |      }      |
|   ??'    |      ^      |   ??!    |      |      |   ??-    |      ~      |
----------------------------------------------------------------------------
In real life that means that the code printf( "What??!\n" ); will print What| because ??! is a trigraph sequence that is replaced with the | character.
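To make the substitution rule concrete, here is a minimal sketch that applies the same nine replacements to a string at run time. It is only an illustration of the rule quoted from the standard, not how any compiler implements it, and the names trigraph_replacement and replace_trigraphs are made up for this example. The code deliberately never spells a trigraph literally, so it compiles the same way whether or not the compiler has trigraph support switched on.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: given the third character of a candidate trigraph,
// returns its replacement, or '\0' if it does not complete a trigraph.
char trigraph_replacement(char third) {
    switch (third) {
        case '=':  return '#';
        case '/':  return '\\';
        case '\'': return '^';
        case '(':  return '[';
        case ')':  return ']';
        case '!':  return '|';
        case '<':  return '{';
        case '>':  return '}';
        case '-':  return '~';
        default:   return '\0';
    }
}

// Hypothetical helper: scans left to right and replaces every sequence of
// two question marks followed by one of the nine characters above.
std::string replace_trigraphs(const std::string& text) {
    std::string out;
    std::size_t i = 0;
    while (i < text.size()) {
        if (i + 2 < text.size() && text[i] == '?' && text[i + 1] == '?') {
            const char replacement = trigraph_replacement(text[i + 2]);
            if (replacement != '\0') {
                out += replacement;
                i += 3;
                continue;
            }
        }
        out += text[i];
        ++i;
    }
    return out;
}
```

Calling replace_trigraphs on the text of the string in the question (built from two adjacent literals, "What?" "?!", so that the source itself contains no trigraph) gives What| under these assumptions.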
My question is: what is the purpose of trigraphs? Is there any practical advantage to using them?
UPD: The answers mention that some European keyboards don't have all the punctuation characters, so do non-US programmers have to use trigraphs in everyday life?
UPD2: Visual Studio 2010 has trigraph support turned off by default.
Accepted answer by Michael Burr
This question (about the closely related digraphs) has the answer.
It boils down to the fact that the ISO 646 character set doesn't have all the characters of the C syntax, so there are some systems with keyboards and displays that can't deal with the characters (though I imagine that these are quite rare nowadays).
In general, you don't need to use them, but you need to know about them for exactly the problem you ran into. Trigraphs are the reason the '?' character has an escape sequence:
'\?'
So a couple of ways you can avoid your example problem are:
printf( "What?\?!\n" );
printf( "What?" "?!\n" );
But you have to remember when you're typing the two '?' characters that you might be starting a trigraph (and it's certainly never something I'm thinking about).
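As a quick check that the two workarounds really are equivalent, the sketch below wraps each one in a small function (the function names are hypothetical, chosen only for this example). Neither literal contains two question marks directly followed by the exclamation mark, so the result should not depend on whether trigraph processing is enabled.

```cpp
#include <string>

// Escaping the second question mark means the three-character sequence
// never appears in the source text.
std::string escaped_form() {
    return "What?\?!\n";
}

// Adjacent string literals are concatenated only after trigraph replacement
// has taken place, so splitting the literal between the question marks
// works as well.
std::string split_form() {
    return "What?" "?!\n";
}
```

Both functions should return the same eight characters: What, two question marks, an exclamation mark and a newline.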
In practice, trigraphs and digraphs are something I don't worry about at all on a day-to-day basis. But you should be aware of them because once every couple of years you'll run into a bug related to them (and you'll spend the rest of the day cursing their existence). It would be nice if compilers could be configured to warn (or error) when they come across a trigraph or digraph, so I could know I've got something I should knowingly deal with.
And just for completeness, digraphs are much less dangerous since they get processed as tokens, so a digraph inside a string literal won't get interpreted as a digraph.
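A small sketch of that difference, assuming a compiler that accepts the standard alternative tokens (GCC and Clang do by default; other compilers may need a conformance switch). The function names are invented for the example.

```cpp
#include <string>

// Outside of literals, the digraphs <% %> <: :> are simply alternative
// spellings of { } [ ], so this is an ordinary function with an array.
int second_element() <%
    int values<:3:> = <% 10, 20, 30 %>;
    return values<:1:>;
%>

// Inside a string literal the same two characters are not a token at all,
// so they stay exactly as written.
std::string literal_with_digraph_characters() {
    return "<:";
}
```

Here second_element() reads index 1 of the array, while the string keeps its two characters untouched, which is why a digraph cannot silently change the contents of a string the way a trigraph can.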
For a nice education on various fun with punctuation in C/C++ programs (including a trigraph bug that would definitely have me pulling my hair out), take a look at Herb Sutter's GOTW #86 article.
Addendum:
It looks like GCC will not process (and will warn about) trigraphs by default. Some other compilers have options to turn off trigraph support (IBM's for example). Microsoft started supporting a warning (C4837) in VS2008 that must be explicitly enabled (using -Wall or something).
Answer by Roboprog
Kids today! :-)
Yes, foreign equipment, such as an IBM 3270 terminal. The 3270 has, if I remember, no curly braces! If you wanted to write C on an IBM mini / mainframe, you had to use the wretched trigraphs for every block boundary. Fortunately, I only had to write software in C to emulate some IBM minicomputer facilities, not actually write C software on the System/36.
Look next to the "P" key (the keyboard photo from the original answer is not reproduced here):
Hmmm. Hard to tell. There is an extra button next to "carriage return", and I might have it backwards: maybe it was the "[" / "]" pair that was missing. At any rate, this keyboard would cause you grief if you had to write C.
Also, these terminals display EBCDIC, IBM's "native" mainframe character set, not ASCII (thanks, Pavel Minaev, for the reminder).
On the other hand, like the GNU C guide says: "You don't need this brain damage." The gcc compiler leaves this "feature" disabled by default.
Answer by Rob
From The C++ Programming Language, Special Edition, page 829:
The ASCII special characters [, ], {, }, |, and \ occupy character set positions designated as alphabetic by ISO. In most European national ISO-646 character sets, these positions are occupied by letters not found in the English alphabet. A set of trigraphs is provided to allow national characters to be expressed in a portable way using a truly standard minimal character set. This can be useful for interchange of programs, but it doesn't make it easier for people to read programs. Naturally, the long-term solution to this problem is for C++ programmers to get equipment that supports both their native language and C++ well. Unfortunately, this appears to be infeasible for some, and the introduction of new equipment can be a frustratingly slow process.
Answer by CB Bailey
They are for use on systems that lack some of the characters in C++'s basic character set. Needless to say, such systems are exceedingly rare.
Answer by Pavel Minaev
Trigraphs have been proposed for removal in C++0x. That said, there still seems to be a strong argument in support of them - see C++ committee paper N2910, which discusses this. Apparently, EBCDIC is one major stronghold where they are needed.
Answer by Kelly S. French
I've seen trigraphs used in the early '90s to help convert PL/I programs from a mainframe to be run/compiled/debugged on a PC.
They were dabbling with editing PL/I on the PC using a PL/I to C compiler and they wanted the code to work when moved back to the mainframe which did not support curly braces. I suggested that they could use macros like
#define BEGIN {
#define END }
or as a friendlier PL/I alternative
#define BEGIN ??<
#define END ??>
and if they really wanted to get fancy they could try
#ifdef MAINFRAME
#define BEGIN ??<
#define END ??>
#else
#define BEGIN {
#define END }
#endif
and then the program would look like it was written in Pascal. They just looked at me funny and wouldn't speak to me for the rest of the day. I don't think I blame them. :)
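For anyone curious what that would look like, here is a minimal sketch using the plain-brace version of the macros (the trigraph version depends on the compiler having trigraph support enabled, so it is left out). The function sum_up_to is a made-up example, not something from the project described above.

```cpp
#define BEGIN {
#define END }

// Hypothetical example function: the preprocessor turns every BEGIN and END
// back into a brace before the compiler proper sees the code.
int sum_up_to(int n)
BEGIN
    int total = 0;
    for (int i = 1; i <= n; ++i)
    BEGIN
        total += i;
    END
    return total;
END
```

After preprocessing this is an ordinary loop that adds the integers from 1 to n, so sum_up_to(4) adds 1 + 2 + 3 + 4.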
What killed the effort was not the trigraphs; it was the I/O system differences between the platforms. Opening files on the PC was so different from the mainframe that it would have introduced way too many kludges to keep the same code running on both.
Answer by Jonathan Leffler
Primarily because the C standard introduced them back in 1989, when there were issues with the presence of the characters that trigraphs map to on some machines. By the time the C++ standard was published in 1998, the need for trigraphs was not great. They are a wart on C; they are just as much a wart on C++. There was a need for them - especially outside the English-speaking world - which is why they were added to C.
Answer by Ned Batchelder
Some European keyboards don't (didn't?) have all the punctuation characters that US keyboards had, because they needed the keys for their unusual alphabetic characters. So for example (making this up), the Swedish keyboard would have A-ring where the curly brace was.
To accommodate those users, trigraphs are a way to enter punctuation using only the most common ASCII characters.
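To give a feel for the problem, the sketch below renders C source the way a terminal using the Swedish national variant of ISO 646 would have shown it. The six-entry mapping is my own addition rather than something from the answer above, and it covers only the bracket, brace, backslash and bar positions; the function names are hypothetical. The national letters are written as UTF-8 byte escapes so the example does not depend on the encoding of the source file.

```cpp
#include <string>

// Assumed mapping (Swedish ISO 646 variant, subset): the code positions that
// ASCII uses for [ \ ] { | } carry national letters instead.
std::string swedish_iso646_glyph(char c) {
    switch (c) {
        case '[':  return "\xC3\x84";  // capital A with diaeresis
        case '\\': return "\xC3\x96";  // capital O with diaeresis
        case ']':  return "\xC3\x85";  // capital A with ring
        case '{':  return "\xC3\xA4";  // small a with diaeresis
        case '|':  return "\xC3\xB6";  // small o with diaeresis
        case '}':  return "\xC3\xA5";  // small a with ring
        default:   return std::string(1, c);
    }
}

// Hypothetical helper: shows a line of source as such a terminal would.
std::string as_shown_on_swedish_terminal(const std::string& source) {
    std::string shown;
    for (char c : source) {
        shown += swedish_iso646_glyph(c);
    }
    return shown;
}
```

Under this mapping an expression such as a[i] would appear with two accented capital letters in place of the brackets, which is exactly the kind of source text that trigraphs were meant to let people avoid typing and reading.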
Answer by sbi
They are there mostly for historical reasons. Nowadays, most modern keyboards for most languages allow access to all those characters, but this was once a problem with some European keyboards. This is why trigraphs were invented.
If you don't know what they're for, you shouldn't use them.
It's still good to be aware of them, though, since you might accidentally use one in your code.