Original question: http://stackoverflow.com/questions/2151783/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Objective c doesn't like my unichars?
Asked by corydoras
Xcode complains about a "multi-character character constant" when I try to do the following:
static unichar accentCharacters[] = { 'ā', 'á', '?', 'à' };
How do you make an array of characters when not all of them are ASCII? The following works just fine:
static unichar accent[] = { 'a', 'b', 'c' };
Workaround
The closest workaround I have found is to write the special characters as hexadecimal code points. For example, this works:
static unichar accentCharacters[] = { 0x0100, 0x0101, 0x0102 };
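The hex workaround is easier to maintain if each value carries a comment naming the character it stands for. Below is a minimal sketch in plain C. The `unichar` typedef is a stand-in for the 16-bit type that Foundation provides, and the names `accent_characters`, `accent_count`, and `is_accent_character` are hypothetical. The table uses the code points for ā, á, and à, which are not the same as the sample values shown in the question.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: stand-in for Foundation's unichar, a 16-bit unsigned type. */
typedef unsigned short unichar;

/* Code points written in hex, with the intended character in a comment. */
static const unichar accent_characters[] = {
    0x0101, /* ā  LATIN SMALL LETTER A WITH MACRON */
    0x00E1, /* á  LATIN SMALL LETTER A WITH ACUTE  */
    0x00E0, /* à  LATIN SMALL LETTER A WITH GRAVE  */
};

/* Number of entries, computed from the array so it cannot go stale. */
static const size_t accent_count =
    sizeof accent_characters / sizeof accent_characters[0];

/* Returns 1 if c appears in the table, 0 otherwise (linear search). */
static int is_accent_character(unichar c)
{
    for (size_t i = 0; i < accent_count; i++) {
        if (accent_characters[i] == c) {
            return 1;
        }
    }
    return 0;
}
```

Calling `is_accent_character(0x00E1)` reports a match, while a plain `'a'` does not. Because the values are ordinary integers, the source file's encoding does not matter here.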
Answered by Yuji
It's not that Objective-C doesn't like it, it's that C doesn't. The constant 'c' is for char, which has 1 byte, not unichar, which has 2 bytes. (See the note below for a bit more detail.)
There's no perfectly supported way to represent a unichar constant. You can use
char* s="ü";
in a UTF-8-encoded source file to get the unicode C-string, or
NSString* s=@"ü";
in a UTF-8-encoded source file to get an NSString. (This was not possible before Mac OS X 10.5. It's OK for iPhone.)
NSString itself is conceptually encoding-neutral, but if you want, you can get the character at a given index as a unichar (a UTF-16 code unit) by using -characterAtIndex:.
Finally, two comments:
- If you just want to remove accents from a string, you can use a method like this, without writing the table yourself:
-(NSString*)stringWithoutAccentsFromString:(NSString*)s
{
    if (!s) return nil;
    NSMutableString *result = [NSMutableString stringWithString:s];
    // Fold the mutable copy in place so that diacritics are stripped.
    CFStringFold((CFMutableStringRef)result, kCFCompareDiacriticInsensitive, NULL);
    return result;
}
See the documentation for CFStringFold.
- If you want Unicode characters for localization/internationalization, you shouldn't embed the strings in the source code. Instead, you should use Localizable.strings and NSLocalizedString.
Note: For arcane historical reasons, 'a' is an int in C. In C++, it's a char. But that doesn't change the fact that writing more than one byte inside '...' is implementation-defined and not recommended. For example, see the ISO C standard, section 6.4.4.4, paragraph 10. However, it was common in classic Mac OS to write four-letter codes enclosed in single quotes, like 'APPL'. But that's another story...
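The point that 'a' is an int in C can be checked directly with sizeof. The following is a small sketch. The function names are hypothetical, and the block must be compiled as C rather than C++ for the first function to return sizeof(int).

```c
#include <assert.h>
#include <stddef.h>

/* In C, the character constant 'a' has type int, so this is sizeof(int). */
static size_t size_of_char_constant(void)
{
    return sizeof('a');
}

/* An object of type char always has size 1 by definition. */
static size_t size_of_char_object(void)
{
    return sizeof(char);
}
```

On a typical platform with a 4-byte int, the first function returns 4 and the second returns 1, which is why storing a plain character constant into a wider type such as unichar causes no trouble for ASCII.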
Another complication is that accented letters are not always represented by 1 byte; it depends on the encoding. In UTF-8, they are not. In ISO-8859-1, they are. And unichar should be in UTF-16. Did you save your source code in UTF-16? I think the default in Xcode is UTF-8. GCC might do some encoding conversion depending on the setup, too...
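To make the encoding difference concrete, here is a hedged sketch that spells out "ü" (U+00FC) in three encodings using explicit escapes, so the result does not depend on how the source file itself is saved. The `unichar` typedef is a stand-in for Foundation's 16-bit type, and all variable and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: stand-in for Foundation's unichar, a 16-bit unsigned type. */
typedef unsigned short unichar;

/* "ü" (U+00FC) written byte by byte in each encoding. */
static const char u_umlaut_utf8[]   = "\xC3\xBC"; /* UTF-8: two bytes      */
static const char u_umlaut_latin1[] = "\xFC";     /* ISO-8859-1: one byte  */
static const unichar u_umlaut_utf16 = 0x00FC;     /* UTF-16: one code unit */

/* Number of bytes before the terminating NUL in the UTF-8 form. */
static size_t utf8_length(void)
{
    return strlen(u_umlaut_utf8);
}

/* Number of bytes before the terminating NUL in the ISO-8859-1 form. */
static size_t latin1_length(void)
{
    return strlen(u_umlaut_latin1);
}
```

Here `utf8_length()` is 2 and `latin1_length()` is 1. A UTF-8 source file that contains 'ü' in single quotes therefore puts two bytes inside the quotes, which is what triggers the multi-character warning.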
Answered by daniel.gindi
Or you can just do it like this:
static unichar accentCharacters[] = { L'ā', L'á', L'?', L'à' };
L is a standard C prefix (not a keyword) that marks a wide character constant of type wchar_t, which can hold characters outside the basic character set.
Works fine for Objective-C too.
Note: The compiler may give you a strange warning about too many characters put inside a unichar, but you can safely ignore that warning. Xcode just doesn't deal with the unicode characters the right way, but the compiler parses them properly and the result is OK.
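As a rough illustration of the L prefix, the sketch below uses the universal character name \u0101 for ā, so the example does not depend on the source file's encoding. It assumes a compiler such as GCC or Clang, where wchar_t values are Unicode code points. The `unichar` typedef is a stand-in for Foundation's 16-bit type, and the names `wide_a_macron`, `fits_in_unichar`, and `to_unichar` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: stand-in for Foundation's unichar, a 16-bit unsigned type. */
typedef unsigned short unichar;

/* L'...' is a wide character constant of type wchar_t. */
static const wchar_t wide_a_macron = L'\u0101'; /* ā */

/* Returns 1 if the value can be stored in a 16-bit unichar without loss. */
static int fits_in_unichar(wchar_t wc)
{
    return wc >= 0 && (unsigned long)wc <= 0xFFFFul;
}

/* Narrows a wide character to unichar; only lossless if fits_in_unichar. */
static unichar to_unichar(wchar_t wc)
{
    return (unichar)wc;
}
```

With those assumptions, `to_unichar(wide_a_macron)` yields 0x0101, the same value as the hex workaround in the question. Characters above U+FFFF do not fit in a single unichar and would need a surrogate pair instead.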
Answered by Matt Comi
Depending on your circumstances, this may be a tidy way to do it:
NSCharacterSet* accents =
[NSCharacterSet characterSetWithCharactersInString:@"āá?à"];
And then, if you want to check if a given unichar is one of those accent characters:
if ([accents characterIsMember:someOtherUnichar])
{
}
NSString also has many methods of its own for handling NSCharacterSet objects.