Notice: this content comes from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/2151783/

Time: 2020-09-14 19:05:11  Source: igfitidea

Objective-C doesn't like my unichars?

objective-c xcode gcc

Asked by corydoras

Xcode complains about a "multi-character character constant" when I try to do the following:

static unichar accentCharacters[] = { 'ā', 'á', 'ǎ', 'à' };

How do you make an array of characters when not all of them are ASCII? The following works just fine:

static unichar accent[] = { 'a', 'b', 'c' }; 

Workaround

The closest workaround I have found is to convert the special characters to hex, i.e. this works:

static unichar accentCharacters[] = { 0x0100, 0x0101, 0x0102 };
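
A minimal sketch of the hex workaround in plain C (which also compiles as Objective-C). The `unichar` typedef here is only a stand-in for the one Foundation already provides, the table uses the code points of ā, á, ǎ and à for illustration rather than the three values above, and `is_accent_character` is a hypothetical helper name.

```c
#include <stddef.h>

/* Stand-in for Foundation's unichar (a 16-bit UTF-16 code unit).
   In a real Objective-C file, drop this line and import Foundation. */
typedef unsigned short unichar;

/* Code points written as numbers, so the source file encoding no longer matters. */
static const unichar accent_characters[] = {
    0x0101, /* ā  LATIN SMALL LETTER A WITH MACRON */
    0x00E1, /* á  LATIN SMALL LETTER A WITH ACUTE  */
    0x01CE, /* ǎ  LATIN SMALL LETTER A WITH CARON  */
    0x00E0, /* à  LATIN SMALL LETTER A WITH GRAVE  */
};

/* Hypothetical helper: returns 1 if c is in the table, 0 otherwise. */
static int is_accent_character(unichar c)
{
    size_t n = sizeof accent_characters / sizeof accent_characters[0];
    for (size_t i = 0; i < n; i++) {
        if (accent_characters[i] == c) {
            return 1;
        }
    }
    return 0;
}
```

Calling `is_accent_character(0x01CE)` returns 1, while `is_accent_character('a')` returns 0. The comments carry the readable characters, which makes up for the numbers being opaque.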

Answered by Yuji

It's not that Objective-C doesn't like it, it's that C doesn't. The constant 'c' is meant for a char, which has 1 byte, not for a unichar, which has 2 bytes. (See the note below for a bit more detail.)

There's no perfectly supported way to represent a unichar constant. You can use

char* s="ü";

in a UTF-8-encoded source file to get the unicode C-string, or

NSString* s=@"ü";

in a UTF-8 encoded source file to get an NSString. (This was not possible before 10.5. It's OK for iPhone.)

NSString itself is conceptually encoding-neutral; but if you want, you can get the Unicode character by using -characterAtIndex:.
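
To make the relationship between the UTF-8 C string and the 16-bit value concrete, here is a rough sketch in plain C. It is not how NSString works internally; `first_unichar_from_utf8` is a hypothetical helper, the `unichar` typedef is a stand-in for Foundation's, and the string literals use byte escapes so the example does not depend on the source file encoding.

```c
typedef unsigned short unichar; /* stand-in for Foundation's 16-bit unichar */

/* Hypothetical helper: decodes the first character of a UTF-8 string into one
   UTF-16 code unit. Handles 1- to 3-byte sequences only (U+0000..U+FFFF) and
   returns 0 for lead bytes it does not recognize. It is a sketch, not a
   validator: overlong encodings and surrogates are not rejected. */
static unichar first_unichar_from_utf8(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;

    if (p[0] < 0x80) {
        return p[0];                                   /* plain ASCII */
    }
    if ((p[0] & 0xE0) == 0xC0 && (p[1] & 0xC0) == 0x80) {
        return (unichar)(((p[0] & 0x1F) << 6) | (p[1] & 0x3F));
    }
    if ((p[0] & 0xF0) == 0xE0 && (p[1] & 0xC0) == 0x80 && (p[2] & 0xC0) == 0x80) {
        return (unichar)(((p[0] & 0x0F) << 12) | ((p[1] & 0x3F) << 6) | (p[2] & 0x3F));
    }
    return 0;
}
```

For "ü" saved as UTF-8 (the bytes 0xC3 0xBC), `first_unichar_from_utf8("\xC3\xBC")` gives 0x00FC, which is the value a unichar for ü holds.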

Finally two comments:

  • If you just want to remove accents from a string, you can use a method like this, without writing the table yourself:

    -(NSString*)stringWithoutAccentsFromString:(NSString*)s
    {
        if (!s) return nil;
        NSMutableString *result = [NSMutableString stringWithString:s];
        CFStringFold((CFMutableStringRef)result, kCFCompareDiacriticInsensitive, NULL);
        return result;
    }
    

    See the documentation for CFStringFold.

  • If you want Unicode characters for localization/internationalization, you shouldn't embed the strings in the source code. Instead, you should use Localizable.strings and NSLocalizedString. See here.

Note: For arcane historical reasons, 'a' is an int in C; see the discussions here. In C++, it's a char. But that doesn't change the fact that writing more than one byte inside '...' is implementation-defined and not recommended. For example, see ISO C Standard 6.4.4.4, paragraph 10. However, it was common in classic Mac OS to write a four-letter code enclosed in single quotes, like 'APPL'. But that's another story...
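
A small C sketch of the two points in this note. The function names are hypothetical, and the packing order in `four_char_code` is an assumption based on how GCC and Clang typically evaluate 'APPL' (first character in the most significant byte); other compilers are free to do something else, which is exactly why the constant form is implementation-defined.

```c
#include <stddef.h>

/* In C (unlike C++), a character constant such as 'a' has type int. */
static size_t size_of_char_constant(void) { return sizeof('a'); }
static size_t size_of_char(void)          { return sizeof(char); }

/* Hypothetical helper: builds a four-letter code explicitly instead of relying
   on the implementation-defined value of a multi-character constant. */
static unsigned long four_char_code(char a, char b, char c, char d)
{
    return ((unsigned long)(unsigned char)a << 24) |
           ((unsigned long)(unsigned char)b << 16) |
           ((unsigned long)(unsigned char)c << 8)  |
            (unsigned long)(unsigned char)d;
}
```

Compiled as C, `size_of_char_constant()` equals `sizeof(int)` while `size_of_char()` is 1, and `four_char_code('A', 'P', 'P', 'L')` is 0x4150504C.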

Another complication is that accented letters are not always represented by 1 byte; it depends on the encoding. In UTF-8, they are not. In ISO-8859-1, they are. And unichar should be in UTF-16. Did you save your source code in UTF-16? I think the default in Xcode is UTF-8. GCC might do some encoding conversion depending on the setup, too...
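
To put numbers on "it depends on the encoding", here is a short C sketch that counts how many bytes UTF-8 needs and how many 16-bit units UTF-16 needs for a given code point. Both function names are hypothetical, and for brevity the sketch does not treat the surrogate range U+D800..U+DFFF specially.

```c
#include <stddef.h>

/* Hypothetical helper: number of bytes UTF-8 needs for one code point.
   Returns 0 for values above the Unicode range. */
static size_t utf8_length(unsigned long cp)
{
    if (cp < 0x80)      return 1; /* ASCII */
    if (cp < 0x800)     return 2; /* includes ü (U+00FC) and ā (U+0101) */
    if (cp < 0x10000)   return 3;
    if (cp <= 0x10FFFF) return 4;
    return 0;
}

/* Hypothetical helper: number of 16-bit UTF-16 code units (unichar values)
   needed for one code point. Returns 0 for values above the Unicode range. */
static size_t utf16_units(unsigned long cp)
{
    if (cp < 0x10000)   return 1;
    if (cp <= 0x10FFFF) return 2; /* encoded as a surrogate pair */
    return 0;
}
```

So ā (U+0101) takes two bytes in a UTF-8 source file, which is why it does not fit in a one-byte '...' constant, but it is a single unichar in UTF-16.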

Answered by daniel.gindi

Or you can just do it like this:

static unichar accentCharacters[] = { L'ā', L'á', L'ǎ', L'à' };

L is a standard C prefix (not a keyword) that marks a wide-character constant or string literal of type wchar_t; in effect it says "I'm about to write a Unicode character or string".

Works fine for Objective-C too.

Note: The compiler may give you a strange warning about too many characters put inside a unichar, but you can safely ignore that warning. Xcode just doesn't deal with the unicode characters the right way, but the compiler parses them properly and the result is OK.
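
A hedged sketch of the same idea that sidesteps the source-encoding question by using universal character names (\uXXXX) inside the L'...' constants. It assumes the compiler stores Unicode code points in wchar_t, which is what GCC and Clang do on macOS and Linux; wchar_t is often wider than the 16-bit unichar, so the narrowing is only lossless for characters up to U+FFFF. The `unichar` typedef is a stand-in for Foundation's, and `unichar_from_wide` is a hypothetical helper.

```c
#include <wchar.h>

typedef unsigned short unichar; /* stand-in for Foundation's 16-bit unichar */

/* L'...' is a wchar_t constant. Writing the characters as \uXXXX keeps the
   source file pure ASCII: ā, á, ǎ, à. */
static const wchar_t wide_accents[] = { L'\u0101', L'\u00E1', L'\u01CE', L'\u00E0' };

/* Hypothetical helper: narrows a wide character to a unichar.
   Only meaningful for code points up to U+FFFF. */
static unichar unichar_from_wide(wchar_t w)
{
    return (unichar)w;
}
```

Under that assumption, `unichar_from_wide(wide_accents[0])` is 0x0101.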

Answered by Matt Comi

Depending on your circumstances, this may be a tidy way to do it:

NSCharacterSet* accents = 
    [NSCharacterSet characterSetWithCharactersInString:@"āáǎà"];

And then, if you want to check if a given unichar is one of those accent characters:

if ([accents characterIsMember:someOtherUnichar])
{
}

NSString also has many methods of its own for handling NSCharacterSet objects.