SQL: What do Kanatype Sensitive (KS) and width sensitive mean?
Note: this page is a side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/7489257/
what is the meaning of Kanatype Sensitive KS and width sensitive
Asked by Islam
When creating a new database I had to set the collation type or accept its default... fine.
But I actually need to know what Kanatype Sensitive (KS) and width sensitive mean. I know that, for example, case sensitive means that uppercase and lowercase letters are treated as different. What about Kanatype Sensitive and width sensitive?
Answered by WPFNewbie
Both have to do with sorting, and typically you would not select these two options. Here is a description, courtesy of Microsoft.
Kanatype Sensitive
Distinguishes between the two types of Japanese kana characters: Hiragana and Katakana.
If this option is not selected, SQL Server considers Hiragana and Katakana characters to be equal for sorting purposes.
Width Sensitive
Distinguishes between a single-byte character and the same character when represented as a double-byte character.
If this option is not selected, SQL Server considers the single-byte and double-byte representation of the same character to be identical for sorting purposes.
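As a rough illustration of what width insensitivity means, here is a minimal Python sketch. It is not SQL Server's implementation: it assumes Unicode NFKC normalization as a stand-in for width folding, and the helper name fold_width is hypothetical. NFKC also folds some other compatibility characters, so it is broader than a pure width fold.

```python
import unicodedata


def fold_width(text: str) -> str:
    """Approximate width-insensitive folding (hypothetical helper).

    NFKC maps full-width Latin letters to ASCII and half-width katakana
    to regular katakana. It folds other compatibility forms too, so this
    is only an approximation of a width-insensitive collation.
    """
    return unicodedata.normalize("NFKC", text)


full_width_a = "\uFF21"   # FULLWIDTH LATIN CAPITAL LETTER A
half_width_ka = "\uFF76"  # HALFWIDTH KATAKANA LETTER KA
regular_ka = "\u30AB"     # KATAKANA LETTER KA

# Width sensitive: the two forms are different characters.
print(full_width_a == "A")                       # False

# Width insensitive: both forms compare equal after folding.
print(fold_width(full_width_a) == "A")           # True
print(fold_width(half_width_ka) == regular_ka)   # True
```

In a width-sensitive comparison the full-width and half-width forms stay distinct; folding them first is one way to picture what the insensitive option does.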
Answered by sadiq
TL;DR:
Kanatype insensitivity makes sorting Japanese text more intuitive and should generally always be enabled unless you have a reason not to.
FULL EXPLANATION:
In general, if you're storing any Japanese text that needs to be sorted, you probably want to go with kanatype insensitive. Why? Because it makes sorting more intuitive for readers of Japanese.
In English, since we have only one writing system, it's easy to sort things algorithmically: we simply order the characters by their character codes (already in alphabetical order) and we're done. In Japanese, though, because there are multiple ways to write equivalent sounds, sorting can get tricky. Hiragana and katakana occupy separate Unicode blocks, so when we sort with kanatype sensitivity, we end up with results that aren't completely intuitive.
Imagine you had a list of names that you wanted to sort:
{ "ピカチュウ","さとし","マリオ","まちだ","はるか" }
The romanized equivalent to the list is:
{ "Pikachu","Satoshi","Mario","Machida","Haruka" }
When sorted kanatype sensitive, you would get the following result:
{ "さとし","はるか","まちだ","ピカチュウ","マリオ" }
{ "Satoshi","Haruka","Machida","Pikachu","Mario" }
When sorted kanatype insensitive, you would get this result instead:
{ "さとし","はるか","ピカチュウ","まちだ","マリオ" }
{ "Satoshi","Haruka","Pikachu","Machida","Mario" }
To Japanese speakers, the second sort is a lot more intuitive, as the results are actually sorted phonetically instead of based on character sets. "まちだ" and "マリオ" both start with the same phonetic sound, but because one uses hiragana "ma" and the other uses katakana "ma", they are separated when kanatype sensitivity is enabled. With kanatype insensitivity, the list can be properly sorted so that the two words appear next to each other on the list despite their writing system differences.
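The two orderings can be reproduced with a small Python sketch. This is an illustration only, not how SQL Server collations work internally: it assumes plain code-point comparison for the "sensitive" case, and the helper fold_kana (a hypothetical name) relies on the fact that the katakana range U+30A1..U+30F6 sits exactly 0x60 above the matching hiragana.

```python
def fold_kana(text: str) -> str:
    """Map katakana (U+30A1..U+30F6) onto the matching hiragana.

    Hypothetical helper for illustration; real collations use
    linguistic sort weights rather than raw code points.
    """
    return "".join(
        chr(ord(ch) - 0x60) if "\u30A1" <= ch <= "\u30F6" else ch
        for ch in text
    )


names = ["ピカチュウ", "さとし", "マリオ", "まちだ", "はるか"]

# Kanatype sensitive (approximated): compare raw code points,
# so all hiragana names come before all katakana names.
kana_sensitive = sorted(names)

# Kanatype insensitive (approximated): compare as if everything
# were written in hiragana, so the order follows the sounds.
kana_insensitive = sorted(names, key=fold_kana)

print(kana_sensitive)
# ['さとし', 'はるか', 'まちだ', 'ピカチュウ', 'マリオ']
print(kana_insensitive)
# ['さとし', 'はるか', 'ピカチュウ', 'まちだ', 'マリオ']
```

Folding only the sort key leaves the stored strings untouched, which mirrors how a collation affects comparison without changing the data.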
A good analogy in English would be case sensitivity. Imagine you wanted to sort a list of words for a dictionary, some of them proper nouns and others not:
{"New York","new","jet","Japan","squirm","SQL"}
If we ignored the fact that uppercase and lowercase letters represent the same letter and just sorted based on character code, we would get something like this:
{"Japan", "New York", "SQL", "jet", "new", "squirm"}
A dictionary sorted like this would hardly be useful, especially if we wanted to look up a word without knowing whether it started with an uppercase or lowercase letter. We'd have to check the first part of the dictionary with all the proper nouns before checking the last part with all other words.
If we ran a case-insensitive sort that treats "A" and "a" as the same letter despite their separate character codes, we would get a result that is much more intuitive:
{"Japan","jet","new","New York","SQL","squirm"}
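The same comparison can be tried in Python. This is a minimal sketch that assumes str.casefold as the case-insensitive key; a database collation may break ties differently.

```python
words = ["New York", "new", "jet", "Japan", "squirm", "SQL"]

# Case sensitive: raw character codes put every uppercase
# letter before every lowercase letter.
by_code_point = sorted(words)

# Case insensitive: compare the case-folded form of each word.
case_insensitive = sorted(words, key=str.casefold)

print(by_code_point)
# ['Japan', 'New York', 'SQL', 'jet', 'new', 'squirm']
print(case_insensitive)
# ['Japan', 'jet', 'new', 'New York', 'SQL', 'squirm']
```

Note that "SQL" lands before "squirm" in the case-insensitive result, because "l" sorts before "u" once case is ignored.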
So in general, unless you have a specific reason not to, you should disable kanatype sensitivity. A phonebook lookup, for example, should be kanatype insensitive.
Note that Japanese also has an additional character type, kanji, that you would need to work with. Kanji is much harder to sort, as there are almost always multiple ways to read each kanji and there is no real "alphabetical" order for kanji. Most forms intended for Japanese people have two fields for names: the user's name as it is normally written, and the user's name written out entirely in katakana. Not only does this tell people how to pronounce a name that might be ambiguous when written solely in kanji, it also lets software sort by the unambiguous katakana-only field, which makes the sort effectively kanatype insensitive.
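The two-field approach could look like the following Python sketch. The Person class, its field names, and the sample records are all hypothetical and exist only to illustrate sorting by a katakana reading instead of the written name.

```python
from dataclasses import dataclass


@dataclass
class Person:
    name: str     # name as normally written (may contain kanji)
    reading: str  # the same name spelled out entirely in katakana


# Hypothetical sample records.
people = [
    Person("町田", "マチダ"),
    Person("春香", "ハルカ"),
    Person("智", "サトシ"),
]

# Sort on the katakana reading; the kanji spelling plays no part.
by_reading = sorted(people, key=lambda p: p.reading)

print([p.name for p in by_reading])
# ['智', '春香', '町田']
```

Because every reading uses the same script, a plain comparison on that field already gives a phonetic order for these records.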
For more information, I definitely recommend checking out this excellent article, which explains the issues with sorting in Japanese much better than I can.
Reference: https://japanese.stackexchange.com/questions/29612/what-do-you-need-kanatype-sensitivity-for