C# String Comparison, .NET and Non-Breaking Space

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/701369/

Time: 2020-08-04 14:02:29  Source: igfitidea

String Comparison, .NET and non breaking space

c# string character-encoding

Asked by Mark

I have an app written in C# that does a lot of string comparison. The strings are pulled in from a variety of sources (including user input) and are then compared. However, I'm running into problems when comparing a space (character code 32) to a non-breaking space (character code 160). To the user they look the same, so they expect a match. But when the app does the compare, there is no match.

What is the best way to go about this? Am I going to have to go to all parts of the code that do a string compare and manually normalize non-breaking spaces to spaces? Does .NET offer anything to help with that? (I've tried all the compare options but none seem to help.)

It has been suggested that I normalize the strings upon receipt and then let the string compare method simply compare the normalized strings. I'm not sure that would be straightforward, because what is a normalized string in the first place? What do I normalize it to? Sure, for now I can convert non-breaking spaces to breaking spaces. But what else can show up? Can there potentially be very many of these rules? Might they even conflict? (In one case I want to use a rule and in another I don't.)

Accepted answer by John Kraft

If it were me, I would 'normalize' the strings as I 'pulled them in'; probably with a string.Replace(). Then you won't need to change your comparisons anywhere else.
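
As a minimal sketch of normalizing at the point of entry: the class and method names below (InputNormalizer, Normalize) are hypothetical and not from the answer. The only rule applied is the one discussed here, mapping U+00A0 to a regular space with string.Replace.

```csharp
using System;

public static class InputNormalizer
{
    // Hypothetical helper: call it once wherever a string enters the app,
    // so that later comparisons can stay unchanged.
    public static string Normalize(string input)
    {
        if (input == null) return null;
        // string.Replace returns a new string; the original is not modified.
        return input.Replace('\u00A0', ' ');
    }
}
```

With this in place, InputNormalizer.Normalize(userInput) would be called when the value is read, and the existing comparison code would only ever see regular spaces.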

Edit: Mark, that's a tough one. It's really up to you, or your clients, as to what is a 'normalized' string. I've been in a similar situation where the customer demanded that strings like:

I have 4 apples.
I have four apples.

were actually equal. You may need separate normalizers for different situations. Either way, I would still do the normalization upon retrieval of the original strings.
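
One way to picture "separate normalizers for different situations" is a pair of functions that the calling code chooses between. This is only a sketch: the Normalizers class and its member names are assumptions for illustration. The second rule uses the built-in string.Normalize with NormalizationForm.FormKC, which maps the non-breaking space to a regular space but also folds other compatibility characters, so it may be too broad for some inputs. Neither rule would make "4" equal "four"; that kind of equivalence needs custom logic.

```csharp
using System;
using System.Text;

public static class Normalizers
{
    // Narrow rule: only the non-breaking space (U+00A0) becomes a regular space.
    public static readonly Func<string, string> SpacesOnly =
        s => s.Replace('\u00A0', ' ');

    // Broader rule: Unicode compatibility normalization (NFKC). It maps
    // U+00A0 to U+0020 and also folds other compatibility characters,
    // for example full-width letters to their ASCII forms.
    public static readonly Func<string, string> Compatibility =
        s => s.Normalize(NormalizationForm.FormKC);
}
```

The caller picks the rule per source, for example Normalizers.SpacesOnly for values that must otherwise stay byte-for-byte intact and Normalizers.Compatibility for free-form user input.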

Answered by user127143

I went through lots of pain to find this simple answer. The code below uses a regular expression to replace non-breaking spaces with normal spaces.

// requires: using System.Text.RegularExpressions;
string cellText = "String\u00A0with\u00A0non-breaking\u00A0spaces.";
cellText = Regex.Replace(cellText, @"\u00A0", " ");

Hope this helps, Dan
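
If other space-like characters may also show up, the same regular-expression approach can be widened. The sketch below is an assumption-laden variation, not the answerer's code: the SpaceFolder class and FoldSpaces method are hypothetical names, and the pattern \p{Zs} matches the whole Unicode "space separator" category (which includes U+00A0, the various em/en spaces, and the ideographic space) instead of just the non-breaking space.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SpaceFolder
{
    // \p{Zs} matches any character in the Unicode "space separator" category.
    // Tabs and line breaks are not in that category and are left alone.
    private static readonly Regex AnySpace = new Regex(@"\p{Zs}");

    // Replaces each space-separator character with a regular space (U+0020).
    public static string FoldSpaces(string text) => AnySpace.Replace(text, " ");
}
```

Whether folding every space separator is desirable depends on the data; for the case in the question, replacing only U+00A0 is enough.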

Answered by Mark Sowul

I'd suggest creating your own string comparer that extends one of the original ones -- do the "normalization" there (replace non-breaking space with regular space). In addition to the instance Equals method, there's a static String.Equals that takes a comparer.
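
A rough sketch of such a comparer, assuming it wraps an existing StringComparer and delegates to it after replacing U+00A0: the class name NbspInsensitiveComparer and the helper Fold are hypothetical. As far as I know, the static String.Equals overloads accept a StringComparison value rather than a comparer instance, so this sketch is used through the comparer's own Equals and Compare methods, or by passing it to a collection.

```csharp
using System;

// Hypothetical comparer: treats U+00A0 as a regular space, then delegates
// the actual comparison to the wrapped StringComparer.
public sealed class NbspInsensitiveComparer : StringComparer
{
    private readonly StringComparer _inner;

    public NbspInsensitiveComparer(StringComparer inner)
    {
        _inner = inner ?? throw new ArgumentNullException(nameof(inner));
    }

    // Null-safe replacement of the non-breaking space.
    private static string Fold(string s) => s?.Replace('\u00A0', ' ');

    public override int Compare(string x, string y) => _inner.Compare(Fold(x), Fold(y));

    public override bool Equals(string x, string y) => _inner.Equals(Fold(x), Fold(y));

    // Hash the folded string so equal strings produce equal hash codes.
    public override int GetHashCode(string obj) => _inner.GetHashCode(Fold(obj));
}
```

For example, new NbspInsensitiveComparer(StringComparer.OrdinalIgnoreCase) could be passed to a Dictionary<string, T> or HashSet<string> constructor so that lookups ignore the difference between the two kinds of space. The trade-off against normalizing on input is that every comparison pays for the replacement again.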

Answered by John

The same without regex, mostly for myself when I need it later:

text.Replace('\u00A0', ' ')

Answered by Gaurravs

It needs to be

text.Replace('\u00A0', ' ')

where \u00A0 is the non-breaking space.

This will replace the non-breaking space with a normal space.