C#: How do I verify that a string is in English?

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/2266088/

Time: 2020-08-07 00:53:33  Source: igfitidea

How do I verify that a string is in English?

c# string character-encoding

Asked by basvas

I read a string from the console. How do I make sure it only contains English characters and digits?

Accepted answer by LBushkin

Assuming that by "English characters" you simply mean the 26 letters of the Latin alphabet, this is a case where I would use a regular expression: ^[a-zA-Z0-9 ]*$ (the space inside the brackets allows spaces too; remove it to accept only letters and digits).

For example:

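// requires: using System; using System.Text.RegularExpressions;
string input = Console.ReadLine() ?? "";   // ReadLine returns null at end of input
if (Regex.IsMatch(input, "^[a-zA-Z0-9]*$")) // no space in the class, so spaces are rejected
{ /* your code */ }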

The benefit of regular expressions in this case is that all you really care about is whether or not a string matches a pattern, which is exactly where regular expressions work wonderfully. A regex clearly captures your intent, and it's easy to extend if your definition of "English characters" expands beyond just the 26 alphabetic ones.

There's a decent series of articles here that teaches more about regular expressions.

Jørn Schou-Rode's answer provides a great explanation of how the regular expression presented here works to match your input.

Answer by Jørn Schou-Rode

You could match it against this regular expression: ^[a-zA-Z0-9]*$

  • ^ matches the start of the string (i.e. no characters are allowed before this point)
  • [a-zA-Z0-9] matches any letter from a-z in lower or upper case, as well as the digits 0-9
  • * lets the previous match repeat zero or more times
  • $ matches the end of the string (i.e. no characters are allowed after this point)

To use the expression in a C# program, you will need to import System.Text.RegularExpressions and do something like this in your code:

bool match = Regex.IsMatch(input, "^[a-zA-Z0-9]*$");
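To make the one-liner above concrete, here is a minimal, self-contained sketch. The class name EnglishCheck and the helper IsEnglishAlphanumeric are made up for illustration and are not part of the original answer; only the pattern is taken from it. The null check is an added assumption, since Console.ReadLine() returns null at end of input.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EnglishCheck
{
    // Hypothetical helper (not from the original answer): true when the
    // input consists only of the letters a-z, A-Z and the digits 0-9.
    public static bool IsEnglishAlphanumeric(string input)
    {
        // Regex.IsMatch throws on null, so treat null as "no match".
        if (input == null) return false;
        return Regex.IsMatch(input, "^[a-zA-Z0-9]*$");
    }

    public static void Main()
    {
        Console.WriteLine(IsEnglishAlphanumeric("Hello123"));    // True
        Console.WriteLine(IsEnglishAlphanumeric("Hello world")); // False: the space is not in the class
        Console.WriteLine(IsEnglishAlphanumeric("h\u00e9llo"));  // False: 'é' is outside a-z
        Console.WriteLine(IsEnglishAlphanumeric(""));            // True: * also accepts the empty string
    }
}
```

If an empty string should be rejected, replacing * with + in the pattern requires at least one character.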

If you are going to test a lot of lines against the pattern, you might want to compile the expression:

Regex pattern = new Regex("^[a-zA-Z0-9]*$", RegexOptions.Compiled);

for (int i = 0; i < 1000; i++)
{
    string input = Console.ReadLine();
    pattern.IsMatch(input);
}

Answer by James Curran

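// requires: using System.Linq;
bool AllAscii(string str)
{
   // Char.IsLetterOrDigit alone also accepts non-English letters, so restrict to ASCII first
   return str.All(c => c < 128 && Char.IsLetterOrDigit(c));
}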
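One caveat about the AllAscii helper above: Char.IsLetterOrDigit returns true for letters and digits in any script, not only English ones. The following sketch shows the difference and adds an ASCII range check; the class and method names are illustrative and not from the answer.

```csharp
using System;
using System.Linq;

public static class AsciiCheck
{
    // Illustrative helper: true when every character is an ASCII letter or digit.
    public static bool AllAsciiLettersOrDigits(string str)
    {
        // c < 128 limits the check to ASCII; within that range
        // Char.IsLetterOrDigit accepts only A-Z, a-z and 0-9.
        return str.All(c => c < 128 && Char.IsLetterOrDigit(c));
    }

    public static void Main()
    {
        // 'é' (U+00E9) is a letter as far as Char.IsLetterOrDigit is concerned.
        Console.WriteLine("caf\u00e9".All(c => Char.IsLetterOrDigit(c))); // True
        Console.WriteLine(AllAsciiLettersOrDigits("caf\u00e9"));          // False
        Console.WriteLine(AllAsciiLettersOrDigits("abc123"));             // True
    }
}
```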

Answer by Andrey Shvydky

Something like this (if you want to control input):

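// requires: using System; using System.Text;
static string ReadLettersAndDigits() {
    StringBuilder sb = new StringBuilder();
    ConsoleKeyInfo keyInfo;
    // Read keys without echoing them automatically; stop at Enter
    while ((keyInfo = Console.ReadKey(true)).Key != ConsoleKey.Enter) {
        char c = char.ToLowerInvariant(keyInfo.KeyChar);
        // Accept only a-z (in either case) and 0-9; silently ignore every other key
        if (('a' <= c && c <= 'z') || ('0' <= c && c <= '9')) {
            sb.Append(keyInfo.KeyChar);
            Console.Write(keyInfo.KeyChar);  // echo exactly what was stored
        }
    }
    return sb.ToString();
}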

Answer by PurplePilot

Do you have web access? I assume that cannot be guaranteed, but Google has a language API that will detect the language of the text you pass to it (Google Language API).

Answer by Bhaskar

If you don't want to use RegEx, and just to provide an alternate solution, you can check the ASCII code of each character: if it lies within the ranges below, it is either an English letter or a digit (this might not be the best solution):

foreach (char ch in str)
{
    int x = (int)ch;
    if ((x >= 65 && x <= 90) || (x >= 97 && x <= 122))
    {
       // this is an English letter: A-Z (65-90) or a-z (97-122)
    }
    else if (x >= 48 && x <= 57)
    {
       // this is a digit: 0-9 (48-57)
    }
    else
    {
       // this is something different
    }
}

See http://en.wikipedia.org/wiki/ASCII for the full ASCII table.

But I still think RegEx is the best solution.

Answer by Erik A. Brandstadmoen

I agree with the regular expression answers. However, you could simplify the pattern to "^[\w]+$". \w is any "word character", which translates to [a-zA-Z_0-9] if you use a non-Unicode alphabet (in .NET that means passing RegexOptions.ECMAScript; by default \w also matches non-English Unicode letters). Note that + requires at least one character, and I don't know if you want underscores as well.

More on regexes in .net here: http://msdn.microsoft.com/en-us/library/ms972966.aspx#regexnet_topic8
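The behaviour of \w in .NET depends on the regex options, which a short sketch can make visible. The class and method names below are invented for illustration; RegexOptions.ECMAScript is the standard .NET option that restricts \w to [a-zA-Z_0-9].

```csharp
using System;
using System.Text.RegularExpressions;

public static class WordCharCheck
{
    // Default .NET behaviour: \w matches Unicode word characters.
    public static bool IsWordCharsDefault(string input) =>
        Regex.IsMatch(input, @"^[\w]+$");

    // ECMAScript-compliant behaviour: \w is equivalent to [a-zA-Z_0-9].
    public static bool IsWordCharsEcma(string input) =>
        Regex.IsMatch(input, @"^[\w]+$", RegexOptions.ECMAScript);

    public static void Main()
    {
        Console.WriteLine(IsWordCharsDefault("abc_123"));   // True
        Console.WriteLine(IsWordCharsEcma("abc_123"));      // True: the underscore is a word character
        Console.WriteLine(IsWordCharsDefault("caf\u00e9")); // True: 'é' is a Unicode letter
        Console.WriteLine(IsWordCharsEcma("caf\u00e9"));    // False
    }
}
```

So for an "English only" check, the explicit class [a-zA-Z0-9] or the ECMAScript option is the safer choice.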

Answer by Danny A

bool onlyEnglishCharacters = !EnglishText.Any(a => a > '~');

It seems cheap, but it worked for me: a legitimately easy answer. Hope it helps someone.

Answer by Ivan Ičin

As many pointed out, the accepted answer works only if there is a single word in the string. Since no answer covers the case of multiple words or even sentences in the string, here is the code:

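// true if the string contains at least one letter outside the ASCII range 63-126, i.e. a non-English letter
bool hasNonEnglishLetter = stringToCheck.Any(x => char.IsLetter(x) && !((int)x >= 63 && (int)x <= 126));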
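Wrapped in a method, the same idea might look like the sketch below. The names are hypothetical, and the explicit A-Z / a-z ranges are a substitution for the 63-126 range used above; combined with char.IsLetter they select the same characters. Spaces, digits and punctuation are ignored, so whole sentences pass as long as every letter is an English one.

```csharp
using System;
using System.Linq;

public static class SentenceCheck
{
    // Illustrative helper: true when the text contains a letter that is not A-Z or a-z.
    public static bool ContainsNonEnglishLetter(string text)
    {
        return text.Any(c => char.IsLetter(c)
            && !((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')));
    }

    public static void Main()
    {
        Console.WriteLine(ContainsNonEnglishLetter("Hello, world! 123")); // False
        Console.WriteLine(ContainsNonEnglishLetter("Hello, w\u00f6rld")); // True: 'ö' is not an English letter
    }
}
```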

Answer by Sina Karvandi

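One other way is to check whether IsLower and IsUpper both return false for a character. Note that this only rejects scripts without letter case (such as Arabic or Chinese); cased non-English letters, for example Cyrillic or accented Latin letters, still pass. Something like: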

    private bool IsAllCharEnglish(string Input)
    {
        foreach (var item in Input.ToCharArray())
        {
            if (!char.IsLower(item) && !char.IsUpper(item) && !char.IsDigit(item) && !char.IsWhiteSpace(item))
            {
                return false;
            }
        }
        return true;
    }

And to use it:

        string str = "????? abc";
        IsAllCharEnglish(str); // return false
        str = "These are english 123";
        IsAllCharEnglish(str); // return true