C#: How can I get a regex to check that a string contains only alpha characters [a-z] or [A-Z]?

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/990364/

Date: 2020-08-06 05:03:15  Source: igfitidea

How can I get a regex to check that a string only contains alpha characters [a-z] or [A-Z]?

Tags: c#, asp.net, regex, verification

Asked by

I'm trying to create a regex to verify that a given string only has alpha characters a-z or A-Z. The string can be up to 25 letters long. (I'm not sure if regex can check length of strings)

Examples:
1. "abcdef" = true;
2. "a2bdef" = false;
3. "333" = false;
4. "j" = true;
5. "aaaaaaaaaaaaaaaaaaaaaaaaaa" = false; // 26 letters

Here is what I have so far... can't figure out what's wrong with it though

Regex alphaPattern = new Regex("[^a-z]|[^A-Z]");

I would think that would mean that the string could contain only upper or lower case letters from a-z, but when I match it to a string with all letters it returns false...

Also, any suggestions regarding efficiency of using regex vs. other verifying methods would be greatly appreciated.

Answered by Blixt

Regex lettersOnly = new Regex("^[a-zA-Z]{1,25}$");
  • ^ means "begin matching at start of string"
  • [a-zA-Z] means "match lower case and upper case letters a-z"
  • {1,25} means "match the previous item (the character class, see above) 1 to 25 times"
  • $ means "only match if cursor is at end of string"
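As a quick check, the pattern above can be tried against the five examples from the question. This is only a minimal sketch: the class and method names (`LettersOnlyDemo`, `IsLettersOnly`) are made up for illustration and do not come from the answer.

```csharp
using System.Text.RegularExpressions;

public static class LettersOnlyDemo
{
    // Pattern from the answer: 1 to 25 ASCII letters, anchored at both ends.
    private static readonly Regex LettersOnly = new Regex("^[a-zA-Z]{1,25}$");

    // Hypothetical helper: true when the whole input is 1 to 25 letters a-z / A-Z.
    public static bool IsLettersOnly(string input)
    {
        return LettersOnly.IsMatch(input);
    }
}
```

With this helper, `IsLettersOnly("abcdef")` and `IsLettersOnly("j")` return true, while `"a2bdef"`, `"333"` and a string of 26 a's return false, which agrees with the expected results listed in the question.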

Answered by Svante

Do I understand correctly that it can only contain either uppercase or lowercase letters?

new Regex("^([a-z]{1,25}|[A-Z]{1,25})$")

A regular expression seems to be the right thing to use for this case.

By the way, a caret ("^") in the first position inside a character class means "not", so your "[^a-z]|[^A-Z]" means "a character that is not a lowercase letter, or a character that is not an uppercase letter" (disregarding that a-z are not all letters).
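To make the effect of the caret concrete, here is a small sketch that compares the pattern from the question with the single-case pattern from this answer. The class and method names (`CaretDemo`, `MatchesOriginal`, `MatchesSingleCase`) are hypothetical and exist only for this illustration.

```csharp
using System.Text.RegularExpressions;

public static class CaretDemo
{
    // Pattern from the question: one character that is not a-z, OR one that is not A-Z.
    public static bool MatchesOriginal(string input)
    {
        return Regex.IsMatch(input, "[^a-z]|[^A-Z]");
    }

    // Pattern from this answer: 1 to 25 letters, all lowercase or all uppercase.
    public static bool MatchesSingleCase(string input)
    {
        return Regex.IsMatch(input, "^([a-z]{1,25}|[A-Z]{1,25})$");
    }
}
```

`MatchesOriginal` returns true for "abc" (a lowercase letter is "not uppercase"), for "ABC" and for "333"; it returns false only for the empty string. `MatchesSingleCase` accepts "abc" and "ABC" but rejects "aBc", so it fits only if mixed case is not allowed.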

Answered by Gumbo

The regular expression you are using is an alternation of [^a-z] and [^A-Z]. An expression of the form [^…] matches any character other than those described in the character set.

So overall, your expression matches any single character that is either not in a-z or not in A-Z.

What you need instead is a regular expression that matches only a-zA-Z:

[a-zA-Z]

To specify the length, anchor the expression to the start (^) and end ($) of the string and describe the length with the {n,m} quantifier, meaning at least n but not more than m repetitions:

^[a-zA-Z]{0,25}$
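One detail worth checking is the lower bound of the quantifier. The sketch below compares `{0,25}` with `{1,25}` on the empty string; the class and method names (`EmptyStringDemo`, `AllowsEmpty`, `RequiresOne`) are invented for the example.

```csharp
using System.Text.RegularExpressions;

public static class EmptyStringDemo
{
    // {0,25}: zero repetitions are allowed, so the empty string passes.
    public static bool AllowsEmpty(string input)
    {
        return Regex.IsMatch(input, "^[a-zA-Z]{0,25}$");
    }

    // {1,25}: at least one letter is required.
    public static bool RequiresOne(string input)
    {
        return Regex.IsMatch(input, "^[a-zA-Z]{1,25}$");
    }
}
```

`AllowsEmpty("")` returns true and `RequiresOne("")` returns false; both return true for "abc". Which lower bound is right depends on whether an empty value should count as valid.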

Answered by Rene Saarsoo

The string can be up to 25 letters long. (I'm not sure if regex can check length of strings)

Regexes certainly can check the length of a string, as can be seen from the answers posted by others.

However, when you are validating a user input (say, a username), I would advise doing that check separately.

The problem is that a regex can only tell you whether a string matched it or not. It won't tell you why it didn't match. Was the text too long, or did it contain disallowed characters? You can't tell. It's far from friendly when a program says: "The supplied username contained invalid characters or was too long". Instead, you should provide separate error messages for the different situations.
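One possible way to separate the checks is sketched below. Everything here is an assumption made for illustration: the class name `UsernameValidator`, the method `GetError`, the convention of returning null for a valid value, and the wording of the messages.

```csharp
using System.Text.RegularExpressions;

public static class UsernameValidator
{
    private const int MaxLength = 25;

    // Returns null when the value is acceptable, otherwise a message naming the specific problem.
    public static string GetError(string value)
    {
        if (string.IsNullOrEmpty(value))
        {
            return "The username must not be empty.";
        }
        if (value.Length > MaxLength)
        {
            return "The username is too long (maximum 25 characters).";
        }
        // The negated class finds the first character that is not a letter a-z / A-Z.
        if (Regex.IsMatch(value, "[^a-zA-Z]"))
        {
            return "The username may contain only the letters a-z and A-Z.";
        }
        return null;
    }
}
```

Here the length test runs before the character test, so a value that is both too long and contains digits reports only the length problem; a caller that wants every problem reported could collect the messages in a list instead.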

Answered by zainnab

I'm trying to create a regex to verify that a given string only has alpha characters a-z or A-Z.

Easily done, as many of the others have indicated, using what are known as "character classes". Essentially, these allow us to specify a range of values to use for matching. (NOTE: for simplification, I am assuming implicit ^ and $ anchors, which are explained later in this post.)

[a-z] Match any single lower-case letter.
ex: a matches, 8 doesn't match

[A-Z] Match any single upper-case letter.
ex: A matches, a doesn't match

[0-9] Match any single digit zero to nine.
ex: 8 matches, a doesn't match

[aeiou] Match only on a or e or i or o or u.
ex: o matches, z doesn't match

[a-zA-Z] Match any single lower-case OR upper-case letter.
ex: A matches, a matches, 3 doesn't match

These can, naturally, be negated as well:

[^a-z] Match anything that is NOT a lower-case letter.
ex: 5 matches, A matches, a doesn't match

[^A-Z] Match anything that is NOT an upper-case letter.
ex: 5 matches, A doesn't match, a matches

[^0-9] Match anything that is NOT a digit.
ex: 5 doesn't match, A matches, a matches

[^Aa69] Match anything as long as it is not A or a or 6 or 9.
ex: 5 matches, A doesn't match, a doesn't match, 3 matches

To see some common character classes, go to: http://www.regular-expressions.info/reference.html

The string can be up to 25 letters long. (I'm not sure if regex can check length of strings)

You can absolutely check "length", but not in the way you might imagine. Strictly speaking, we measure repetition, NOT length, using {}:

a{2} Match two a's together.
ex: a doesn't match, aa matches, aca doesn't match

4{3} Match three 4's together.
ex: 4 doesn't match, 44 doesn't match, 444 matches, 4434 doesn't match

Repetition has values we can set to have lower and upper limits:

a{2,} Match on two or more a's together.
ex: a doesn't match, aa matches, aaa matches, aba doesn't match, aaaaaaaaa matches

a{2,5} Match on two to five a's together.
ex: a doesn't match, aa matches, aaa matches, aba doesn't match, aaaaaaaaa doesn't match

Repetition extends to character classes, so:

[a-z]{5} Match any five lower-case characters together.
ex: bubba matches, Bubba doesn't match, BUBBA doesn't match, asdjo matches

[A-Z]{2,5} Match two to five upper-case characters together.
ex: bubba doesn't match, Bubba doesn't match, BUBBA matches, BUBBETTE doesn't match

[0-9]{4,8} Match four to eight digits together.
ex: bubba doesn't match, 15835 matches, 44 doesn't match, 3456876353456 doesn't match

[a3g]{2} Match any two characters together, each of which is an a OR 3 OR g.
ex: aa matches, ba doesn't match, 33 matches, 38 doesn't match, a3 matches (each repetition may be a different character from the class)
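The repetition examples can be checked with a small helper. This is a sketch under one assumption taken from the note earlier in this answer: the pattern is treated as if it had implicit ^ and $ anchors, so the helper adds them explicitly. The names `RepetitionDemo` and `MatchesWhole` are hypothetical.

```csharp
using System.Text.RegularExpressions;

public static class RepetitionDemo
{
    // Wraps the pattern in ^(?:...)$ so that it must cover the whole input,
    // mirroring the "implicit anchors" assumption used in the examples.
    public static bool MatchesWhole(string pattern, string input)
    {
        return Regex.IsMatch(input, "^(?:" + pattern + ")$");
    }
}
```

For instance, `MatchesWhole("a{2}", "aa")` is true and `MatchesWhole("a{2}", "aca")` is false; `MatchesWhole("a{2,5}", "aaaaaaaaa")` is false; `MatchesWhole("[A-Z]{2,5}", "BUBBETTE")` is false. Note that `MatchesWhole("[a3g]{2}", "a3")` is true: a quantifier repeats the character class, not the character that happened to match first.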

Now let's look at your regex:

[^a-z]|[^A-Z]
Translation: Match any character that is NOT a lowercase letter, OR any character that is NOT an upper-case letter. Since no character is both lowercase and upper-case, every character satisfies one side or the other.

To fix it so it meets your needs, we would rewrite it like this:

Step 1: Remove the negation
[a-z]|[A-Z]
Translation: Find any lowercase letter OR uppercase letter.

Step 2: While not strictly needed, let's clean up the OR logic a bit
[a-zA-Z]
Translation: Find any lowercase letter OR uppercase letter. Same as above, but now using only a single set of [].

Step 3: Now let's indicate "length"
[a-zA-Z]{1,25}
Translation: Find any lowercase letter OR uppercase letter repeated one to twenty-five times.

This is where things get funky. You might think you are done here, and you may well be, depending on the technology you are using.

Strictly speaking, the regex [a-zA-Z]{1,25} will match one to twenty-five upper- or lower-case letters ANYWHERE on a line:

[a-zA-Z]{1,25}
ex: a matches, aZgD matches, BUBBA matches, 243242hello242552 MATCHES
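In .NET, `Regex.IsMatch` looks for a match anywhere in the input, so the difference is easy to observe there. The sketch below compares the unanchored pattern with an anchored one; the class and method names (`AnchorDemo`, `Unanchored`, `Anchored`) are made up for the example.

```csharp
using System.Text.RegularExpressions;

public static class AnchorDemo
{
    // No anchors: succeeds if any run of 1 to 25 letters occurs somewhere in the input.
    public static bool Unanchored(string input)
    {
        return Regex.IsMatch(input, "[a-zA-Z]{1,25}");
    }

    // With ^ and $: the letters must make up the entire input.
    public static bool Anchored(string input)
    {
        return Regex.IsMatch(input, "^[a-zA-Z]{1,25}$");
    }
}
```

`Unanchored("243242hello242552")` returns true because "hello" is found in the middle, while `Anchored` returns false for the same input. The unanchored version also accepts a string of 26 letters, since the first 25 already form a match.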

In fact, every example I have given so far will do the same. If that is what you want then you are in good shape but based on your question, I'm guessing you ONLY want one to twenty-five upper or lower-case letters on the entire line. For that we turn to anchors. Anchors allow us to specify those pesky details:

^ beginning of a line
(I know, we just used this for negation earlier, don't get me started)

$ end of a line

We can use them like this:

^a{3} From the beginning of the line, match a three times together.
ex: aaa matches, 123aaa doesn't match, aaa123 matches

a{3}$ Match a three times together at the end of a line.
ex: aaa matches, 123aaa matches, aaa123 doesn't match

^a{3}$ Match a three times together for the ENTIRE line.
ex: aaa matches, 123aaa doesn't match, aaa123 doesn't match

Notice that aaa matches in all cases because, technically speaking, its three a's are at both the beginning and the end of the line.

So the final, technically correct solution for finding a "word" that is "up to twenty-five characters long" on a line would be:

^[a-zA-Z]{1,25}$

The funky part is that some technologies implicitly put anchors in the regex for you and some don't. You just have to test your regex or read the docs to see if you have implicit anchors.
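For .NET in particular, one end-of-string detail is worth testing: without `RegexOptions.Multiline`, `$` matches at the very end of the input and also just before a final newline, whereas `\z` matches only at the very end. The sketch below shows the difference; the class and method names (`EndAnchorDemo`, `WithDollar`, `WithBackslashZ`) are hypothetical.

```csharp
using System.Text.RegularExpressions;

public static class EndAnchorDemo
{
    // $ also matches just before a trailing "\n" at the end of the input.
    public static bool WithDollar(string input)
    {
        return Regex.IsMatch(input, @"^[a-zA-Z]{1,25}$");
    }

    // \z matches only at the very end of the input.
    public static bool WithBackslashZ(string input)
    {
        return Regex.IsMatch(input, @"^[a-zA-Z]{1,25}\z");
    }
}
```

`WithDollar("abc\n")` returns true, while `WithBackslashZ("abc\n")` returns false; both return true for "abc". If a trailing newline must be rejected, `\z` is the stricter choice.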

Answered by SO User

using System.Text.RegularExpressions;

/// <summary>
/// Checks whether a string contains only the letters a-z and A-Z and is 1 to 25 characters long
/// </summary>
/// <param name="value">String to be matched</param>
/// <returns>True if it matches, false otherwise</returns>
public static bool IsValidString(string value)
{
    // Anchored pattern: the whole string must consist of 1 to 25 ASCII letters.
    string pattern = @"^[a-zA-Z]{1,25}$";
    return Regex.IsMatch(value, pattern);
}
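The question also asked about alternatives to regex. A plain loop over the characters is one option, sketched below. The names `ManualCheck` and `IsValidStringNoRegex` are invented here, and no claim is made about which approach is faster; that would need to be measured for the actual workload.

```csharp
public static class ManualCheck
{
    // Same rule as the regex ^[a-zA-Z]{1,25}\z, written without a regex:
    // 1 to 25 characters, each an ASCII letter. Null and empty inputs are rejected.
    public static bool IsValidStringNoRegex(string value)
    {
        if (string.IsNullOrEmpty(value) || value.Length > 25)
        {
            return false;
        }
        foreach (char c in value)
        {
            bool isLower = c >= 'a' && c <= 'z';
            bool isUpper = c >= 'A' && c <= 'Z';
            if (!isLower && !isUpper)
            {
                return false;
            }
        }
        return true;
    }
}
```

The explicit range comparisons keep the check limited to ASCII letters; `char.IsLetter` would also accept letters from other alphabets, which is not what the question asks for.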