C#: Filter a String

Notice: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/907995/

Date: 2020-08-06 02:33:57  Source: igfitidea

Filter a String

Tags: c#, string, filtering

Asked by Gabe

I want to make sure a string has only characters in this range

[a-z] && [A-Z] && [0-9] && [-]

so all letters and numbers plus the hyphen. I tried this...

C# App:

        char[] filteredChars = { ',', '!', '@', '#', '$', '%', '^', '&', '*', '(', ')', '_', '+', '=', '{', '}', '[', ']', ':', ';', '"', '\'', '?', '/', '.', '<', '>', '\\', '|' };
        string s = str.TrimStart(filteredChars);

This TrimStart() only seems to work with letters, not other characters like $, %, etc.

Did I implement it wrong? Is there a better way to do it?

I just want to avoid looping through each string's index checking because there will be a lot of strings to do...

Thoughts?

Thanks!

Accepted answer by Tomas Aschan

This seems like a perfectly valid reason to use a regular expression.

bool stringIsValid = Regex.IsMatch(inputString, @"^[a-zA-Z0-9\-]*?$");

In response to miguel's comment, you could do this to remove all unwanted characters:

string cleanString = Regex.Replace(inputString, @"[^a-zA-Z0-9\-]", "");

Note that the caret (^) is now placed inside the character class, thus negating it (matching any non-allowed character).
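
To make the difference between the two patterns concrete, here is a minimal sketch that runs both against sample inputs. The class name HyphenAlnumFilter, the method names IsValid and Clean, and the sample strings are illustrative assumptions, not part of the original answer. The only deliberate change is that the Regex objects are cached in static fields, which may help when many strings have to be checked.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class; names are assumptions made for this sketch.
public static class HyphenAlnumFilter
{
    // Anchored pattern: the whole string must consist of allowed characters.
    private static readonly Regex Valid = new Regex(@"^[a-zA-Z0-9\-]*$");

    // Negated class: matches each single character that is NOT allowed.
    private static readonly Regex Unwanted = new Regex(@"[^a-zA-Z0-9\-]");

    public static bool IsValid(string input) => Valid.IsMatch(input);

    public static string Clean(string input) => Unwanted.Replace(input, "");

    public static void Main()
    {
        Console.WriteLine(IsValid("abc-123"));   // True
        Console.WriteLine(IsValid("abc_123!"));  // False
        Console.WriteLine(Clean("abc_123!"));    // abc123
    }
}
```

One detail worth knowing: in .NET, $ also matches just before a trailing newline, so \z is the stricter end anchor if that case matters for your input.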

Answer by miguel

Why not just use replace instead? Trimstart will only remove the leading characters in your list...
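
A small sketch of the behaviour described here, using a shortened character list and a made-up input string (both are assumptions for illustration). TrimStart stops at the first character that is not in the set, while a Replace call per unwanted character removes every occurrence.

```csharp
using System;

public static class TrimStartDemo
{
    public static void Main()
    {
        // Shortened, illustrative list; the question uses a longer one.
        char[] filteredChars = { '$', '%', '!' };
        string input = "$%ab$c!";

        // Only the leading '$' and '%' are removed; trimming stops at 'a'.
        Console.WriteLine(input.TrimStart(filteredChars)); // ab$c!

        // Replace removes every occurrence, wherever it appears.
        string cleaned = input;
        foreach (char c in filteredChars)
        {
            cleaned = cleaned.Replace(c.ToString(), "");
        }
        Console.WriteLine(cleaned); // abc
    }
}
```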

Answer by Joel

I'm sure that with a bit more time you can come up with something better, but this will give you a good idea:

public string NumberOrLetterOnly(string s)
{
    string rtn = s;
    for (int i = 0; i < s.Length; i++)
    {
        if (!char.IsLetterOrDigit(rtn[i]) && rtn[i] != '-')
        {
            rtn = rtn.Replace(rtn[i].ToString(), " ");
        }
    }
    return rtn.Replace(" ", "");
}
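
One caveat with the method above: char.IsLetterOrDigit also accepts non-ASCII letters and digits (for example 'é'), which is wider than the a-z, A-Z, 0-9 range in the question. The following is a rough single-pass variant under that stricter reading. The class and method names are hypothetical, and it swaps in a StringBuilder rather than repeated Replace calls.

```csharp
using System;
using System.Text;

// Hypothetical helper; not from the original answer.
public static class AsciiFilter
{
    public static string KeepAsciiAlnumAndHyphen(string s)
    {
        var sb = new StringBuilder(s.Length);
        foreach (char c in s)
        {
            // Explicit ASCII ranges instead of char.IsLetterOrDigit.
            bool allowed = (c >= 'a' && c <= 'z')
                        || (c >= 'A' && c <= 'Z')
                        || (c >= '0' && c <= '9')
                        || c == '-';
            if (allowed)
            {
                sb.Append(c);
            }
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // 'é' and 'ö' are letters for char.IsLetterOrDigit but are dropped here.
        Console.WriteLine(KeepAsciiAlnumAndHyphen("Héllo, wörld-42!")); // Hllowrld-42
    }
}
```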

Answer by JaredPar

Try the following

public bool isStringValid(string input) {
  if ( null == input ) { 
    throw new ArgumentNullException("input");
  }
  return System.Text.RegularExpressions.Regex.IsMatch(input, @"^[A-Za-z0-9\-]*$");
}

Answer by Judah Gabriel Himango

Here's a fun way to do it with LINQ - no ugly loops, no complicated RegEx:

private string GetGoodString(string input)
{
   var allowedChars = 
      Enumerable.Range('0', 10).Concat(
      Enumerable.Range('A', 26)).Concat(
      Enumerable.Range('a', 26)).Concat(
      Enumerable.Range('-', 1));

   var goodChars = input.Where(c => allowedChars.Contains(c));
   return new string(goodChars.ToArray());
}

Feed it "Hello, world? 123!" and it will return "Helloworld123".
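
In the version above, allowedChars is a lazily evaluated sequence, so Contains walks it again for every input character. If that matters for a large number of strings, one possible variant is to build the set once. This is a sketch only: the class name AllowedCharFilter and the static field are assumptions, and a HashSet<char> is swapped in for the concatenated ranges.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical variant of GetGoodString; not the original author's code.
public static class AllowedCharFilter
{
    // Built once and reused for every call.
    private static readonly HashSet<char> AllowedChars = new HashSet<char>(
        "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz-");

    public static string GetGoodString(string input)
    {
        // Keep only characters that are in the allowed set.
        return new string(input.Where(c => AllowedChars.Contains(c)).ToArray());
    }

    public static void Main()
    {
        Console.WriteLine(GetGoodString("Hello, world? 123!")); // Helloworld123
    }
}
```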

Answer by Tore Aurstad

I have tested these two solutions in LINQPad 5. Their benefit is that they work not only for integers but also for decimals/floats, because they keep the decimal separator, which is culture dependent. For example, in Norway we use the comma as the decimal separator, whereas in the US the dot is used and the comma serves as a thousands separator. First the LINQ version, then the Regex version. The most verbose part is reading the number separator from the current thread's culture; you can shorten this with a using static directive at the top of the code, or better, put the functionality into C# extension methods, preferably with overloads that accept arbitrary Regex patterns.

string crappyNumber = @"40430dfkZZZdfldslkggh430FDFLDEFllll340-DIALNOWFORCHRISTSAKE.,CAKE-FORFIRSTDIAL920932903209032093294fa?j##R#KKL##K";

string.Join("", crappyNumber.Where(c => char.IsDigit(c)|| c.ToString() == Thread.CurrentThread.CurrentCulture.NumberFormat.NumberDecimalSeparator)).Dump();

new String(crappyNumber.Where(c => Regex.IsMatch(c.ToString(), $@"^(\d|{Regex.Escape(Thread.CurrentThread.CurrentCulture.NumberFormat.NumberDecimalSeparator)})$")).ToArray()).Dump();

Note that in the code above, the Dump() method writes the results to LINQPad's output; your own code will of course skip that last part. We got each version down to a one-liner, but it is still a bit verbose and can be moved into C# extension methods as suggested.
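
As a rough illustration of the extension-method idea, here is one possible shape. The class name NumericStringExtensions, the method name KeepDigitsAndDecimalSeparator, the optional culture parameter, and the sample input are all assumptions made for this sketch; it follows the LINQ version and does not use LINQPad's Dump().

```csharp
using System;
using System.Globalization;
using System.Linq;

// Hypothetical extension class; names are assumptions for this sketch.
public static class NumericStringExtensions
{
    // Keeps digits and the decimal separator of the given culture
    // (defaults to the current culture when none is passed).
    public static string KeepDigitsAndDecimalSeparator(this string input, CultureInfo culture = null)
    {
        culture = culture ?? CultureInfo.CurrentCulture;
        string separator = culture.NumberFormat.NumberDecimalSeparator;
        return new string(input
            .Where(c => char.IsDigit(c) || c.ToString() == separator)
            .ToArray());
    }
}

public static class Program
{
    public static void Main()
    {
        string raw = "12a,5b.7";
        // Norwegian culture keeps the comma, the invariant culture keeps the dot.
        Console.WriteLine(raw.KeepDigitsAndDecimalSeparator(CultureInfo.GetCultureInfo("nb-NO")));
        Console.WriteLine(raw.KeepDigitsAndDecimalSeparator(CultureInfo.InvariantCulture)); // 125.7
    }
}
```

Note that this only filters characters: if the input contains several separators, the result keeps all of them and is not guaranteed to parse as a number.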

Also, building the result with new String(...) instead of string.Join is more compact syntax and less error prone.

We got a crappy number as input, but we managed to get our number in the end! And it is culture-aware in C#!
