Note: this page is taken from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/16117043/

Regular expression replace in C#

Tags: c#, regex

Asked by Curtis

I'm fairly new to using regular expressions, and, based on a few tutorials I've read, I'm unable to get this step in my Regex.Replace formatted properly.

Here's the scenario I'm working on: when I pull my data from the listbox, I want to format it into a CSV-like format and then save the file. Is using the Replace option an ideal solution for this scenario?

Example before the regular expression formatting:

FirstName LastName Salary    Position
-------------------------------------
John      Smith    $100,000.00  M

Proposed format after the regular expression replace:

John Smith,100000,M

Current output:

John,Smith,100000,M

*Note: is there a way I can replace the first comma with a space?

Snippet of my code:

using (var fs = new FileStream(filepath, FileMode.OpenOrCreate, FileAccess.Write))
{
    using (var sw = new StreamWriter(fs))
    {
        foreach (string stw in listBox1.Items)
        {
            // Piecing the list back to the original format
            string sb_trim = Regex.Replace(stw, @"[$,]", "");   // strip dollar signs and commas
            sb_trim = Regex.Replace(sb_trim, @"[.][0-9]+", ""); // drop the decimal part
            sb_trim = Regex.Replace(sb_trim, @"\s", ",");       // every whitespace character becomes a comma
            sw.WriteLine(sb_trim);
        }
    }
}

Accepted answer by Anirudha

You can do this with two replaces:

//let stw be "John Smith $100,000.00 M"

sb_trim = Regex.Replace(stw, @"\s+\$|\s+(?=\w+$)", ",");
//sb_trim becomes "John Smith,100,000.00,M"

sb_trim = Regex.Replace(sb_trim, @"(?<=\d),(?=\d)|[.]0+(?=,)", "");
//sb_trim becomes "John Smith,100000,M"

sw.WriteLine(sb_trim);
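
The two replacements can be tried in isolation with a sketch like the one below. It assumes each list item is a single-spaced string such as "John Smith $100,000.00 M", and it writes the dollar sign in the first pattern as `\$` so that it is matched literally. The names `CsvLineFormatter` and `ToCsvLine` are hypothetical, chosen only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original answer.
public static class CsvLineFormatter
{
    public static string ToCsvLine(string line)
    {
        // Step 1: replace the whitespace plus "$" before the salary, and the
        // whitespace before the final word, with a comma each.
        string result = Regex.Replace(line, @"\s+\$|\s+(?=\w+$)", ",");

        // Step 2: remove commas that sit between two digits, and a run of
        // zeros after the decimal point when a comma follows it.
        return Regex.Replace(result, @"(?<=\d),(?=\d)|[.]0+(?=,)", "");
    }

    public static void Main()
    {
        // Prints: John Smith,100000,M
        Console.WriteLine(ToCsvLine("John Smith $100,000.00 M"));
    }
}
```

Note that the second pattern only strips a fractional part made entirely of zeros, so a salary such as $100,000.50 would keep its ".50".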

Answer by Patrick D'Souza

Add the following two lines:

var regex = new Regex(Regex.Escape(","));
sb_trim = regex.Replace(sb_trim, " ", 1);

If sb_trim is "John,Smith,100000,M", the above code will return "John Smith,100000,M".
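
As a rough, self-contained sketch of this approach: the instance method `Regex.Replace(input, replacement, count)` takes a maximum number of replacements, so passing 1 touches only the first comma. The names `FirstCommaReplacer` and `ReplaceFirstComma` are hypothetical and used only for this example.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original answer.
public static class FirstCommaReplacer
{
    public static string ReplaceFirstComma(string line)
    {
        // Regex.Escape keeps the pattern literal; a comma has no special
        // meaning in a regex, so this is defensive rather than required.
        var regex = new Regex(Regex.Escape(","));

        // The third argument caps the number of replacements at one.
        return regex.Replace(line, " ", 1);
    }

    public static void Main()
    {
        // Prints: John Smith,100000,M
        Console.WriteLine(ReplaceFirstComma("John,Smith,100000,M"));
    }
}
```

This relies on the first comma always being the one between the first and last name, which holds for the sample data but is an assumption about the input.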

Answer by Zenexer

Try this:

sb_trim = Regex.Replace(stw, @"(\D+)\s+\$([\d,]+)\.\d+\s+(.)",
    m => string.Format(
        "{0},{1},{2}",
        m.Groups[1].Value,
        m.Groups[2].Value.Replace(",", string.Empty),
        m.Groups[3].Value));

This is about as clean an answer as you'll get, at least with regexes.

  • (\D+): First capture group. One or more non-digit characters.
  • \s+\$: One or more spacing characters, then a literal dollar sign ($).
  • ([\d,]+): Second capture group. One or more digits and/or commas.
  • \.\d+: Decimal point, then at least one digit.
  • \s+: One or more spacing characters.
  • (.): Third capture group. Any non-line-breaking character.

The second capture group additionally needs to have its commas stripped. You could do this with another regex, but it's really unnecessary and bad for performance. This is why we need to use a lambda expression and string format to piece together the replacement. If it weren't for that, we could just use this as the replacement, in place of the lambda expression:

",,"