Note: this page is a copy of a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original: http://stackoverflow.com/questions/15554132/

Time: 2020-09-11 20:10:14  Source: igfitidea

Pattern match count in excel (regex & vba)

Tags: regex, excel, vba, pattern-matching

Asked by James Burt

I have an Office 2007 .XLSX file containing over 5000 records like the one below (a single cell with multiple lines of text). The task: in the adjacent cell, put the count of incidents found in the cell. Looking at the data in cell A1, you can see 3 incidents:

Cell A1:

1/15/2013 1:30:11 AM Userx
Had to reboot system
1/15/2013 1:32:11 AM Userx
System running finished rebooting and appears to be working
11/15/2013 12:30:11 AM Userx
System hung again

The problem is that the date format isn't consistent: the day, month, and hour can each be one or two digits. The timestamps do always start on a new line, though.

My solution was to take the cell, split it on line breaks, cut each line off 5 characters after its last ':' and evaluate the result against my regular expression. After that, a basic tally and a text insert into the adjacent cell.

Below is an example of how the function would be called.

'calling function from another source:

thecount = CountOfDateValues(Range("a1").Value) 'get count
Range("b1").Value = thecount 'put count to adjacent cell

Is there any code that will take the string value and return a count of matches to the regular expression?

Answered by sir KitKat

You can also include newlines in the Pattern expression by using \n. This way, you don't have to split the text into an array:

Private Sub String_CountRegExp_Debug()

    'Input of the test text
    Dim TestText As String
    TestText = "1/15/2013 1:30:11 AM Userx" & vbNewLine & _
            "Had to reboot system" & vbNewLine & _
            "1/15/2013 1:32:11 AM Userx" & vbNewLine & _
            "System running finished rebooting and appears to be working" & vbNewLine & _
            "11/15/2013 12:30:11 AM Userx" & vbNewLine & _
            "System hung again"

    'Input of the Pattern: a timestamp, the rest of its line, and the line break
    Dim RE_Pattern As String
    RE_Pattern = "(\d{1,2})\/(\d{1,2})\/(\d{4})\s(\d{1,2}):(\d{1,2}):(\d{1,2})\s([AP]M).*\n"

    'Print the number of matches to the Immediate window
    Debug.Print String_CountRegExp(TestText, RE_Pattern)

End Sub

Public Function String_CountRegExp(Text As String, Pattern As String) As Long
'Count the number of Pattern matches in a string.

    'Set up regular expression object
    '(early binding: needs a reference to Microsoft VBScript Regular Expressions 5.5)
    Dim RE As New RegExp
    RE.Pattern = Pattern
    RE.Global = True
    RE.IgnoreCase = True
    RE.MultiLine = True
    'Retrieve all matches
    Dim Matches As MatchCollection
    Set Matches = RE.Execute(Text)
    'Return the count of matches
    String_CountRegExp = Matches.Count

End Function
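
To apply this idea to the 5000+ rows in the question, something like the following might work. It is only a sketch under a few assumptions: the records sit in column A of the active sheet starting at row 1, column B is free for the counts, and no cell in column A holds an error value. The names CountTimestampLines and FillTimestampCounts are made up for illustration and do not come from either answer. It also swaps in a slightly different technique: instead of matching the trailing \n, the pattern is anchored with ^ and relies on MultiLine, so a timestamp on the very last line of a cell is counted too. Late binding via CreateObject is used so no library reference has to be set.

```vba
Option Explicit

' Hypothetical helper (not from the original answers): counts the lines
' that begin with a timestamp such as "1/15/2013 1:30:11 AM".
Public Function CountTimestampLines(ByVal cellText As String) As Long
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")   ' late binding, no reference needed
    re.Global = True        ' find every match, not just the first
    re.MultiLine = True     ' let ^ match at the start of each line
    re.IgnoreCase = True    ' accept "am"/"pm" as well as "AM"/"PM"
    re.Pattern = "^\d{1,2}/\d{1,2}/\d{4}\s\d{1,2}:\d{2}:\d{2}\s[AP]M"
    CountTimestampLines = re.Execute(cellText).Count
End Function

' Hypothetical driver: assumes the data is in column A of the active sheet
' and writes each count into the same row of column B.
Public Sub FillTimestampCounts()
    Dim lastRow As Long
    Dim r As Long
    With ActiveSheet
        lastRow = .Cells(.Rows.Count, "A").End(xlUp).Row   ' last used row in column A
        For r = 1 To lastRow
            ' CStr fails on error values, hence the assumption above
            .Cells(r, "B").Value = CountTimestampLines(CStr(.Cells(r, "A").Value))
        Next r
    End With
End Sub
```

Run FillTimestampCounts from the Macros dialog. For the sample cell in the question, CountTimestampLines should give 3, since three of its lines start with a timestamp.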

Answered by James Burt

Below is the VBA code of a function that takes in a string value and returns the count of matches to the regular expression. I hope it is of use to someone.

Function CountOfDateValues(thetext)

Dim data() As String
Dim yourInput As String
yourInput = thetext
Dim TheSplitter As String
TheSplitter = Chr(10) 'the character that represents a line break in a cell

data = Split(yourInput, TheSplitter) 'creates an array with one string per line in the cell
Dim re As Object
Set re = CreateObject("VBScript.RegExp") 'late binding, no library reference needed
'regular expression that matches ##/##/#### ##:##:## ##
re.Pattern = "(?=\d)^(?:(?!(?:10\D(?:0?[5-9]|1[0-4])\D(?:1582))|(?:0?9\D(?:0?[3-9]|1[0-3])\D(?:1752)))((?:0?[13578]|1[02])|(?:0?[469]|11)(?!\/31)(?!-31)(?!\.31)|(?:0?2(?=.?(?:(?:29.(?!000[04]|(?:(?:1[^0-6]|[2468][^048]|[3579][^26])00))(?:(?:(?:\d\d)(?:[02468][048]|[13579][26])(?!\x20BC))|(?:00(?:42|3[0369]|2[147]|1[258]|09)\x20BC))))))|(?:0?2(?=.(?:(?:\d\D)|(?:[01]\d)|(?:2[0-8])))))([-.\/])(0?[1-9]|[12]\d|3[01])(?!0000)((?=(?:00(?:4[0-5]|[0-3]?\d)\x20BC)|(?:\d{4}(?!\x20BC)))\d{4}(?:\x20BC)?)(?:$|(?=\x20\d)\x20))?((?:(?:0?[1-9]|1[012])(?::[0-5]\d){0,2}(?:\x20[aApP][mM]))|(?:[01]\d|2[0-3])(?::[0-5]\d){1,2})?$"
re.Global = True

Dim i As Long
Dim Match As Object
Dim theCount As Long
theCount = 0
For i = LBound(data) To UBound(data) 'from first item in array to last item in array
    'test from the start of the line to 5 characters past its last ':'
    For Each Match In re.Execute(Left(data(i), InStrRev(data(i), ":") + 5))
        theCount = theCount + 1
    Next Match
Next i

CountOfDateValues = theCount

End Function

Reference URLs:

MS Access 2003 VBA String Split on Line Break

http://sourceforge.net/projects/regexbuilder/files/regexbuilder/1.4.0/

This tool made testing my regular expression against various date formats remarkably easy.

http://regexlib.com/DisplayPatterns.aspx?cattabindex=4&categoryid=5&p=2

I saved a lot of time by using a ready-made regular expression from here instead of crafting my own. Sadly I did not learn much by doing so, but it was worth it for this 'we need it done now' request.

*Note: There is a window for false positives if someone starts their worklog note with a timestamp. I pointed this out to the customer and they were fine with it.