Pattern match count in Excel (regex & VBA)
Original question: http://stackoverflow.com/questions/15554132/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow, not to this page.
Asked by James Burt
I have an Office 2007 .XLSX file containing over 5000 records like the one below (a single cell with multiple lines of text). The task: in the adjacent cell, put the number of incidents recorded in that cell. Looking at the data for cell A1, you can see 3 incidents:
Cell A1:
1/15/2013 1:30:11 AM Userx
Had to reboot system
1/15/2013 1:32:11 AM Userx
System running finished rebooting and appears to be working
11/15/2013 12:30:11 AM Userx
System hung again
The problem is that the date value isn't consistent. The day, month, and hour can be single or double digit, but they are always noted on a new line.
My code solution was to take the cell, split it on line breaks, trim each line to 5 characters past its last ':' and evaluate the result against my regular expression. After that, a basic tally is kept and the total is written to the adjacent cell.
Below is an example of how the function would be called.
'calling function from another source:
thecount = CountOfDateValues(Range("a1").Value) 'get count
Range("b1").Value = thecount 'put count to adjacent cell
Is there any code that will take the string value and return a count of matches to the regular expression?
Answered by sir KitKat
You can also include newlines in the Pattern expression by using \n. This way, you don't have to split the text into an array:
Private Sub String_CountRegExp_Debug()
    'Input of the test text
    Dim TestText As String
    TestText = "1/15/2013 1:30:11 AM Userx" & vbNewLine & _
               "Had to reboot system" & vbNewLine & _
               "1/15/2013 1:32:11 AM Userx" & vbNewLine & _
               "System running finished rebooting and appears to be working" & vbNewLine & _
               "11/15/2013 12:30:11 AM Userx" & vbNewLine & _
               "System hung again"
    'Input of the pattern: a timestamp, the rest of its line, then a line feed
    Dim RE_Pattern As String
    RE_Pattern = "(\d{1,2})\/(\d{1,2})\/(\d{4})\s(\d{1,2}):(\d{1,2}):(\d{1,2})\s([AP]M).*\n"
    Debug.Print String_CountRegExp(TestText, RE_Pattern)
End Sub

Public Function String_CountRegExp(Text As String, Pattern As String) As Long
    'Count the number of Pattern matches in a string.
    'Early binding: requires a reference to "Microsoft VBScript Regular Expressions 5.5"
    '(Tools > References in the VBA editor).
    'Set up regular expression object
    Dim RE As New RegExp
    RE.Pattern = Pattern
    RE.Global = True        'find every match, not just the first
    RE.IgnoreCase = True
    RE.MultiLine = True
    'Retrieve all matches
    Dim Matches As MatchCollection
    Set Matches = RE.Execute(Text)
    'Return the number of matches
    String_CountRegExp = Matches.Count
End Function
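One caveat with a pattern that ends in \n is that a timestamp on the very last line of a cell, with no line break after it, would not be counted. A minimal sketch of a variant that anchors on the start of each line instead might look like the following. The function name CountTimestampLines and its pattern are illustrative assumptions, not part of the original answer; it uses late binding through CreateObject, so no library reference is needed.

```vba
' Hypothetical helper (not from the original answer): counts the lines that
' begin with a timestamp such as "1/15/2013 1:30:11 AM".
Public Function CountTimestampLines(ByVal Text As String) As Long
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")   ' late binding
    ' ^ anchors at the start of each line because MultiLine is True,
    ' so no trailing line break is required after the timestamp.
    re.Pattern = "^\d{1,2}/\d{1,2}/\d{4} \d{1,2}:\d{2}:\d{2} [AP]M\b"
    re.Global = True
    re.IgnoreCase = True
    re.MultiLine = True
    CountTimestampLines = re.Execute(Text).Count
End Function
```

It could be called the same way as the function in the question, for example Range("b1").Value = CountTimestampLines(Range("a1").Value). Like any pattern-only approach, it cannot tell a real log header from a note that happens to start with a timestamp.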
Answered by James Burt
Below is the VBA code of a function that takes in a string value and returns the count of matches to the regular expression. I hope it is of use to someone.
Function CountOfDateValues(thetext) As Long
    Dim yourInput As String
    yourInput = thetext
    Dim TheSplitter As String
    TheSplitter = Chr(10) 'the character that represents a line break
    Dim data() As String
    data = Split(yourInput, TheSplitter) 'creates an array of strings, one per line in the cell
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    'regular expression that matches ##/##/#### ##:##:## ##
    're.MultiLine is left off because each line is tested on its own
    re.Pattern = "(?=\d)^(?:(?!(?:10\D(?:0?[5-9]|1[0-4])\D(?:1582))|(?:0?9\D(?:0?[3-9]|1[0-3])\D(?:1752)))((?:0?[13578]|1[02])|(?:0?[469]|11)(?!\/31)(?!-31)(?!\.31)|(?:0?2(?=.?(?:(?:29.(?!000[04]|(?:(?:1[^0-6]|[2468][^048]|[3579][^26])00))(?:(?:(?:\d\d)(?:[02468][048]|[13579][26])(?!\x20BC))|(?:00(?:42|3[0369]|2[147]|1[258]|09)\x20BC))))))|(?:0?2(?=.(?:(?:\d\D)|(?:[01]\d)|(?:2[0-8])))))([-.\/])(0?[1-9]|[12]\d|3[01])\2(?!0000)((?=(?:00(?:4[0-5]|[0-3]?\d)\x20BC)|(?:\d{4}(?!\x20BC)))\d{4}(?:\x20BC)?)(?:$|(?=\x20\d)\x20))?((?:(?:0?[1-9]|1[012])(?::[0-5]\d){0,2}(?:\x20[aApP][mM]))|(?:[01]\d|2[0-3])(?::[0-5]\d){1,2})?$"
    re.Global = True
    Dim theCount As Long
    theCount = 0
    Dim i As Long
    Dim Match As Object
    For i = LBound(data) To UBound(data) 'from first item in array to last item in array
        'test only the part of the line from its start to 5 characters past the last ':'
        For Each Match In re.Execute(Left(data(i), InStrRev(data(i), ":") + 5))
            theCount = theCount + 1
        Next Match
    Next i
    CountOfDateValues = theCount
End Function
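To fill the adjacent cell for every record rather than a single one, the function could be called in a loop. The sketch below assumes the records sit in column A of the active sheet starting at row 1, with the counts written to column B; the procedure name FillIncidentCounts is hypothetical and not part of the original answer.

```vba
' Hypothetical driver: writes the incident count for each cell in column A
' into the cell next to it in column B.
Sub FillIncidentCounts()
    Dim ws As Worksheet
    Dim lastRow As Long
    Dim r As Long
    Set ws = ActiveSheet
    ' Last used row in column A (assumes the data starts at row 1)
    lastRow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row
    For r = 1 To lastRow
        ws.Cells(r, "B").Value = CountOfDateValues(ws.Cells(r, "A").Value)
    Next r
End Sub
```

Cells holding an Excel error value such as #N/A would raise a type mismatch inside CountOfDateValues, so a real workbook may need a check with IsError before the call.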
Referenced URLs:
MS Access 2003 VBA String Split on Line Break
http://sourceforge.net/projects/regexbuilder/files/regexbuilder/1.4.0/
This tool made testing my regular expression against various date formats remarkably easy.
http://regexlib.com/DisplayPatterns.aspx?cattabindex=4&categoryid=5&p=2
I was able to save a lot of time crafting a regular expression by using a precrafted one from here. Sadly, I did not learn much by doing so, but I believe I saved a lot of time on this 'we need it done now' request.
*Note: There is a window for false positives if someone starts their worklog note with a timestamp. I pointed this out to the customer and they were fine with it.