vba 如何通过VB宏遍历Word文档中的文本
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/17126690/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
How to iterate through text in the Word document by VB macro
提问by Yoda
I wanted to count chars in the Word document by Macro I have no idea how to get reference two the text in visual basic macro and go through it.
我想通过宏计算 Word 文档中的字符我不知道如何在 Visual Basic 宏中引用两个文本并通过它。
I would like to count how many of every char was in the document. For example in document:
我想计算文档中每个字符的数量。例如在文档中:
ABZBB
A x 1
B x 3
Z x 1
Sub Macro1()
Dim Box As Shape
Set Box = ActiveDocument.Shapes.AddTextbox( _
Orientation:=msoTextOrientationHorizontal, _
Left:=50, Top:=50, Width:=200, Height:=400)
Box.TextFrame.TextRange.Text = "My text comes this way" + Chr(10)
Dim s As String
Application.ScreenUpdating = False
docLength = ActiveDocument.Range.Characters.Count
Box.TextFrame.TextRange.Text = Box.TextFrame.TextRange.Text + "Text length is: " + Str(docLength) + Chr(10)
Dim arr(128) As Integer
Dim character As Integer
For i = 1 To docLength - 1
character = Asc(ActiveDocument.Range.Characters(i))
If iAsc >= 0 And iAsc <= 127 Then
arr(character) = arr(character) + 1
End If
Next i
End Sub
回答by Doug Glancy
Using VBA, to count the number of characters in the active document do:
使用 VBA 计算活动文档中的字符数,请执行以下操作:
ActiveDocument.Range.ComputeStatistics(wdStatisticCharacters)
or
或者
Activedocument.Range.Characters.Count
To get the count for the current selection:
要获取当前选择的计数:
Selection.Range.ComputeStatistics(wdStatisticCharacters)
or
或者
Selection.Range.Characters.Count
The second method in each example counts spaces as characters, the first doesn't.
每个示例中的第二种方法将空格计为字符,第一种方法则不然。
EDIT: I did some speed testing on various methods to count the instances of a char in a document.Regular expressions and stuffing the document contents into a string are fastest - many times faster than looping through each character or FIND
编辑:我对计算文档中字符实例的各种方法进行了一些速度测试。正则表达式和将文档内容填充到字符串中是最快的 - 比循环遍历每个字符或FIND
For my test document I copied the contents of this web page into a Word document. As an accuracy check, I used Word's Find
function/panel to find the number of instances of lower case "a". Before I edited this answer that was 409 instances.
对于我的测试文档,我将此网页的内容复制到 Word 文档中。作为准确性检查,我使用 Word 的Find
函数/面板来查找小写“a”的实例数。在我编辑这个答案之前,它是 409 个实例。
I then created four functions to count the number of instances of a character (any string actually) in a Word document. The first simply loops through each character in the doc, similar to Andrew's. The second uses the Find
function. The third stuffs the contents of the document into a string and loops through it. The fourth does the same thing but check the matches using a regular expression:
然后我创建了四个函数来计算 Word 文档中字符(实际上是任何字符串)的实例数。第一个简单地循环遍历文档中的每个字符,类似于 Andrew 的。第二个使用Find
函数。第三个将文档的内容填充到一个字符串中并循环遍历它。第四个做同样的事情,但使用正则表达式检查匹配:
Function GetCharCountLoop(doc As Word.Document, char As String) As Long
Dim i As Long
Dim CharCount As Long
With doc.Content.Characters
For i = 1 To .Count
If .Item(i) = char Then
CharCount = CharCount + 1
End If
Next i
End With
GetCharCountLoop = CharCount
End Function
Function GetCharCountFind(doc As Word.Document, char As String) As Long
Dim i As Long
Dim CharCount As Long
With doc.Content.Find
Do While .Execute(FindText:=char, Forward:=True, MatchWholeWord:=False, MatchCase:=True) = True
CharCount = CharCount + 1
Loop
GetCharCountFind = CharCount
End With
End Function
Function GetCharCountString(doc As Word.Document, char As String) As Long
Dim chars As String
Dim i As Long
Dim CharCount As Long
chars = doc.Content
For i = 1 To Len(chars)
If Mid$(chars, i, 1) = char Then
CharCount = CharCount + 1
End If
Next i
GetCharCountString = CharCount
End Function
Function GetCharCountRegex(doc As Word.Document, char As String) As Long
Dim chars As String
Dim CharCount As Long
Dim objRegExp As Object
chars = doc.Content
Set objRegExp = CreateObject("VBScript.RegExp")
With objRegExp
.Pattern = char
.IgnoreCase = False
.Global = True
CharCount = .Execute(chars).Count
End With
GetCharCountRegex = CharCount
End Function
I then tested them using this sub, running a single loop:
然后我使用这个子测试它们,运行一个循环:
Sub TimeMethods()
Dim char As String
Dim CharCount As Long
Dim LoopCounter As Long
Dim NumLoops As Long
Dim StartTime As Double
char = "a"
NumLoops = 1
StartTime = Timer
For LoopCounter = 1 To NumLoops
CharCount = GetCharCountLoop(ActiveDocument, char)
Next LoopCounter
Debug.Print CharCount
Debug.Print Timer - StartTime
StartTime = Timer
For LoopCounter = 1 To NumLoops
CharCount = GetCharCountFind(ActiveDocument, char)
Next LoopCounter
Debug.Print CharCount
Debug.Print Timer - StartTime
StartTime = Timer
For LoopCounter = 1 To NumLoops
CharCount = GetCharCountString(ActiveDocument, char)
Next LoopCounter
Debug.Print CharCount
Debug.Print Timer - StartTime
StartTime = Timer
For LoopCounter = 1 To NumLoops
CharCount = GetCharCountRegex(ActiveDocument, char)
Next LoopCounter
Debug.Print CharCount
Debug.Print Timer - StartTime
End Sub
The results are dramatic:
结果是戏剧性的:
GetCharCountLoop - 514.3046875 seconds
GetCharCountLoop - 514.3046875 秒
GetCharCountFind - 0.5859375 seconds
GetCharCountFind - 0.5859375 秒
GetCharCountString - 0.015625 seconds
GetCharCountString - 0.015625 秒
GetCharCountRegex - 0.015625 seconds
GetCharCountRegex - 0.015625 秒
I dropped GetCharCountLoop from the running and ran the other three 100 times. According to this rudimentary timing, stuffing the contents into a string and counting, or using a regular expression, are almost 50 times faster than the Find method:
我从运行中删除了 GetCharCountLoop 并运行了其他三个 100 次。根据这个基本的时序,将内容塞入字符串并计数,或者使用正则表达式,几乎比 Find 方法快 50 倍:
GetCharCountFind - 30.984375 seconds
GetCharCountFind - 30.984375 秒
GetCharCountString - 0.6328125 seconds
GetCharCountString - 0.6328125 秒
GetCharCountRegex - 0.578125 seconds
GetCharCountRegex - 0.578125 秒
Note that the slowness of the first method, looping through each character is most evident with longer docs. In my initial testing - a file with just a few words - it was only twice as slow as the Find method.
请注意,第一种方法的缓慢,循环遍历每个字符在较长的文档中最为明显。在我最初的测试中——一个只有几个词的文件——它的速度只有 Find 方法的两倍。
Also note that I originally turned off ScreenUpdating
per Andrew's subroutine, but it seems that makes no difference.
另请注意,我最初关闭了ScreenUpdating
每个安德鲁的子程序,但似乎没有区别。
回答by Andy G
Below is a simplistic, and perhaps slow, example of counting individual letters (and some other characters) in a document.
下面是一个计算文档中单个字母(和一些其他字符)的简单且可能很慢的示例。
Sub CountChars()
Dim iCount(57) As Integer
Dim x As Integer
Dim iTotal As Integer
Dim iAsc As Integer
Application.ScreenUpdating = False
iTotal = ActiveDocument.Range.Characters.Count
For x = 1 To iTotal
iAsc = Asc(ActiveDocument.Range.Characters(x))
If iAsc >= 65 And iAsc <= 122 Then
iCount(iAsc - 65) = iCount(iAsc - 65) + 1
End If
Next x
For x = 0 To 57
Debug.Print x, iCount(x)
Next x
Application.ScreenUpdating = True
End Sub
Change to
改成
Debug.Print Chr(x + 65), iCount(x)
to display the characters themselves.
来显示字符本身。
It may be possible to use Find
(somehow) to count occurrences of characters; otherwise it would require Regex.
可以使用Find
(以某种方式)计算字符的出现次数;否则它需要正则表达式。
Alternative using Replace:
替代使用替换:
'Tools, References: Microsoft Scripting Runtime
Sub CountCharsWithReplace()
Dim doc As Document
Dim rDupe As Range
Dim dicChars As Scripting.Dictionary
Dim s As String
Dim iTotalChars As Integer
Dim iTempChars As Integer
Dim iDiff As Integer
Dim n As Integer
Dim blnExec As Boolean
Dim lett As Variant
Application.ScreenUpdating = False
Set doc = ActiveDocument
iTotalChars = doc.Range.Characters.Count
Set rDupe = doc.Range
Set dicChars = New Scripting.Dictionary
Do While rDupe.Characters.Count > 1
s = rDupe.Characters(1).Text
blnExec = rDupe.Find.Execute(s, , , , , , , , , "", wdReplaceAll)
iTempChars = doc.Range.Characters.Count
iDiff = iTotalChars - iTempChars
iTotalChars = iTempChars
If Asc(s) >= 65 And Asc(s) <= 122 Then
dicChars.Add s, iDiff
End If
n = n + 1
Loop
ActiveDocument.Undo Times:=n
Application.ScreenUpdating = True
For Each lett In dicChars.Keys
Debug.Print lett, dicChars(lett)
Next lett
End Sub