Regex Word macro that finds two words within a range of each other and then italicizes those words?
Original question: http://stackoverflow.com/questions/11354909/
Notice: this content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license, cite the original URL and author information, and attribute it to the original authors (not me).
Asked by pavja2
I'm just beginning to understand regular expressions, and I've found the learning curve fairly steep; Stack Overflow has been immensely helpful as I experiment. There is a particular Word macro I would like to write, but I have not figured out how to do it. I would like to find two words within 10 or so words of each other in a document and italicize just those two words. If the words are more than 10 words apart, or appear in a different order, the macro should leave them alone.
I have been using the following regular expression:
\bPanama\W+(?:\w+\W+){0,10}?Canal\b
However, it only lets me manipulate the entire matched string as a whole, including the words in between. Also, the .Replace function only lets me replace that string with a different string; it cannot change formatting.
Does anyone more experienced have an idea of how to make this work? Is it even possible?
EDIT: Here is what I have so far. I am having two problems. First, I don't know how to select only the words "Panama" and "Canal" from within a regular-expression match and change only those words (not the words in between). Second, I don't know how to replace a match with different formatting rather than a different string of text, probably because I'm unfamiliar with Word macros.
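Sub RegText()
    ' Early binding: needs a reference to "Microsoft VBScript Regular Expressions 5.5"
    Dim re As RegExp
    Dim para As Paragraph
    Dim rng As Range
    Set re = New RegExp
    re.Pattern = "\bPanama\W+(?:\w+\W+){0,10}?Canal\b"
    re.IgnoreCase = True
    re.Global = True
    For Each para In ActiveDocument.Paragraphs
        Set rng = para.Range
        ' leave the trailing paragraph mark out of the range
        rng.MoveEnd unit:=wdCharacter, Count:=-1
        Text$ = rng.Text + "Modified"
        ' swaps each whole match for a string; formatting is not touched
        rng.Text = re.Replace(rng.Text, Text$)
    Next para
End Sub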
OK, thanks to help from Tim Williams below, I put together the following solution. It's more than a little clumsy in some respects and by no means pure regexp, but it does get the job done: the regex loop marks each whole match with a font color, then three find-and-replace passes italicize "Panama" and "Canal" inside the marked text and reset the color. Brute-forcing the changes with search and replace is embarrassingly crude, but at least it works. If anyone has a better solution or idea, I'd be fascinated to hear it.
Sub RegText()
    Dim re As RegExp
    Dim para As Paragraph
    Dim rng As Range
    Dim txt As String
    Dim allmatches As MatchCollection, m As Match
    Set re = New RegExp
    re.Pattern = "\bPanama\W+(?:\w+\W+){0,13}?Canal\b"
    re.IgnoreCase = True
    re.Global = True
    For Each para In ActiveDocument.Paragraphs
        txt = para.Range.Text
        'any match?
        If re.Test(txt) Then
            'get all matches
            Set allmatches = re.Execute(txt)
            'look at each match and hilight corresponding range
            For Each m In allmatches
                Debug.Print m.Value, m.FirstIndex, m.Length
                Set rng = para.Range
                rng.Collapse wdCollapseStart
                rng.MoveStart wdCharacter, m.FirstIndex
                rng.MoveEnd wdCharacter, m.Length
                rng.Font.ColorIndex = wdOrange
            Next m
        End If
    Next para
    'pass 1: italicize "Panama" inside the marked text
    Selection.Find.ClearFormatting
    Selection.Find.Font.ColorIndex = wdOrange
    Selection.Find.Replacement.ClearFormatting
    Selection.Find.Replacement.Font.Italic = True
    With Selection.Find
        .Text = "Panama"
        .Replacement.Text = "Panama"
        .Forward = True
        .Wrap = wdFindContinue
        .Format = True
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
    'pass 2: italicize "Canal" inside the marked text
    Selection.Find.ClearFormatting
    Selection.Find.Font.ColorIndex = wdOrange
    Selection.Find.Replacement.ClearFormatting
    Selection.Find.Replacement.Font.Italic = True
    With Selection.Find
        .Text = "Canal"
        .Replacement.Text = "Canal"
        .Forward = True
        .Wrap = wdFindContinue
        .Format = True
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
    'pass 3: turn the marked text back to black
    Selection.Find.ClearFormatting
    Selection.Find.Font.ColorIndex = wdOrange
    Selection.Find.Replacement.ClearFormatting
    Selection.Find.Replacement.Font.ColorIndex = wdBlack
    With Selection.Find
        .Text = ""
        .Replacement.Text = ""
        .Forward = True
        .Wrap = wdFindContinue
        .Format = True
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
End Sub
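The three near-identical Find blocks above could probably be folded into small helpers. The following is only a sketch under some assumptions: `ItalicizeMarked` and `ClearMarker` are hypothetical names, `markerColor` stands for whatever font color the regex loop applied, and the replacement text `^&` (Word's "found text" code) is a swapped-in choice so that the original capitalization of each hit is kept.

```vba
' Hypothetical helper (not from the original post): italicize every
' occurrence of findText whose font carries the marker colour.
Sub ItalicizeMarked(doc As Document, findText As String, _
                    markerColor As WdColorIndex)
    With doc.Content.Find
        .ClearFormatting
        .Font.ColorIndex = markerColor
        .Replacement.ClearFormatting
        .Replacement.Font.Italic = True
        .Text = findText
        .Replacement.Text = "^&"      ' "^&" re-inserts the text that was found
        .Forward = True
        .Wrap = wdFindContinue
        .Format = True
        .MatchCase = False
        .MatchWildcards = False
        .Execute Replace:=wdReplaceAll
    End With
End Sub

' Hypothetical helper: reset everything carrying the marker colour to black,
' mirroring the third Find pass of the macro above.
Sub ClearMarker(doc As Document, markerColor As WdColorIndex)
    With doc.Content.Find
        .ClearFormatting
        .Font.ColorIndex = markerColor
        .Replacement.ClearFormatting
        .Replacement.Font.ColorIndex = wdBlack
        .Text = ""
        .Replacement.Text = ""
        .Forward = True
        .Wrap = wdFindContinue
        .Format = True
        .Execute Replace:=wdReplaceAll
    End With
End Sub
```

With helpers like these, the tail of the macro would shrink to three calls: `ItalicizeMarked` once for "Panama", once for "Canal", then `ClearMarker`, each with the same marker color. Working on `doc.Content` rather than `Selection` also leaves the user's cursor where it was.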
Accepted answer by Tim Williams
I'm a long way from being a decent Word programmer, but this might get you started.
EDIT: updated to include a parameterized version.
Sub Tester()
    HighlightIfClose ActiveDocument, "panama", "canal", wdBrightGreen
    HighlightIfClose ActiveDocument, "red", "socks", wdRed
End Sub

Sub HighlightIfClose(doc As Document, word1 As String, _
                     word2 As String, clrIndex As WdColorIndex)
    Dim re As RegExp
    Dim para As Paragraph
    Dim rng As Range
    Dim txt As String
    Dim allmatches As MatchCollection, m As Match
    Set re = New RegExp
    re.Pattern = "\b" & word1 & "\W+(?:\w+\W+){0,10}?" _
                 & word2 & "\b"
    re.IgnoreCase = True
    re.Global = True
    'use the document that was passed in, not ActiveDocument
    For Each para In doc.Paragraphs
        txt = para.Range.Text
        'any match?
        If re.Test(txt) Then
            'get all matches
            Set allmatches = re.Execute(txt)
            'look at each match and hilight corresponding range
            For Each m In allmatches
                Debug.Print m.Value, m.FirstIndex, m.Length
                'first word: starts where the match starts
                Set rng = para.Range
                rng.Collapse wdCollapseStart
                rng.MoveStart wdCharacter, m.FirstIndex
                rng.MoveEnd wdCharacter, Len(word1)
                rng.HighlightColorIndex = clrIndex
                'second word: ends where the match ends
                Set rng = para.Range
                rng.Collapse wdCollapseStart
                rng.MoveStart wdCharacter, m.FirstIndex + (m.Length - Len(word2))
                rng.MoveEnd wdCharacter, Len(word2)
                rng.HighlightColorIndex = clrIndex
            Next m
        End If
    Next para
End Sub
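Since the question asks for italics rather than highlighting, the same idea can be adapted to set `Font.Italic` directly, which would make the color-marking passes unnecessary. This is a minimal sketch, not the answerer's code: `ItalicizeIfClose`, `ItalicizeSpan`, `EscapeRegex` and the `maxGap` parameter are hypothetical names, the regex object is created late-bound with `CreateObject("VBScript.RegExp")` so no library reference is assumed, and, like the answer above, it assumes character offsets in `para.Range.Text` line up with `wdCharacter` moves (plain paragraphs without fields or hidden text).

```vba
' Sketch only; all procedure names here are illustrative.
Sub ItalicizeIfClose(doc As Document, word1 As String, word2 As String, _
                     Optional maxGap As Long = 10)
    Dim re As Object, m As Object
    Dim para As Paragraph

    Set re = CreateObject("VBScript.RegExp")   ' late binding
    ' Capture both words so their matched lengths can be read back
    re.Pattern = "\b(" & EscapeRegex(word1) & ")\W+(?:\w+\W+){0," & maxGap & _
                 "}?(" & EscapeRegex(word2) & ")\b"
    re.IgnoreCase = True
    re.Global = True

    For Each para In doc.Paragraphs
        For Each m In re.Execute(para.Range.Text)
            ' first word starts where the match starts
            ItalicizeSpan para, m.FirstIndex, Len(m.SubMatches(0))
            ' second word ends where the match ends
            ItalicizeSpan para, m.FirstIndex + m.Length - Len(m.SubMatches(1)), _
                          Len(m.SubMatches(1))
        Next m
    Next para
End Sub

' Italicize `length` characters starting `offset` characters into the paragraph
Private Sub ItalicizeSpan(para As Paragraph, offset As Long, length As Long)
    Dim rng As Range
    Set rng = para.Range
    rng.Collapse wdCollapseStart
    rng.MoveStart wdCharacter, offset
    rng.MoveEnd wdCharacter, length
    rng.Font.Italic = True
End Sub

' Put a backslash in front of regex metacharacters so the search words
' are matched literally
Function EscapeRegex(s As String) As String
    Dim i As Long, ch As String, result As String
    For i = 1 To Len(s)
        ch = Mid$(s, i, 1)
        If InStr("\^$.|?*+()[]{}", ch) > 0 Then result = result & "\"
        result = result & ch
    Next i
    EscapeRegex = result
End Function
```

A call such as `ItalicizeIfClose ActiveDocument, "Panama", "Canal"` would then italicize only the two words, in that order, when at most 10 words separate them. Escaping the inputs is a design choice for the case where a search word contains a character such as `.` or `+`.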
Answer by jay
If you only need to handle two words at a time, this worked for me, along the lines of what you tried.
foo ([a-zA-Z0-9]+? ){0,10}bar
Explanation: this grabs word 1 (foo) and the space after it, then matches any word made of alphanumeric characters ([a-zA-Z0-9]+?) followed by a space, between 0 and 10 times ({0,10}), then word 2 (bar).
This doesn't include full stops (I didn't know if you wanted them); if you do, just add . after 0-9 in the regex.
So your (pseudocode) syntax will be similar to:
$matches = preg_match_all(); // Your function to get regex matches in an array
foreach (those matches) {
replace(KEY_WORD, <i>KEY_WORD</i>);
}
Hopefully it helps. My tests are below, with a note on what each one matched.
Worked:
The foo this that bar blah (matched: foo this that bar)
The foo economic order war bar (matched: foo economic order war bar)
Didn't work:
The foo economic order. war bar (no match: "order." ends in a full stop)
The global foo order has been around for several centuries, over this period of time people have evolved different and intricate trade relationships dealing with situations such as agriculture and bar (no match: far more than 10 words, and a comma, sit between foo and bar)
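The test cases above can be reproduced from VBA with a small function. This is a sketch under stated assumptions: `FooBarMatch` and its `allowFullStops` flag are hypothetical names, the pattern is assumed to have a space after foo (without it the "worked" samples cannot match), and the regex object is created late-bound via `CreateObject("VBScript.RegExp")`.

```vba
' Returns the first match of the proximity pattern in sample,
' or an empty string when nothing matches. Illustrative name only.
Function FooBarMatch(sample As String, _
                     Optional allowFullStops As Boolean = False) As String
    Dim re As Object, found As Object
    Set re = CreateObject("VBScript.RegExp")
    If allowFullStops Then
        ' "." added after 0-9, as the answer suggests
        re.Pattern = "foo ([a-zA-Z0-9.]+? ){0,10}bar"
    Else
        re.Pattern = "foo ([a-zA-Z0-9]+? ){0,10}bar"
    End If
    Set found = re.Execute(sample)
    If found.Count > 0 Then FooBarMatch = found(0).Value
End Function

Sub TryFooBar()
    Debug.Print FooBarMatch("The foo this that bar blah")
    ' prints: foo this that bar
    Debug.Print FooBarMatch("The foo economic order war bar")
    ' prints: foo economic order war bar
    Debug.Print FooBarMatch("The foo economic order. war bar") = ""
    ' prints: True (the full stop blocks the match)
    Debug.Print FooBarMatch("The foo economic order. war bar", True)
    ' prints: foo economic order. war bar
End Sub
```

Because the character class only allows letters and digits, any punctuation between the two words (commas included) stops the match unless it is added to the class.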