VBA - Find preceding html tag
Original question: http://stackoverflow.com/questions/20912246/
Provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors: StackOverFlow
Asked by LoxBagel
Say I have HTML source that looks like this:
<div id="book-info">
<span class="title">Weather</span>
<span class="title">Title Of Book</span>
<p><a href="http://test.com?MMC_ID=34343">Buy Now</a></p>
</div>
What I need returned is "Title Of Book".
There are numerous instances of span class="title" but the one I need immediately precedes the only MMC_ID tag on the page, so I can use MMC_ID as a marker to get close to the span tag I need.
Question: How can I say "Grab the contents of the very first span tag to the left of MMC_ID"?
The code below works sometimes, but the number of span tags on the page varies, so it fails whenever the count differs from what the hard-coded index expects.
With CreateObject("msxml2.xmlhttp")
    .Open "GET", ActiveCell.Offset(0, -1).Value, False
    .Send
    htm.body.innerhtml = .ResponseText
End With
ExtractedText = htm.getElementById("book-info").getElementsByTagName("span")(1).innerText
Answered by ron
This should do it:
Text_1 = htm.getElementById("book-info").innerHTML
' Only proceed if the MMC_ID marker appears inside the book-info div
If InStr(1, Text_1, "MMC_ID", vbTextCompare) > 0 Then
    numb_spans = htm.getElementById("book-info").getElementsByTagName("span").Length
    ' The collection is zero-based, so the last span is at index numb_spans - 1
    ExtractedText = htm.getElementById("book-info").getElementsByTagName("span")(numb_spans - 1).innerText
End If
Answered by Dick Kusleika
You could loop through all the spans and stop when the child of the next sibling of the next sibling is an anchor element and contains the proper text.
' Early-bound types below need a reference to the Microsoft HTML Object Library
Sub test()

    Dim htm As HTMLDocument
    Dim ExtractedText As String
    Dim hSpan As HTMLSpanElement
    Dim hAnchor As HTMLAnchorElement

    Set htm = New HTMLDocument

    With CreateObject("msxml2.xmlhttp")
        .Open "GET", "file://///99991-dc01/99991/dkusleika/My%20Documents/test.html", False
        .Send
        htm.body.innerHTML = .ResponseText
    End With

    For Each hSpan In htm.getElementById("book-info").getElementsByTagName("span")
        ' Reset so a failed assignment does not reuse the previous iteration's anchor
        Set hAnchor = Nothing
        On Error Resume Next
        Set hAnchor = hSpan.NextSibling.NextSibling.FirstChild
        On Error GoTo 0

        If Not hAnchor Is Nothing Then
            If InStr(1, hAnchor.href, "MMC_ID", vbTextCompare) > 0 Then
                ExtractedText = hSpan.innerText
                Exit For
            End If
        End If
    Next hSpan

    Debug.Print ExtractedText

End Sub
Answered by Jean-François Corbett
Is it always the last span element? If so, just count how many elements
htm.getElementById("book-info").getElementsByTagName("span")
returns and grab the last one.
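If the span you want is not guaranteed to be the last one, the "use MMC_ID as a marker" idea from the question can also be sketched with plain string functions on the response text. This is only a rough sketch, not something from the answers above: SpanBeforeMarker and Demo are made-up names, and it assumes the span has no nested tags and that returning the raw text between the tags (not decoded innerText) is good enough.

```vba
Option Explicit

' Hypothetical helper (not from the original answers): returns the text inside
' the nearest <span ...>...</span> that opens before the first occurrence of marker.
Function SpanBeforeMarker(ByVal html As String, ByVal marker As String) As String
    Dim markerPos As Long, openPos As Long, textStart As Long, closePos As Long

    markerPos = InStr(1, html, marker, vbTextCompare)
    If markerPos = 0 Then Exit Function            ' marker not found: return ""

    ' Search backwards from the marker for the nearest opening span tag
    openPos = InStrRev(html, "<span", markerPos, vbTextCompare)
    If openPos = 0 Then Exit Function              ' no span before the marker

    ' Skip past the end of the opening tag, then read up to the closing tag
    textStart = InStr(openPos, html, ">") + 1
    closePos = InStr(textStart, html, "</span>", vbTextCompare)
    If textStart = 1 Or closePos = 0 Then Exit Function

    SpanBeforeMarker = Mid$(html, textStart, closePos - textStart)
End Function

Sub Demo()
    Dim html As String
    ' Same markup as the sample in the question, collapsed onto one string
    html = "<div id=""book-info"">" & _
           "<span class=""title"">Weather</span>" & _
           "<span class=""title"">Title Of Book</span>" & _
           "<p><a href=""http://test.com?MMC_ID=34343"">Buy Now</a></p></div>"
    Debug.Print SpanBeforeMarker(html, "MMC_ID")   ' prints Title Of Book
End Sub
```

In the question's code you would pass .ResponseText as the first argument. The upside is that it does not care how many spans come before or after the marker; the downside is that it is string matching rather than real DOM traversal, so it is more fragile if the markup changes.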