Note: this page reproduces a popular Stack Overflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me): Stack Overflow. Original question: http://stackoverflow.com/questions/25323412/

How to parse src from img tag in VBA

Tags: html, excel, vba, excel-vba, html-parsing

Asked by Tdev

I have a question about HTML parsing. I have a website with some products, and I would like to capture the image URLs into my current spreadsheet. The spreadsheet is quite big: it contains the ItemNbr in the 3rd column, I want the URL in the 27th column, and each row corresponds to one product (item).

My idea is to fetch the URL of the 'regular', 'large', or 'verylarge' image (it doesn't really matter which). Here is the structure of the page (among various other divs):

<div id="MainDisplay" class="miMaindisplay">
    <a href="http://www.example.com/verylarge/12425/nl" id="ctl00_PageContent_MultiImage_jqzoom" class="loupe">
        <div class="zoomPad">
            <img src="http://www.example.com/regular/12425/nl" id="ctl00_PageContent_MultiImage_PreviewImage" class="miPreviewImage">
            <div class="zoomPup"></div>
            <div class="zoomWindow">
                <div class="zoomWrapper">
                    <div class="zoomWrapperTitle"></div>
                    <div class="zoomWrapperImage">
                        <img src="http://www.example.com/large/12425/nl">
                    </div>
                </div>
            </div>
            <div class="zoomPreload">Loading zoom</div>
        </div>
    </a>
</div>

I can get the URL in the browser's JS console with this line:

document.getElementById('ctl00_PageContent_MultiImage_PreviewImage').src;

which returns:

http://www.example.com/regular/12425/nl

But I have had no success in VBA. Here is my code snippet:

Sub ParseImage()

    Dim Cell As Integer
    Dim ItemNbr As String

    Dim AElement As Object
    Dim AElements As IHTMLElementCollection

    Dim IE As MSXML2.XMLHTTP60
    Set IE = New MSXML2.XMLHTTP60

    Dim HTMLDoc As MSHTML.HTMLDocument
    Dim HTMLBody As MSHTML.HTMLBody

    Set HTMLDoc = New MSHTML.HTMLDocument
    Set HTMLBody = HTMLDoc.body

    For Cell = 1 To 5                            'I iterate through the file row by row

        ItemNbr = Cells(Cell, 3).Value           'ItemNbr are in the 3rd Column of my spreadsheet

        IE.Open "GET", "http://www.example.com/?item=" & ItemNbr, False
        IE.send

        While IE.ReadyState <> 4
            DoEvents
        Wend

        HTMLBody.innerHTML = IE.responseText

        Set AElements = HTMLDoc.getElementsByTagName("a")
        For Each AElement In AElements
            If AElement.id = "ctl00_PageContent_MultiImage_PreviewImage" Then
                Cells(Cell, 27) = AElement.src     'I write URL in the 27th column
            End If
        Next AElement

        Application.Wait (Now + TimeValue("0:00:2"))

    Next Cell

End Sub

I have, of course, added the required references:

[Image: References dialog]

Thank you for your help!

Answered by IAmDranged

If the element you are targeting is identified by an id in your HTML page, the most straightforward way to get it is to use the getElementById method of the HTML document object.

Try replacing this section:

Set AElements = HTMLDoc.getElementsByTagName("a")
For Each AElement In AElements
    If AElement.id = "ctl00_PageContent_MultiImage_PreviewImage" Then
        Cells(Cell, 27) = AElement.src     'I write URL in the 27th column
    End If
Next AElement

with something like:

Dim previewImg As Object
Set previewImg = HTMLDoc.getElementById("ctl00_PageContent_MultiImage_PreviewImage")
If Not previewImg Is Nothing Then Cells(Cell, 27) = previewImg.getAttribute("src")
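
It may help to see why the original loop never wrote anything: it iterates over the a elements, but the id ctl00_PageContent_MultiImage_PreviewImage belongs to an img element, so the If condition is never true. The following is a minimal sketch of the getElementById approach, wrapped in a helper so it can be tried against the sample markup from the question without contacting the site. The names ExtractSrcById and DemoExtractSrc are hypothetical and not part of the original code; the sketch assumes the same "Microsoft HTML Object Library" reference the question already uses.

```vba
Option Explicit

' Hypothetical helper (not from the original post).
' Requires a reference to "Microsoft HTML Object Library" (MSHTML).
' Returns the src attribute of the element with the given id,
' or an empty string if no such element exists.
Function ExtractSrcById(ByVal html As String, ByVal elementId As String) As String
    Dim doc As MSHTML.HTMLDocument
    Dim el As Object

    Set doc = New MSHTML.HTMLDocument
    doc.body.innerHTML = html                  ' parse the markup in memory

    Set el = doc.getElementById(elementId)     ' Nothing when the id is absent
    If Not el Is Nothing Then
        ' Appending "" converts a Null result into an empty string
        ExtractSrcById = el.getAttribute("src") & ""
    End If
End Function

' Tries the helper on a trimmed copy of the markup shown in the question
' and prints the result to the Immediate window.
Sub DemoExtractSrc()
    Dim sample As String

    sample = "<div id=""MainDisplay"" class=""miMaindisplay"">" & _
             "<img src=""http://www.example.com/regular/12425/nl"" " & _
             "id=""ctl00_PageContent_MultiImage_PreviewImage"" " & _
             "class=""miPreviewImage"">" & _
             "</div>"

    Debug.Print ExtractSrcById(sample, "ctl00_PageContent_MultiImage_PreviewImage")
End Sub
```

Inside the loop from the question, the helper could then be called as Cells(Cell, 27).Value = ExtractSrcById(IE.responseText, "ctl00_PageContent_MultiImage_PreviewImage"). Returning an empty string for a missing element keeps the loop running when a product page has no preview image, although whether that is the desired behaviour depends on the spreadsheet.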