Disclaimer: this page is a Chinese-English parallel translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/32677931/

Time: 2020-09-08 09:57:57  Source: igfitidea

VBA to get the href value

vba web web-scraping href extract

Asked by Nicholas Kan

I am writing a macro to extract the href value from a website. In the example here, I want to extract the value '/listedco/listconews/SEHK/2015/0429/LTN201504291355_C.pdf' from the HTML below. The href is one of the attributes of the HTML tag 'a'. I added code using getElementsByTagName("a"), but it did not work. My question is how to extract that href value into column L. Could anyone help? Thanks in advance!

  <a id="ctl00_gvMain_ctl03_hlTitle" class="news" href="/listedco/listconews/SEHK/2015/0429/LTN201504291355_C.pdf" target="_blank">二零一四年年報</a>

Sub Download_From_HKEX()
    Dim internetdata As Object
    Dim div_result As Object
    Dim header_links As Object
    Dim link As Object
    Dim URL As String
    Dim IE As Object
    Dim i As Object
    Dim ieDoc As Object
    Dim selectItems As Variant
    Dim h As Variant

    Dim LocalFileName As String
    Dim B As Boolean
    Dim ErrorText As String
    Dim x As Variant

    'Key Ratios
    For x = 1 To 1579
        Set IE = New InternetExplorerMedium
        IE.Visible = True
        URL = "http://www.hkexnews.hk/listedco/listconews/advancedsearch/search_active_main_c.aspx"
        IE.navigate URL
        Do
            DoEvents
        Loop Until IE.readyState = 4
        Application.Wait (Now + TimeValue("0:00:05"))
        Call IE.Document.getElementById("ctl00_txt_stock_code").setAttribute("value", Worksheets("Stocks").Cells(x, 1).Value)

        Set selectItems = IE.Document.getElementsByName("ctl00$sel_tier_1")
        For Each i In selectItems
            i.Value = "4"
            i.FireEvent ("onchange")
        Next i

        Set selectItems = IE.Document.getElementsByName("ctl00$sel_tier_2")
        For Each i In selectItems
            i.Value = "159"
            i.FireEvent ("onchange")
        Next i

        Set selectItems = IE.Document.getElementsByName("ctl00$sel_DateOfReleaseFrom_d")
        For Each i In selectItems
            i.Value = "01"
            i.FireEvent ("onchange")
        Next i

        Set selectItems = IE.Document.getElementsByName("ctl00$sel_DateOfReleaseFrom_m")
        For Each i In selectItems
            i.Value = "04"
            i.FireEvent ("onchange")
        Next i

        Set selectItems = IE.Document.getElementsByName("ctl00$sel_DateOfReleaseFrom_y")
        For Each i In selectItems
            i.Value = "1999"
            i.FireEvent ("onchange")
        Next i

        Application.Wait (Now + TimeValue("0:00:02"))
        Set ieDoc = IE.Document
        With ieDoc.forms(0)
            Call IE.Document.parentWindow.execScript("document.forms[0].submit()", "JavaScript")
            .submit
        End With
        Application.Wait (Now + TimeValue("0:00:03"))

        'Start here to extract the href value.
        Set internetdata = IE.Document
        Set div_result = internetdata.getElementById("ctl00_gvMain_ctl03_hlTitle")
        Set header_links = div_result.getElementsByTagName("a")
        For Each h In header_links
            Set link = h.ChildNodes.Item(0)
            Worksheets("Stocks").Cells(Range("L" & Rows.Count).End(xlUp).Row + 1, 12) = link.href
        Next
    Next x
End Sub

Accepted answer by Tim Williams

For Each h In header_links
     Worksheets("Stocks").Cells(Range("L" & Rows.Count).End(xlUp).Row + 1, 12) = h.href
Next

EDIT: The id attribute is supposed to be unique in the document: there should only be a single element with any given id. So

IE.Document.getElementById("ctl00_gvMain_ctl03_hlTitle").href

should work.
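
As a minimal sketch of that idea, the lookup can be wrapped so that a missing element does not raise a run-time error (for example, when the search returns no results). The function name GetHrefById is hypothetical and not part of the original answer; it only uses getElementById and the href property already shown above.

```vba
' Hypothetical helper (not from the original answer): returns the href of
' the element with the given id, or an empty string if no element is found.
Function GetHrefById(ByVal doc As Object, ByVal elementId As String) As String
    Dim el As Object

    ' getElementById yields at most one element because ids are unique.
    Set el = doc.getElementById(elementId)

    If el Is Nothing Then
        GetHrefById = vbNullString
    Else
        GetHrefById = el.href
    End If
End Function
```

Inside the question's loop it could be called as GetHrefById(IE.Document, "ctl00_gvMain_ctl03_hlTitle") and the result written to column L in the same way as before.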

Answer by user7551055

WB.Document.GetElementById("ctl00_gvMain_ctl04_hlTitle").GetAttribute("href").ToString

Answer by QHarr

Use a CSS selector to get the element, then access its href attribute.

#ctl00_gvMain_ctl03_hlTitle

The above selects the element with id ctl00_gvMain_ctl03_hlTitle. The "#" means id.

Debug.Print IE.document.querySelector("#ctl00_gvMain_ctl03_hlTitle").href
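
The same selector approach can, in principle, collect several links at once. The sketch below is only an illustration under assumptions: it assumes every result link carries class="news" like the anchor in the question, and that the page is rendered in a document mode where querySelectorAll is available. The procedure name PrintNewsLinks is hypothetical.

```vba
' Hypothetical sketch: print the href of every <a> element with class "news".
Sub PrintNewsLinks(ByVal doc As Object)
    Dim nodes As Object
    Dim i As Long

    ' "a.news" is a CSS selector: "a" is the tag name, ".news" the class.
    Set nodes = doc.querySelectorAll("a.news")

    ' Walk the returned node list by index.
    For i = 0 To nodes.Length - 1
        Debug.Print nodes.Item(i).href
    Next i
End Sub
```

It could be called as PrintNewsLinks IE.document once the results page has finished loading.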