Html VBA复制网站数据

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/26561527/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-29 02:56:47  来源:igfitidea点击:

VBA to copy website data

htmlexcelvbaexcel-vba

提问by Julius Stevanus

Can someone help to point me in a right direction on how to copy a particular data from a website to excel sheet via VBA?

有人可以帮助我指出如何通过 VBA 将特定数据从网站复制到 Excel 工作表的正确方向吗?

I tried using macro recorder and web query but it keeps showing error script and the yellow arrow didn't show up at the part that i want to copy.

我尝试使用宏记录器和网络查询,但它一直显示错误脚本并且黄色箭头没有出现在我想要复制的部分。

This is the website that i'm trying to copy http://etfdb.com/etf/EEM/#holdings

这是我试图复制的网站http://etfdb.com/etf/EEM/#holdings

I only want to copy the Top Ten Holdings part.

我只想复制十大控股部分。

Any help would be greatly appreciated. Thank you in advance.

任何帮助将不胜感激。先感谢您。

Edit : This is my current code but nothing showed up, can someone tell me whats wrong?

编辑:这是我当前的代码,但没有显示出来,有人能告诉我出了什么问题吗?

Sub Get123()

Dim oHtml       As HTMLDocument
Dim oElement    As Object

Set oHtml = New HTMLDocument

With CreateObject("WINHTTP.WinHTTPRequest.5.1")
    .Open "GET", "http://etfdb.com/etf/EEM/#holdings", False
    .send
    oHtml.body.innerHTML = .responseText
End With

For Each oElement In oHtml.getElementsByClassName("holdings-left-content")
    ActiveSheet.Range("A1").Value = oElement.Value
Next oElement

End Sub

采纳答案by Portland Runner

I'm not too familiar with WINHTTP request but I'm assuming your having trouble because it's not waiting for a response from the server.

我对 WINHTTP 请求不太熟悉,但我假设您遇到了麻烦,因为它没有等待来自服务器的响应。

I tend to do web scraping this way:

我倾向于以这种方式进行网络抓取:

Sub extract()
    Dim IE As InternetExplorer
    Dim html As HTMLDocument

    Set IE = New InternetExplorerMedium
    IE.Visible = False
    IE.Navigate2 "http://etfdb.com/etf/EEM/#holdings"

    ' Wait while IE loading
    Do While IE.Busy
        Application.Wait DateAdd("s", 1, Now)
    Loop

    Set html = IE.document
    Set holdingsClass = html.getElementsByClassName("holdings-left-content")

    Range("A1").Value = holdingsClass(0).textContent

    'Cleanup
    IE.Quit
    Set IE = Nothing
End Sub


Make sure you have a reference to:

确保您有以下参考资料:

  1. Microsoft Internet Controls
  2. Microsoft HTML Object Library
  1. 微软互联网控制
  2. 微软 HTML 对象库


Since the class contains a list the return text is all in one element. So the results look like this:

由于该类包含一个列表,因此返回文本都在一个元素中。所以结果是这样的:

enter image description here

在此处输入图片说明



Here is one way to split the results up into different cells :

这是将结果分成不同单元格的一种方法:

Dim results As Variant
results = Split(holdingsClass(0).textContent, vbLf)

cntr = 1
For i = LBound(results) To UBound(results)
    If Trim(results(i)) <> "" Then
        Select Case Right(Trim(results(i)), 1)
            Case ":"
                Range("B" & cntr) = CStr(Trim(results(i)))
            Case "%"
                Range("C" & cntr).Value = Trim(results(i))
                cntr = cntr + 1
            Case 0
                Range("C" & cntr).Value = Trim(results(i))
            Case Else
                Range("A" & cntr).Value = Trim(results(i))
        End Select
    End If
Next i


Results:

结果:

enter image description here

在此处输入图片说明



Explanation

解释

getElements...returns an array of all html elements that meet the given criteria. In this case it returns all elements with the class name "holdings-left-content".

getElements...返回满足给定条件的所有 html 元素的数组。在这种情况下,它返回类名称为“holdings-left-content”的所有元素。

Since there is only one element with this class name we access the first element using (0)because it's a zero based array (0,1,2 for 3 elements instead of 1,2,3).

由于只有一个具有此类名称的元素,因此我们使用(0)它访问第一个元素,因为它是一个基于零的数组(0,1,2 表示 3 个元素,而不是 1,2,3)。

The Splitmethod takes all text in first array element and separates each line into another array (results) using the carriage return vbLfas a delimiter.

Split方法获取第一个数组元素中的所有文本,并使用回车vbLf作为分隔符将每一行分隔到另一个数组(结果)中。

Now we just loop through the results array and display each line of text. The Select Casejust helps us know which column to display the next line of text for a nicely formatted display.

现在我们只需遍历结果数组并显示每一行文本。在Select Case刚刚帮助我们知道显示文本的下一行的一个很好的格式化显示的列。

回答by M--

I tried this method here but it wasn't working for me. I found Pull Web Page Into Worksheeton Ozgridfrom user JerryDand I am including it here, for future reference.

我在这里尝试了这种方法,但它对我不起作用。我在Ozgrid 上从用户JerryD找到了将网页拉入工作表并将其包含在此处,以供将来参考。

Sub Test() 

    Dim IE As Object 

    Sheets("Sheet3").Select 
    Range("A1:A1000") = "" ' erase previous data
    Range("A1").Select 

    Set IE = CreateObject("InternetExplorer.Application") 
        With IE 
            .Visible = True 
            .Navigate "http://www.aarp.org/" ' should work for any URL
            Do Until .ReadyState = 4: DoEvents: Loop 
        End With 

        IE.ExecWB 17, 0 '// SelectAll
        IE.ExecWB 12, 2 '// Copy selection
        ActiveSheet.PasteSpecial Format:="Text", link:=False, DisplayAsIcon:=False 
        Range("A1").Select 
        IE.Quit 

End Sub