从 VBA 中的 HTML 源获取数据(excel)

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/11805389/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-29 02:04:46  来源:igfitidea点击:

Getting data from HTML source in VBA (excel)

htmlstringexcelvbaexcel-vba

提问by user1575474

I'm trying to collect data from a website, which should be manageable once the source is in string form. Looking around I've assembled some possible solutions but have run into problems with all of them:

我正在尝试从网站收集数据,一旦源为字符串形式,这些数据应该是可管理的。环顾四周,我收集了一些可能的解决方案,但都遇到了问题:

  1. Use InternetExplorer.Application to open the url and then access the inner HTML
  2. Inet
  3. use Shell command to run wget
  1. 使用 InternetExplorer.Application 打开 url 然后访问内部 HTML
  2. 互联网
  3. 使用 Shell 命令运行 wget

Here are the problems I'm having:

以下是我遇到的问题:

  1. When I store the innerHTML into a string, it's not the entire source, only a fraction
  2. ActiveX does not allow the creation of the Inet object (error 429)
  3. I've got the htm into a folder on my computer, how do I get it into a string in VBA?
  1. 当我将innerHTML存储到字符串中时,它不是整个源,只是一小部分
  2. ActiveX 不允许创建 Inet 对象(错误 429)
  3. 我已将 htm 放入计算机上的文件夹中,如何将其放入 VBA 中的字符串中?

Code for 1:

代码 1:

Sub getData() Dim url As String, ie As Object, state As Integer Dim text As Variant, startS As Integer, endS As Integer

Sub getData() Dim url As String, ie As Object, state As Integer Dim text As Variant, startS As Integer, endS As Integer

Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = 0

url = "http://www.eoddata.com/stockquote/NASDAQ/AAPL.htm"
ie.Navigate url


state = 0
Do Until state = 4
    DoEvents
    state = ie.readyState
Loop


text = ie.Document.Body.innerHTML
startS = InStr(ie.Document.Body.innerHTML, "7/26/2012")
endS = InStr(ie.Document.Body.innerHTML, "7/25/2012")


text = Mid(ie.Document.Body.innerHTML, startS, endS - startS)

MsgBox text

回答by jokker

If I were trying to pull the opening price off from 08/10/12 off of that page, which is similar to what I assume you are doing, I'd do something like this:

如果我试图从该页面的 08/10/12 拉开开盘价,这与我假设您正在做的类似,我会做这样的事情:

    Set ie = New InternetExplorer
    With ie
        .navigate "http://eoddata.com/stockquote/NASDAQ/AAPL.htm"
        .Visible = False
        While .Busy Or .readyState <> READYSTATE_COMPLETE
           DoEvents
        Wend
        Set objHTML = .document
        DoEvents
    End With
    Set elementONE = objHTML.getElementsByTagName("TD")
    For i = 1 To elementONE.Length
        elementTWO = elementONE.Item(i).innerText           
        If elementTWO = "08/10/12" Then
            MsgBox (elementONE.Item(i + 1).innerText)
            Exit For
        End If
    Next i
    DoEvents
    ie.Quit
    DoEvents
    Set ie = Nothing

You can modify this to run through the HTML and pull whatever data you want. Iteration +2 would return the high price, etc.

您可以修改它以运行 HTML 并提取您想要的任何数据。迭代 +2 将返回高价等。

Since there are a lot of dates on that page you might also want to make it check that it is between the Recent End of Day Prices and the Company profile.

由于该页面上有很多日期,您可能还需要检查它是否在最近的一天结束价格和公司资料之间。