从 VBA 中的 HTML 源获取数据(excel)
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/11805389/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Getting data from HTML source in VBA (excel)
提问by user1575474
I'm trying to collect data from a website, which should be manageable once the source is in string form. Looking around I've assembled some possible solutions but have run into problems with all of them:
我正在尝试从网站收集数据,一旦源为字符串形式,这些数据应该是可管理的。环顾四周,我收集了一些可能的解决方案,但都遇到了问题:
- Use InternetExplorer.Application to open the url and then access the inner HTML
- Inet
- use Shell command to run wget
- 使用 InternetExplorer.Application 打开 url 然后访问内部 HTML
- 互联网
- 使用 Shell 命令运行 wget
Here are the problems I'm having:
以下是我遇到的问题:
- When I store the innerHTML into a string, it's not the entire source, only a fraction
- ActiveX does not allow the creation of the Inet object (error 429)
- I've got the htm into a folder on my computer, how do I get it into a string in VBA?
- 当我将innerHTML存储到字符串中时,它不是整个源,只是一小部分
- ActiveX 不允许创建 Inet 对象(错误 429)
- 我已将 htm 放入计算机上的文件夹中,如何将其放入 VBA 中的字符串中?
Code for 1:
代码 1:
Sub getData() Dim url As String, ie As Object, state As Integer Dim text As Variant, startS As Integer, endS As Integer
Sub getData() Dim url As String, ie As Object, state As Integer Dim text As Variant, startS As Integer, endS As Integer
Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = 0
url = "http://www.eoddata.com/stockquote/NASDAQ/AAPL.htm"
ie.Navigate url
state = 0
Do Until state = 4
DoEvents
state = ie.readyState
Loop
text = ie.Document.Body.innerHTML
startS = InStr(ie.Document.Body.innerHTML, "7/26/2012")
endS = InStr(ie.Document.Body.innerHTML, "7/25/2012")
text = Mid(ie.Document.Body.innerHTML, startS, endS - startS)
MsgBox text
回答by jokker
If I were trying to pull the opening price off from 08/10/12 off of that page, which is similar to what I assume you are doing, I'd do something like this:
如果我试图从该页面的 08/10/12 拉开开盘价,这与我假设您正在做的类似,我会做这样的事情:
Set ie = New InternetExplorer
With ie
.navigate "http://eoddata.com/stockquote/NASDAQ/AAPL.htm"
.Visible = False
While .Busy Or .readyState <> READYSTATE_COMPLETE
DoEvents
Wend
Set objHTML = .document
DoEvents
End With
Set elementONE = objHTML.getElementsByTagName("TD")
For i = 1 To elementONE.Length
elementTWO = elementONE.Item(i).innerText
If elementTWO = "08/10/12" Then
MsgBox (elementONE.Item(i + 1).innerText)
Exit For
End If
Next i
DoEvents
ie.Quit
DoEvents
Set ie = Nothing
You can modify this to run through the HTML and pull whatever data you want. Iteration +2 would return the high price, etc.
您可以修改它以运行 HTML 并提取您想要的任何数据。迭代 +2 将返回高价等。
Since there are a lot of dates on that page you might also want to make it check that it is between the Recent End of Day Prices and the Company profile.
由于该页面上有很多日期,您可能还需要检查它是否在最近的一天结束价格和公司资料之间。