Notice: this page is a bilingual (Chinese/English) rendering of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must apply the same license, cite the original URL, and attribute it to the original authors (not me): StackOverflow
Original source: http://stackoverflow.com/questions/17230529/
ExcelVBA - HttpReq via MSXML2.XMLHTTP - fetch page after loading page
Asked by David Nottebohm-Knochenhauer
I have a problem fetching data from an internal web-based data service (Cognos). Basically, I put together a GET request like "blah.com/cognosapi.dll?product=xxx&date=yyy...", send it to the server, and receive a web page that I can store as HTML and parse into my Excel form later.
I built a VBA program that worked quite well in the past, but the web service changed, and now it displays an interim "your report is running" page that lasts from 1 to 30 seconds. So when I call my function, I always download this "your report is running" page instead of the data. How can I catch the page that loads automatically after the "report is running" page?
This is the DownloadFile function, which takes the GET string and the target path as parameters.
Public Function DownloadFile(sSourceUrl As String, _
                             sLocalFile As String) As Boolean
    Dim HttpReq As Object
    Set HttpReq = CreateObject("MSXML2.XMLHTTP")
    Dim HtmlDoc As New MSHTML.HTMLDocument

    HttpReq.Open "GET", sSourceUrl, False
    HttpReq.send

    If HttpReq.Status = 200 Then
        HttpReq.getAllResponseHeaders
        HtmlDoc.body.innerHTML = HttpReq.responseText
        Debug.Print HtmlDoc.body.innerHTML
    End If

    'Download the file. BINDF_GETNEWESTVERSION forces
    'the API to download from the specified source.
    'Passing 0& as dwReserved causes the locally-cached
    'copy to be downloaded, if available. If the API
    'returns ERROR_SUCCESS (0), DownloadFile returns True.
    DownloadFile = URLDownloadToFile(0&, _
                                     sSourceUrl, _
                                     sLocalFile, _
                                     BINDF_GETNEWESTVERSION, _
                                     0&) = ERROR_SUCCESS
End Function
Thanks,
David
Answered by David Nottebohm-Knochenhauer
You finally gave me the last piece I needed to solve my problem. I built the code into my DownloadFile function so that it keeps using the IE object until the end and then closes it.
One error I found was that readyState should be polled before anything is done with the HTML object.
Public Function DownloadFile(sSourceUrl As String, _
                             sLocalFile As String) As Boolean
    Dim IE As InternetExplorer
    Set IE = New InternetExplorer
    Dim HtmlDoc As New MSHTML.HTMLDocument
    Dim collTables As MSHTML.IHTMLElementCollection
    Dim collSpans As MSHTML.IHTMLElementCollection
    Dim objSpanElem As MSHTML.IHTMLSpanElement
    Dim fnum As Integer

    With IE
        'May be changed to False if you don't want to see the browser window
        .Visible = True
        .Navigate (sSourceUrl)
        'this waits for the page to be loaded
        Do Until .readyState = 4: DoEvents: Loop
    End With

    'Set HtmlDoc = wait_for_html(sSourceUrl, "text/css")
    Do
        Set HtmlDoc = IE.Document
        'searching for the "span" tag
        Set collSpans = HtmlDoc.getElementsByTagName("span")
        'the first span element contains...
        Set objSpanElem = collSpans(0)
        '... this text while the loading screen is displayed
    Loop Until Not objSpanElem.innerHTML = "Your report is running."

    'just grab the tables and leave the rest
    Set collTables = HtmlDoc.getElementsByTagName("table")

    fnum = FreeFile()
    Open sLocalFile For Output As fnum ' save the file and add html and body tags
    Print #fnum, "<html>"
    Print #fnum, "<body>"
    Print #fnum, collTables(15).outerHTML 'Title
    Print #fnum, collTables(17).outerHTML 'Date
    Print #fnum, collTables(18).outerHTML 'Part, Operation etc.
    Print #fnum, collTables(19).outerHTML 'Measurements
    Print #fnum, "</body>"
    Print #fnum, "</html>"
    Close #fnum

    IE.Quit 'close Explorer
    DownloadFile = True
End Function
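The polling loop above runs until the interim page disappears, so it never ends if the report fails, and it raises an error if the page contains no span element. A bounded wait could look roughly like the sketch below. The names ReportIsReady and WaitForReport are hypothetical and not part of the original answer. Treating a page without any span as "ready" is an assumption about the report layout. The Timer comparison does not handle a wait that crosses midnight.

```vba
' Hypothetical helper: True once the first <span> no longer shows the
' interim message. Late-bound (Object), so no library references are needed.
Private Function ReportIsReady(ByVal doc As Object) As Boolean
    Dim spans As Object
    Set spans = doc.getElementsByTagName("span")
    If spans.Length = 0 Then
        ' Assumption: a page with no <span> is not the loading screen.
        ReportIsReady = True
    Else
        ReportIsReady = (spans.Item(0).innerHTML <> "Your report is running.")
    End If
End Function

' Hypothetical bounded wait: polls until the report page is loaded or
' timeoutSeconds have elapsed. Returns False on timeout.
Public Function WaitForReport(ByVal ie As Object, _
                              ByVal timeoutSeconds As Single) As Boolean
    Dim startTime As Single
    startTime = Timer
    Do
        DoEvents   ' keep Excel responsive while polling
        If ie.readyState = 4 Then   ' 4 = READYSTATE_COMPLETE
            If ReportIsReady(ie.Document) Then
                WaitForReport = True
                Exit Function
            End If
        End If
    Loop While Timer - startTime < timeoutSeconds
    WaitForReport = False
End Function
```

Inside DownloadFile, the Do ... Loop Until block could then be replaced by a call such as WaitForReport(IE, 60), with the function calling IE.Quit and returning False when the wait times out.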
Answered by wlgreg
Since you're using a GET request, I'm assuming any required parameters can be provided in the URL string. In that case, you might be able to use InternetExplorer.Application, which should automatically update its Document property whenever the page refreshes. You could then set up a loop which periodically checks for some value (tag text, URL, etc.) that's unique to the desired page.
Here's a sample which loads a URL, then waits until the page's <title> tag has the desired value.
'On 64-bit Office (VBA7) this must be declared with PtrSafe:
'Public Declare PtrSafe Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)
Public Declare Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)

Function wait_for_html(strURL As String, strDesiredText As String) As String
    Dim IE As InternetExplorer
    Set IE = New InternetExplorer
    IE.Navigate (strURL)

    'wait for the first page to finish loading (4 = READYSTATE_COMPLETE)
    While IE.ReadyState <> 4
        Sleep 10
    Wend

    Dim objHtml As MSHTML.HTMLDocument
    Dim collTitle As MSHTML.IHTMLElementCollection
    Dim objTitleElem As MSHTML.IHTMLTitleElement

    'poll once per second until the <title> matches the desired text
    Do
        Sleep 1000
        Set objHtml = IE.Document
        Set collTitle = objHtml.getElementsByTagName("title")
        Set objTitleElem = collTitle(0)
    Loop Until objTitleElem.Text = strDesiredText

    wait_for_html = objHtml.body.innerHTML
End Function
The above needs references to Microsoft Internet Controls and Microsoft HTML Object Library.
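If adding those two references is not an option, a late-bound variant might look like the following sketch. The name wait_for_html_late is hypothetical, and the code assumes that the document's Title property is enough to recognise the desired page. It also closes the browser when done, which the sample above does not do. The Declare lines must sit at the top of the module, before any procedure. Like the sample above, it has no timeout.

```vba
' Sleep from kernel32; PtrSafe is required on 64-bit Office (VBA7).
#If VBA7 Then
    Private Declare PtrSafe Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)
#Else
    Private Declare Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)
#End If

' Hypothetical late-bound variant of wait_for_html: everything is typed
' As Object, so no library references are required.
Function wait_for_html_late(strURL As String, strDesiredText As String) As String
    Dim ie As Object
    Set ie = CreateObject("InternetExplorer.Application")
    ie.Navigate strURL

    Do
        Sleep 1000
        ' 4 = READYSTATE_COMPLETE; only inspect the document once loaded
        If ie.readyState = 4 Then
            If ie.Document.Title = strDesiredText Then Exit Do
        End If
    Loop

    wait_for_html_late = ie.Document.body.innerHTML
    ie.Quit   ' close the browser instance when done
End Function
```

Late binding trades compile-time checking and IntelliSense for portability, since the workbook then runs on machines where the references have not been set.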