
Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/17230529/


ExcelVBA - HttpReq via MSXML2.XMLHTTP - fetch page after loading page

Tags: vba, excel-vba, httpwebrequest, xmlhttprequest, excel

Asked by David Nottebohm-Knochenhauer

I have a problem fetching data from an internal web-based data service (Cognos). Basically, I put together a GET request like "blah.com/cognosapi.dll?product=xxx&date=yyy...", send it to the server, and receive a web page that I can store as HTML and later parse into my Excel form.

I built a VBA program that worked quite well in the past, but the web service changed, and it now displays an interim "Your report is running" page that lasts anywhere from 1 to 30 seconds. So when I call my function, I always download this "Your report is running" page instead of the data. How can I catch the page that loads automatically after the "report is running" page?

This is the DownloadFile function, which takes the GET string and the target path as parameters.

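'Assumed declarations (not shown in the original post). On 64-bit Office
'the Declare statement also needs PtrSafe and LongPtr for pointer arguments.
Private Declare Function URLDownloadToFile Lib "urlmon" _
    Alias "URLDownloadToFileA" (ByVal pCaller As Long, _
    ByVal szURL As String, ByVal szFileName As String, _
    ByVal dwReserved As Long, ByVal lpfnCB As Long) As Long
Private Const ERROR_SUCCESS As Long = 0
Private Const BINDF_GETNEWESTVERSION As Long = &H10

Public Function DownloadFile(sSourceUrl As String, _
                             sLocalFile As String) As Boolean

    Dim HttpReq As Object
    Set HttpReq = CreateObject("MSXML2.XMLHTTP")

    Dim HtmlDoc As New MSHTML.HTMLDocument

    'Synchronous GET: send returns as soon as the first response arrives,
    'which is the interim "Your report is running" page
    HttpReq.Open "GET", sSourceUrl, False
    HttpReq.send

    If HttpReq.Status = 200 Then
        Debug.Print HttpReq.getAllResponseHeaders
        HtmlDoc.body.innerHTML = HttpReq.responseText
        Debug.Print HtmlDoc.body.innerHTML
    End If

    'Download the file. BINDF_GETNEWESTVERSION forces
    'the API to download from the specified source.
    'Passing 0& as dwReserved causes the locally-cached
    'copy to be downloaded, if available. If the API
    'returns ERROR_SUCCESS (0), DownloadFile returns True.
    DownloadFile = URLDownloadToFile(0&, _
                                     sSourceUrl, _
                                     sLocalFile, _
                                     BINDF_GETNEWESTVERSION, _
                                     0&) = ERROR_SUCCESS

End Function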

Thanks, David

Answered by David Nottebohm-Knochenhauer

Your answer finally gave me the last piece I needed to solve my problem. I built the code into my DownloadFile function so that it keeps working with the IE object until the end and then closes it.

One error I found was that readyState must be polled before anything is done with the HTML document object.

Public Function DownloadFile(sSourceUrl As String, _
                             sLocalFile As String) As Boolean

    Dim IE As InternetExplorer
    Set IE = New InternetExplorer

    Dim HtmlDoc As MSHTML.HTMLDocument
    Dim collTables As MSHTML.IHTMLElementCollection
    Dim collSpans As MSHTML.IHTMLElementCollection
    Dim objSpanElem As MSHTML.IHTMLSpanElement

    Dim fnum As Integer

    With IE
        'Change to False if you don't want to see the browser window
        .Visible = True
        .Navigate sSourceUrl
        'Wait for the first page to finish loading (4 = READYSTATE_COMPLETE)
        Do Until .readyState = 4: DoEvents: Loop
    End With

    'Poll until the interim loading page has been replaced by the report
    Do
        DoEvents 'keep Excel responsive while waiting
        Set HtmlDoc = IE.Document

        'Search for the "span" tags
        Set collSpans = HtmlDoc.getElementsByTagName("span")

        'The first span element contains "Your report is running."
        'while the loading screen is displayed
        Set objSpanElem = collSpans(0)
    Loop Until objSpanElem.innerHTML <> "Your report is running."

    'Just grab the tables and leave the rest
    Set collTables = HtmlDoc.getElementsByTagName("table")

    'Save the file, adding html and body tags
    fnum = FreeFile()
    Open sLocalFile For Output As #fnum
    Print #fnum, "<html>"
    Print #fnum, "<body>"

    Print #fnum, collTables(15).outerHTML 'Title
    Print #fnum, collTables(17).outerHTML 'Date
    Print #fnum, collTables(18).outerHTML 'Part, operation etc.
    Print #fnum, collTables(19).outerHTML 'Measurements

    Print #fnum, "</body>"
    Print #fnum, "</html>"

    Close #fnum
    IE.Quit 'Close Internet Explorer

    DownloadFile = True

End Function
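
Both polling loops in the function above keep spinning if the report never finishes. A minimal sketch of the same check with a timeout might look like the following. The names IsReportReady and WaitForReport, and the timeoutSeconds parameter, are assumptions made for illustration and are not part of the original answer. The browser argument is late-bound (As Object) so that an InternetExplorer instance can be passed in without changing the declarations.

```vba
' Sketch only: IsReportReady and WaitForReport are illustrative names,
' not part of the original answer.

' Pure decision: the page counts as ready when the browser reports
' READYSTATE_COMPLETE (4) and the first <span> no longer shows the
' interim message.
Public Function IsReportReady(ByVal readyState As Long, _
                              ByVal firstSpanHtml As String, _
                              ByVal loadingText As String) As Boolean
    IsReportReady = (readyState = 4) And (firstSpanHtml <> loadingText)
End Function

' Polls a browser object until IsReportReady is True.
' Returns False if timeoutSeconds elapse first.
Public Function WaitForReport(ByVal browser As Object, _
                              ByVal loadingText As String, _
                              ByVal timeoutSeconds As Long) As Boolean
    Dim deadline As Date
    Dim spans As Object
    Dim firstSpanHtml As String
    Dim state As Long

    deadline = DateAdd("s", timeoutSeconds, Now)

    Do
        DoEvents                                ' keep Excel responsive
        If Now > deadline Then Exit Function    ' timed out: result stays False

        state = browser.readyState              ' read once per iteration
        firstSpanHtml = ""
        If state = 4 Then
            Set spans = browser.Document.getElementsByTagName("span")
            ' Assumption: a page without any <span> is not the loading page.
            If spans.Length > 0 Then firstSpanHtml = spans.Item(0).innerHTML
        End If
    Loop Until IsReportReady(state, firstSpanHtml, loadingText)

    WaitForReport = True
End Function
```

Inside DownloadFile, the two loops could then be replaced by a call such as WaitForReport(IE, "Your report is running.", 60), with the function returning False and quitting IE when the call times out. The 60-second limit is only an example value.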

Answered by wlgreg

Since you're using a GET request, I'm assuming any required parameters can be provided in the URL string. In that case, you might be able to use InternetExplorer.Application, which should automatically update its Document property whenever the page refreshes. You could then set up a loop that periodically checks for some value (tag text, URL, etc.) that's unique to the desired page.

Here's a sample that loads a URL, then waits until the page's <title> tag has the desired value.

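'PtrSafe is required for Declare statements in 64-bit Office (VBA7)
#If VBA7 Then
    Public Declare PtrSafe Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)
#Else
    Public Declare Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)
#End If

Function wait_for_html(strURL As String, strDesiredText As String) As String

    Dim IE As InternetExplorer
    Set IE = New InternetExplorer

    IE.Navigate strURL

    'Wait for the first page to finish loading (4 = READYSTATE_COMPLETE)
    While IE.ReadyState <> 4
        Sleep 10
    Wend

    Dim objHtml As MSHTML.HTMLDocument
    Dim collTitle As MSHTML.IHTMLElementCollection
    Dim objTitleElem As MSHTML.IHTMLTitleElement

    'Re-read the document once per second until the title matches
    Do
        Sleep 1000
        Set objHtml = IE.Document
        Set collTitle = objHtml.getElementsByTagName("title")
        Set objTitleElem = collTitle(0)

    Loop Until objTitleElem.Text = strDesiredText

    wait_for_html = objHtml.body.innerHTML

    IE.Quit 'Close the browser now that the HTML has been captured

End Function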

The code above requires references to Microsoft Internet Controls and Microsoft HTML Object Library (set in the VBA editor under Tools > References).
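
To connect this back to the original goal of storing the page as an HTML file, the string returned by wait_for_html could be written to disk roughly as follows. This is only a sketch: WrapInHtmlDocument and SaveHtmlBody are hypothetical helper names that do not appear in the question or the answers.

```vba
' Sketch only: WrapInHtmlDocument and SaveHtmlBody are illustrative names.

' Wraps a body fragment in minimal <html>/<body> tags.
Public Function WrapInHtmlDocument(ByVal bodyHtml As String) As String
    WrapInHtmlDocument = "<html><body>" & bodyHtml & "</body></html>"
End Function

' Writes the wrapped fragment to localFile, overwriting any existing file.
Public Sub SaveHtmlBody(ByVal bodyHtml As String, ByVal localFile As String)
    Dim fnum As Integer

    fnum = FreeFile()
    Open localFile For Output As #fnum
    Print #fnum, WrapInHtmlDocument(bodyHtml)
    Close #fnum
End Sub
```

A call might look like SaveHtmlBody wait_for_html(reportUrl, "Expected title"), "C:\Temp\report.html", where reportUrl, the title text, and the file path are all placeholders to be replaced with real values.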