Getting HTML Source with Excel-VBA
Original question: http://stackoverflow.com/questions/2520949/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Asked by l--''''''---------''''''''''''
I would like to point an Excel VBA form at certain URLs, get the HTML source, and store it in a string. Is this possible, and if so, how do I do it?
Answered by Gary McGill
Yes. One way to do it is to use the MSXML DLL. To do that, you need to add a reference to the Microsoft XML library via Tools->References.
Here's some code that displays the content of a given URL:
Public Sub ShowHTML(ByVal strURL As String)
    On Error GoTo ErrorHandler

    Dim strError As String
    strError = ""

    Dim oXMLHTTP As MSXML2.XMLHTTP
    Set oXMLHTTP = New MSXML2.XMLHTTP

    Dim strResponse As String
    strResponse = ""

    With oXMLHTTP
        .Open "GET", strURL, False   ' False = synchronous request
        .send ""
        If .Status <> 200 Then
            strError = .statusText
            GoTo CleanUpAndExit
        Else
            ' The header may carry a charset suffix, e.g. "text/html; charset=utf-8",
            ' so test for the media type instead of comparing the whole value.
            If InStr(1, .getResponseHeader("Content-Type"), "text/html", vbTextCompare) = 0 Then
                strError = "Not an HTML file"
                GoTo CleanUpAndExit
            Else
                strResponse = .responseText
            End If
        End If
    End With

CleanUpAndExit:
    On Error Resume Next ' Avoid recursive call to error handler
    ' Clean up code goes here
    Set oXMLHTTP = Nothing

    If Len(strError) > 0 Then ' Report any error
        MsgBox strError
    Else
        MsgBox strResponse
    End If
    Exit Sub

ErrorHandler:
    strError = Err.Description
    Resume CleanUpAndExit
End Sub
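Since the question asks for the source to end up in a string rather than a message box, the same request can be wrapped in a function. The following is only a minimal sketch under the same assumption as the answer (a reference to the Microsoft XML library has been added); the names GetHTML and DemoGetHTML are hypothetical and are not part of the original answer.

```vba
' Hypothetical helper (not from the original answer): returns the response
' body as a String, or an empty string if the request fails.
' Assumes a reference to the Microsoft XML library via Tools->References.
Public Function GetHTML(ByVal strURL As String) As String
    On Error GoTo ErrorHandler

    Dim oXMLHTTP As MSXML2.XMLHTTP
    Set oXMLHTTP = New MSXML2.XMLHTTP

    With oXMLHTTP
        .Open "GET", strURL, False   ' False = wait for the response
        .send
        ' Only keep the body when the server reports success
        If .Status = 200 Then GetHTML = .responseText
    End With

CleanUpAndExit:
    Set oXMLHTTP = Nothing
    Exit Function

ErrorHandler:
    GetHTML = ""
    Resume CleanUpAndExit
End Function

' Example call: store the page source in a string variable
Public Sub DemoGetHTML()
    Dim strSource As String
    strSource = GetHTML("http://stackoverflow.com/questions/2520949/")
    Debug.Print Len(strSource) & " characters received"
End Sub
```

Returning an empty string on failure keeps the caller simple, but it hides the reason for the failure; a caller that needs the reason would have to report Err.Description or the status text instead.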
Answered by OneOfTheUnemployed
Just an addition to the above response. The question was how to get the HTML source, which the stated answer does not actually provide.
Compare the contents of oXMLHTTP.responseText with the source code in a browser for URL "http://finance.yahoo.com/q/op?s=T+Options". They do not match and even the returned values are different. (This should be executed after hours to avoid changes during the trading day.)
If I find a way to perform this task the basic code will be posted.
Answered by ashleedawg
Compact getHTTP function
Below is a compact, generic function that returns the HTTP response from a specified URL. It can be used, for example, to:
- return the HTML source of a web page,
- return the JSON response from an API URL,
- parse a text file at a URL, etc.
This does not require any VBA References, since MSXML2 is used as a late-bound object.
Public Function getHTTP(ByVal url As String) As String
    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", url, False: .Send
        getHTTP = StrConv(.responseBody, vbUnicode)
    End With
End Function
Note that this basic function has no validation or error handling, as those are the parts that can vary considerably depending on which URL you're hitting.
If desired, check the value of .Status (after the .Send) for success codes such as 0 or 200. You can also set up an error trap with On Error GoTo... (never Resume Next!)
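As an illustration of the status check and error trap described above, here is a hedged sketch of a checked variant. The name getHTTPChecked, the choice to return an empty string on failure, and the use of Debug.Print for reporting are all assumptions made for this example, not part of the original answer.

```vba
' Hypothetical variant of getHTTP with a status check and an error trap.
' Late-bound, so no VBA References are required.
Public Function getHTTPChecked(ByVal url As String) As String
    Dim http As Object

    On Error GoTo RequestFailed      ' trap errors instead of using Resume Next

    Set http = CreateObject("MSXML2.XMLHTTP")
    http.Open "GET", url, False      ' False = synchronous request
    http.Send

    Select Case http.Status
        Case 0, 200                  ' success codes mentioned in the answer
            getHTTPChecked = StrConv(http.responseBody, vbUnicode)
        Case Else                    ' any other status: report it, return ""
            Debug.Print "HTTP " & http.Status & " " & http.statusText
    End Select
    Exit Function

RequestFailed:
    ' Reached when the request itself fails (for example, an unreachable host)
    Debug.Print "Request failed: " & Err.Description
    getHTTPChecked = ""
End Function
```

Which status codes count as success depends on the URL being requested, so the Case list is the part most likely to need adjusting.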
Example Usage:
This procedure scrapes this Stack Overflow page for the current score of this question.
Sub demo_getVoteCount()
    ' Declared as a string literal so it can be concatenated into the URL and search text
    Const answerID As String = "2522760"
    Const url_SO As String = "https://stackoverflow.com/a/" & answerID
    Dim html As String, startPos As Long, voteCount As Variant

    html = getHTTP(url_SO)                                   'get html from url

    startPos = InStr(html, "answerid=""" & answerID)         'locate this answer
    startPos = InStr(startPos, html, "vote-count-post")      'locate vote count
    startPos = InStr(startPos, html, ">") + 1                'locate value
    voteCount = Mid(html, startPos, InStr(startPos, html, "<") - startPos) 'extract score

    MsgBox "Answer #" & answerID & " has a score of " & voteCount & "."
End Sub
(Of course, in reality there are far better ways to get the score of an answer than the example above, such as this way.)

