Trying to extract ONE value from a webpage with VBA in Excel
Notice: this content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license, cite the original address, and attribute it to the original authors (not me): Stack Overflow
Original question: http://stackoverflow.com/questions/23594067/
Asked by Rycket
I've been trying to find this information for a couple of days now, but every example I've found shows only a small piece of the code, and I need all of it =)
What I want to do is extract one value from a web page and put it into a cell in Excel (and then take another value from another page on the same site and put it in the next cell, and so on).
The site is a Swedish stock-exchange site, and the page I've used as a test page is the one for the stock "Investor B" (https://www.avanza.se/aktier/om-aktien.html/5247/investor-b)
The value I'm interested in is the one labelled "Senast" (this is the page markup surrounding it):
<li>
<span class="XSText">Senast<br/></span>
<span class="lastPrice SText bold"><span class="pushBox roundCorners3" title="Senast uppdaterad: 17:29:59">248,60</span></span>
</li>
And it's the value 248,60 I'm after!
I have some coding experience, but not with VBA. After reading some forum posts (mostly here), I tried a few examples myself but couldn't get any of them to work. Since I'm a beginner with VBA, I might have got the structure wrong, so please keep it basic and be patient with me. This was my test, and it fails with "Runtime error 429: ActiveX component can't create object".
I might be totally on the wrong track.
Private Sub CommandButton1_Click()
    Dim ie As Variant
    Set ie = CreateObject("InternetExplorer")
    ie.navigate "https://www.avanza.se/aktier/om-aktien.html/5247/investor-b"
    ie.Visible = True
    Do
        DoEvents
    Loop Until ie.readyState = READYSTATE_COMPLETE
    Application.Wait (Now() + TimeValue("00:00:016")) ' For internal page refresh or loading
    Dim doc As Variant 'variable for document or data which need to be extracted out of webpage
    Set doc = CreateObject("HTMLDocument")
    Set doc = ie.document
    Dim dd As Variant
    dd = doc.getElementsByClassName("lastPrice SText bold")(0).innerText
    MsgBox dd
End Sub
EDIT 2014-05-12 17:05: the code currently being tested
Under the button command:
Private Sub CommandButton1_Click()
    Dim IE As Object
    ' Create InternetExplorer Object
    Set IE = CreateObject("InternetExplorer.Application")
    ' You can uncomment Next line To see form results
    IE.Visible = False
    ' Send the form data To URL As POST binary request
    IE.Navigate "https://www.avanza.se/aktier/om-aktien.html/5247/investor-b"
    ' Statusbar
    Application.StatusBar = "Loading, Please wait..."
    ' Wait while IE loading...
    'Do While IE.Busy
    '    Application.Wait DateAdd("s", 1, Now)
    'Loop
    'this should go from ready-busy-ready
    IEWait IE
    Application.StatusBar = "Searching for value. Please wait..."
    ' Dim Document As HTMLDocument
    ' Set Document = IE.Document
    Dim dd As Variant
    dd = IE.Document.getElementsByClassName("lastPrice SText bold")(0).innerText
    MsgBox dd
    ' Show IE
    IE.Visible = True
    ' Clean up
    Set IE = Nothing
    Set objElement = Nothing
    Set objCollection = Nothing
    Application.StatusBar = ""
End Sub
And in Module1:
Public Declare Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)

Public Function IEWait(p_ieExp As InternetExplorer)
    'this should go from ready-busy-ready
    Dim initialReadyState As Integer
    initialReadyState = p_ieExp.ReadyState
    'wait 250 ms until it's done
    Do While p_ieExp.Busy Or p_ieExp.ReadyState <> READYSTATE_COMPLETE
        Sleep 250
    Loop
End Function
As I said earlier, I don't know whether I got the structure right with this latest addition. I'm not very experienced in this kind of coding, I'm afraid.
Best Regards
End of edit 2014-05-12 17:08
Answered by Portland Runner
You are close, but you have a couple of small errors.
Here is how I would set it up (tested):
Private Sub CommandButton1_Click()
    Dim IE As Object

    ' Create the InternetExplorer object
    Set IE = CreateObject("InternetExplorer.Application")

    ' Set this to True to watch the browser while the page loads
    IE.Visible = False

    ' URL to get data from
    IE.Navigate "https://www.avanza.se/aktier/om-aktien.html/5247/investor-b"

    ' Status bar
    Application.StatusBar = "Loading, Please wait..."

    ' Wait while IE is loading (ReadyState 4 = READYSTATE_COMPLETE)
    Do While IE.Busy Or IE.ReadyState <> 4
        Application.Wait DateAdd("s", 1, Now)
    Loop

    Application.StatusBar = "Searching for value. Please wait..."

    Dim dd As String
    dd = IE.Document.getElementsByClassName("lastPrice SText bold")(0).innerText
    MsgBox dd

    ' Show IE
    IE.Visible = True

    ' Clean up
    Set IE = Nothing
    Application.StatusBar = ""
End Sub
Results: (screenshot not preserved in this copy)
Tested in Excel 2010 with the following references: (screenshot of the references list not preserved in this copy)
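For context, the "Runtime error 429" reported in the question is consistent with the first attempt's CreateObject("InternetExplorer") call: CreateObject needs a registered ProgID, and the code above uses "InternetExplorer.Application" instead. A minimal sketch of a creation step that reports the failure rather than halting, where the names TryCreateIE and DemoCreateIE are hypothetical and not part of the original answer:

```vba
Option Explicit

' Hypothetical helper: returns an InternetExplorer automation object,
' or Nothing if the object cannot be created on this machine.
Public Function TryCreateIE() As Object
    On Error Resume Next    ' a failed CreateObject leaves the result as Nothing
    Set TryCreateIE = CreateObject("InternetExplorer.Application")
    On Error GoTo 0
End Function

' Hypothetical usage: stop early with a readable message on failure.
Public Sub DemoCreateIE()
    Dim IE As Object
    Set IE = TryCreateIE()
    If IE Is Nothing Then
        MsgBox "Could not create InternetExplorer.Application"
        Exit Sub
    End If
    IE.Quit                 ' close the browser instance again
    Set IE = Nothing
End Sub
```

This only covers object creation; navigation and extraction stay as in the answer's code.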
Edit - Option B
To get rid of a possible "Run-Time Error '91'", try changing a few lines like this:
Dim dd As Variant
Set dd = IE.Document.getElementsByClassName("lastPrice SText bold")
MsgBox dd(0).textContent
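Error 91 ("Object variable or With block variable not set") typically means the code used an object reference that was Nothing, which can happen here if the class lookup finds no element, for example because the page had not finished loading. A hedged sketch of a guard that checks the collection before indexing it; FirstTextByClass is a hypothetical helper name, and doc is assumed to be the IE.Document object from the code above:

```vba
Option Explicit

' Hypothetical helper: returns the text of the first element whose class
' attribute matches className, or an empty string if nothing matches.
Public Function FirstTextByClass(ByVal doc As Object, _
                                 ByVal className As String) As String
    Dim matches As Object
    Set matches = doc.getElementsByClassName(className)

    ' Leave the result empty instead of indexing an empty collection
    If matches Is Nothing Then Exit Function
    If matches.Length = 0 Then Exit Function

    FirstTextByClass = matches(0).innerText
End Function
```

It could be called as MsgBox FirstTextByClass(IE.Document, "lastPrice SText bold"); an empty result then signals that the element was not found.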
Edit - Option C
Yet another way to get elements:
Dim tag
Dim tags As Object
Set tags = IE.Document.getElementsByTagName("*")
For Each tag In tags
    If tag.className = "lastPrice SText bold" Then
        MsgBox tag.innerText
        Exit For
    End If
Next tag
(All three methods have been tested on Excel 2010 and IE10)
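The loading wait is the step that both the question and the answer keep adjusting. A minimal late-bound sketch of a wait with a timeout follows; it is an illustration under stated assumptions rather than part of the tested answer. WaitForPage and IE_READYSTATE_COMPLETE are hypothetical names, the 30-second default is arbitrary, and 4 is the numeric value of READYSTATE_COMPLETE:

```vba
Option Explicit

' Numeric value of READYSTATE_COMPLETE, declared locally so that no
' reference to the Microsoft Internet Controls library is required.
Private Const IE_READYSTATE_COMPLETE As Long = 4

' Hypothetical helper: returns True once the page has finished loading,
' or False if timeoutSeconds elapse first.
Public Function WaitForPage(ByVal browser As Object, _
                            Optional ByVal timeoutSeconds As Long = 30) As Boolean
    Dim deadline As Date
    deadline = DateAdd("s", timeoutSeconds, Now)

    Do While browser.Busy Or browser.ReadyState <> IE_READYSTATE_COMPLETE
        If Now > deadline Then Exit Function   ' result stays False
        DoEvents                               ' keep Excel responsive
    Loop

    WaitForPage = True
End Function
```

Because the parameter is declared As Object, this works with the late-bound IE variable used in the answer, for example If Not WaitForPage(IE) Then MsgBox "Timed out".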
Answered by Rycket
I just wanted to add the code I'm currently running, which works fine at the moment, in case other people run into the same problem. It puts two values into dedicated cells.
Private Sub CommandButton10_Click()
    Dim IE As Object
    Dim dd As Variant

    ' Create the InternetExplorer object from a class ID
    Set IE = GetObject("new:{D5E8041D-920F-45e9-B8FB-B1DEB82C6E5E}")
    IE.Visible = False

    ' First page: load it, wait, then read the value into Y2
    ' (IEWait is the helper shown in Module1 in the question)
    IE.Navigate "https://www.avanza.se/aktier/om-aktien.html/52476/alk-abell-b"
    Application.StatusBar = "Loading, Please wait..."
    IEWait IE
    Application.StatusBar = "Searching for value. Please wait..."
    dd = IE.Document.getElementsByClassName("lastPrice SText bold")(0).innerText
    Range("Y2").Value = dd

    ' Second page: same steps, value goes into Y3
    IE.Navigate "https://www.avanza.se/aktier/om-aktien.html/52380/alm--brand"
    Application.StatusBar = "Loading, Please wait..."
    IEWait IE
    Application.StatusBar = "Searching for value. Please wait..."
    dd = IE.Document.getElementsByClassName("lastPrice SText bold")(0).innerText
    Range("Y3").Value = dd

    ' Clean up
    Set IE = Nothing
    Application.StatusBar = ""
End Sub
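The scraped text uses a decimal comma (for example 248,60 in the question), so the cells above receive text. If a numeric value is wanted instead, a small conversion helper along these lines might do. This is a sketch that assumes a comma as decimal separator and spaces as optional thousands separators; PriceTextToDouble is a hypothetical name:

```vba
Option Explicit

' Hypothetical helper: converts text such as "248,60" into a Double.
' Assumes a decimal comma and optional spaces as thousands separators.
Public Function PriceTextToDouble(ByVal priceText As String) As Double
    Dim cleaned As String
    cleaned = Trim$(priceText)
    cleaned = Replace(cleaned, " ", "")          ' ordinary spaces
    cleaned = Replace(cleaned, ChrW$(160), "")   ' non-breaking spaces
    cleaned = Replace(cleaned, ",", ".")         ' decimal comma to period
    ' Val always reads a period as the decimal separator,
    ' regardless of the regional settings of the machine
    PriceTextToDouble = Val(cleaned)
End Function
```

In the code above it could be used as Range("Y2").Value = PriceTextToDouble(dd).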
If you want more data, just copy the part that starts with IE.Navigate "https://www.pagewhereyourdatayouwanttoextractis.com" and ends with Range("Y2").Value = dd, then change the URL and the target cell.
Of course, this only works if the page you want to extract data from has a structure similar to the one above.
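Instead of copying that block once per page, the same steps can be sketched as a loop over a list of URLs. This is an illustration rather than the code I'm running: it swaps in CreateObject("InternetExplorer.Application") and an inline wait, the names FetchLastPrices and IE_READYSTATE_COMPLETE are hypothetical, and it assumes every page uses the same "lastPrice SText bold" class:

```vba
Option Explicit

' Numeric value of READYSTATE_COMPLETE (hypothetical local constant)
Private Const IE_READYSTATE_COMPLETE As Long = 4

Public Sub FetchLastPrices()
    Dim urls As Variant
    Dim IE As Object
    Dim matches As Object
    Dim i As Long

    ' Pages to read; extend this list as needed
    urls = Array( _
        "https://www.avanza.se/aktier/om-aktien.html/52476/alk-abell-b", _
        "https://www.avanza.se/aktier/om-aktien.html/52380/alm--brand")

    Set IE = CreateObject("InternetExplorer.Application")
    IE.Visible = False

    For i = LBound(urls) To UBound(urls)
        Application.StatusBar = "Loading page " & (i - LBound(urls) + 1) & "..."
        IE.Navigate CStr(urls(i))

        ' Wait until the page has finished loading
        Do While IE.Busy Or IE.ReadyState <> IE_READYSTATE_COMPLETE
            DoEvents
        Loop

        ' First URL goes to Y2, the next to Y3, and so on;
        ' a cell is left untouched if the element is not found
        Set matches = IE.Document.getElementsByClassName("lastPrice SText bold")
        If matches.Length > 0 Then
            ActiveSheet.Range("Y2").Offset(i - LBound(urls), 0).Value = matches(0).innerText
        End If
    Next i

    ' Clean up
    IE.Quit
    Set IE = Nothing
    Application.StatusBar = ""
End Sub
```

Adding a page then only means adding one more URL to the array.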
Hope this can help some people out there.
Best Regards