Original question: http://stackoverflow.com/questions/22311579/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
VBA: How to get a hidden href within <td> tag
Asked by Cessna
I'm trying to get the scores from the soccer World Cups since 1930. The match-ups are listed at the link below, and clicking on a team pops up a new window with the summary I'm looking for. My problem is that in the HTML source code I don't see an <a> tag with the href inside it, as there usually is, so I'm trying to simulate a click event with the following VBA code:
Link: http://www.flashscore.com/soccer/world/world-cup-1990/results/
VBA code:
Sub Test()
    Dim URL As String
    Dim IE As InternetExplorer
    Dim HTMLdoc As HTMLDocument
    Dim TDelements As IHTMLElementCollection
    Dim TDelement As HTMLTableCell
    Dim i As Integer

    URL = "http://www.flashscore.com/soccer/world/world-cup-1990/results/"
    Set IE = New InternetExplorer
    With IE
        .Navigate URL
        .Visible = True
        While .Busy Or .ReadyState <> READYSTATE_COMPLETE: DoEvents: Wend
        Set HTMLdoc = .Document
    End With

    Set TDelements = HTMLdoc.getElementsByTagName("td")
    For Each TDelement In TDelements
        If TDelement.Title = "Click for match detail!" Then
            TDelement.Click
        End If
    Next
    IE.Quit
End Sub
Answered by Jerome Montino
I'd do this one differently, as what you're trying to achieve carries a massive overhead and might take you on a wild goose chase of links, actions, loops, and whatnot. Instead of emulating the actions, it's much better to emulate the results of the actions.
Take for example the Germany - Argentina match. When you click on it, a pop-up opens.
Of particular note here is the URL of the pop-up. It contains an odd-looking string of characters that is plausibly a unique identifier. The first thing to check is the tag for the row we just clicked in the table.
Check the row's id attribute: it closely resembles the address in our pop-up, albeit with four (4) extra characters at the beginning. Collapsing the row and cross-checking against other rows, we come to the conclusion that the rows all use a similar format.
Removing the g_1_ prefix from the id attribute of the tr tag gives us the unique identifier for every match. I'll take the sanitized id from the Italy - England match and put it into the generic URL. We'll do this in another tab, as the pop-up's address bar is locked.
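The id-to-URL step described above could be sketched in VBA roughly as follows. This is only a sketch under stated assumptions: the names StripRowPrefix and BuildMatchUrl are hypothetical, and both the base address and the "id/fragment" layout are assumptions, since the pop-up's address is not spelled out in the text. http://example.com/match/ is a placeholder; copy the real address from the pop-up instead.

```vba
Option Explicit

' Hypothetical helper: drop the four-character "g_1_" prefix from a row id.
' Returns the id unchanged when the prefix is absent.
Function StripRowPrefix(ByVal rowId As String) As String
    Const PREFIX As String = "g_1_"
    If Left$(rowId, Len(PREFIX)) = PREFIX Then
        StripRowPrefix = Mid$(rowId, Len(PREFIX) + 1)
    Else
        StripRowPrefix = rowId
    End If
End Function

' Hypothetical helper: join a base address, a match id and a page fragment.
' ASSUMPTION: the layout base & id & "/" & fragment is a guess; check it
' against the address shown in the pop-up before relying on it.
Function BuildMatchUrl(ByVal baseUrl As String, ByVal matchId As String, _
                       Optional ByVal fragment As String = "#match-summary") As String
    BuildMatchUrl = baseUrl & matchId & "/" & fragment
End Function

Sub DemoBuildMatchUrl()
    Dim matchId As String
    ' "04LfR6G7" is one of the ids listed in the follow-up edit.
    matchId = StripRowPrefix("g_1_04LfR6G7")
    ' The base address below is a placeholder, not the site's real one.
    Debug.Print BuildMatchUrl("http://example.com/match/", matchId)
    Debug.Print BuildMatchUrl("http://example.com/match/", matchId, "#h2h;overall")
End Sub
```

Keeping the fragment as an optional argument means the same helper covers both the match summary and the head-to-head page.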
Voila. The page is open in another tab, and from there you can get what you want from the match summary. You can even check the head-to-head by changing #match-summary to #h2h;overall.
Why does this matter?
Because it's safer. Since you're using Excel, you have the worksheets at your disposal. The safest way is to record the id attribute of each row into, say, a column. Then loop over that column, concatenating each value properly to produce the target URLs.
Then you loop over the URLs and get what you want from the exact pages you're looking for. There's no need for the original link/table at this point, since almost all the details are in the "summary" pages anyway.
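As a rough illustration of the column loop, the sketch below assumes the ids already sit in column A of the active sheet and writes one URL per id into column B. The Sub name BuildUrlColumn is hypothetical, and BASE_URL and the "/#match-summary" suffix are placeholders (assumptions) standing in for the address parts seen in the pop-up.

```vba
Option Explicit

' Sketch: read match ids from column A of the active sheet and write one
' URL per id into column B.
' ASSUMPTION: BASE_URL and the "/#match-summary" suffix are placeholders;
' replace them with the real address parts from the pop-up.
Sub BuildUrlColumn()
    Const BASE_URL As String = "http://example.com/match/"
    Dim ws As Worksheet
    Dim lastRow As Long
    Dim r As Long

    Set ws = ActiveSheet
    ' Last used row in column A.
    lastRow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row

    For r = 1 To lastRow
        ' Skip blank cells so no malformed URL is written.
        If Len(ws.Cells(r, "A").Value) > 0 Then
            ws.Cells(r, "B").Value = BASE_URL & ws.Cells(r, "A").Value & "/#match-summary"
        End If
    Next r
End Sub
```

Each URL in column B can then be passed to .Navigate in turn, one page at a time.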
Hope this helps and good luck.
Follow-up Edit
As I feel a bit guilty about not posting any code, here's my attempt at producing the id values for the World Cup section. Note that I took some liberties with the code, so modify it to better suit your approach. I will only be providing the way to get the id values. All else is up to you.
Sub Test()
    'Requires references to "Microsoft Internet Controls" and
    '"Microsoft HTML Object Library" (Tools > References).
    Dim URL As String
    Dim IE As New InternetExplorer
    Dim HTMLdoc As HTMLDocument
    Dim dictObj As Object: Set dictObj = CreateObject("Scripting.Dictionary")
    Dim tblSet As Object, mTbl As Object, tRows As Object, tRow As Object
    Dim tRowID As String
    Dim Key As Variant

    URL = "http://www.flashscore.com/soccer/world/world-cup-1990/results/"
    With IE
        .Navigate URL
        .Visible = True
        Do Until .ReadyState = READYSTATE_COMPLETE: DoEvents: Loop
        Set HTMLdoc = .Document
    End With

    With HTMLdoc
        'Target the set of tables.
        Set tblSet = .getElementById("fs-results")
        'Index starts at 0. World Cup Playoffs is (0), World Cup is (1),
        'Qualifiers is (2), etc... Targeting tbody right away, as there is
        'only one (1) tbody tag per table anyway.
        Set mTbl = tblSet.getElementsByTagName("tbody")(1)
        Set tRows = mTbl.getElementsByTagName("tr")
        With dictObj
            'If the value is not yet in the dictionary, store it.
            For Each tRow In tRows
                'Remove the first four (4) characters.
                tRowID = Mid(tRow.getAttribute("id"), 5)
                If Not .Exists(tRowID) Then
                    .Add tRowID, Empty
                End If
            Next tRow
        End With
    End With

    'Print out each of the ID values.
    For Each Key In dictObj
        Debug.Print Key
    Next Key
    Set IE = Nothing
End Sub
Which returns the following id values for the World Cup table (not the Playoffs!):
04LfR6G7
6RNojjg8
nww8zBgL
t0xCZU8R
dKY0Pgu3
jsY4ODf9
K4YcDdti
2qM7lD3j
bXvdGk2k
tdU0F9He
65hR8YfM
lUsW7hAS
jBqaxXO8
YZY3yivF
n1HjSn11
hvInTSoe
rHUdQZPd
xfS51jmd
ncyhHVnq
4b34kglp
O0kz7LJH
lpQE9EuG
UHlQLBvS
fNQAAfQ9
CKpewD92
EmJrU8Wl
KnT4ETW1
nRnnVlHr
b3T12WXk
jwUc3CIq
n9RDMiPL
pAjW714B
GznS8sk5
GWR6BzB3
UXW8NX9F
beX1CGec
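To record these values in a worksheet column rather than the Immediate window, something along these lines might work. It is a minimal sketch: the Sub name WriteKeysToColumn is hypothetical, and it assumes the dictionary is passed in and that column A of the active sheet is free to overwrite.

```vba
Option Explicit

' Sketch: write every key of a Scripting.Dictionary into column A of the
' active sheet, one key per row. The Sub name is hypothetical.
Sub WriteKeysToColumn(ByVal dictObj As Object)
    Dim Key As Variant
    Dim r As Long

    ' Store the cells as text so Excel does not reinterpret an id that
    ' happens to look like a number or a date.
    ActiveSheet.Columns("A").NumberFormat = "@"

    r = 1
    For Each Key In dictObj.Keys
        ActiveSheet.Cells(r, "A").Value = Key
        r = r + 1
    Next Key
End Sub
```

Calling WriteKeysToColumn dictObj in place of the Debug.Print loop in Test would put the ids in column A.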
Hope this helps you some more. Kindly make sure you read the comments. Good luck!