Disclaimer: this page reproduces a popular Stack Overflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/43458256/
Retrieving a <TD> tag from a table on a website using VBA and putting it into Excel
Asked by ShanayL
I am trying to retrieve information from a <TD> tag on a website.
It works, but I can't seem to get the text from the second <td> tag in a <TR> tag. I am using a conditional statement to reach the second tag, as this is the only way I have found that works. The code extracts information fine; I just can't figure out how to access that second <td> on the condition that I have found a match in the first <td>.
The actual HTML table looks like this:
<html>
<head></head>
<body>
  <table id="Table2">
    <tr>
      <td class="tSystemRight">System Name: -if this matches</td>
      <td class="tSystemLeft breakword">Windows3756 -I need this</td>
    </tr>
    <tr>
      <td class="tSystemRight">System Acronym: -if this matches</td>
      <td class="tSystemLeft breakword">WIN37 -I need this</td>
    </tr>
  </table>
</body>
</html>
The VBA script I have is:
excelRow = 2
For Each tr In msxml.tableRows
    cellCount = 1
    For Each TD In tr.getElementsByTagName("TD")
        If ((cellCount = 1) And (TD.innerText = "System Acronym:")) Then
            Worksheets("Data").Cells(excelRow, 2).value = Cells(1, 2)
        ElseIf ((cellCount = 1) And (TD.innerText = "System Name:")) Then
            Worksheets("Data").Cells(excelRow, 3).value = Cells(1, 2)
            cellCount = cellCount + 1
        End If
    Next
Next
This just displays System Name: and System Acronym: in the Excel sheet.
Accepted answer by barrowc
If you have a td element and you want to get the inner text of the next td in the row, use the nextSibling property, like this:
For Each tr In msxml.tableRows   ' outer loop as in the question
    cellCount = 1
    For Each td In tr.getElementsByTagName("TD")
        ' Only the first cell of a row is treated as a label
        If ((cellCount = 1) And (td.innerText = "System Acronym:")) Then
            Worksheets("Data").Cells(excelRow, 2).Value = td.NextSibling.innerText
        ElseIf ((cellCount = 1) And (td.innerText = "System Name:")) Then
            Worksheets("Data").Cells(excelRow, 3).Value = td.NextSibling.innerText
        End If
        cellCount = cellCount + 1   ' count every cell, not only matches
    Next
Next
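One caveat with nextSibling is that, depending on the parser and document mode, the next sibling of a td may be a whitespace text node and not the following cell. A minimal sketch of a guard is shown here; NextElementSibling and DemoNextElementSibling are hypothetical names, and the late-bound "htmlfile" document is only an assumption used to keep the example self-contained:

```vba
Option Explicit

' Hypothetical helper: returns the next sibling that is an element node,
' skipping text and comment nodes. Returns Nothing if there is none.
Function NextElementSibling(ByVal node As Object) As Object
    Dim sib As Object
    Set sib = node.nextSibling
    Do While Not sib Is Nothing
        If sib.nodeType = 1 Then Exit Do   ' 1 = element node
        Set sib = sib.nextSibling
    Loop
    Set NextElementSibling = sib
End Function

Sub DemoNextElementSibling()
    Dim doc As Object, td As Object, valueCell As Object

    ' Assumption: a late-bound MSHTML document stands in for the real page
    Set doc = CreateObject("htmlfile")
    doc.body.innerHTML = "<table id='Table2'><tr>" & _
                         "<td>System Name:</td> <td>Windows3756</td>" & _
                         "</tr></table>"

    For Each td In doc.getElementsByTagName("td")
        If Trim$(td.innerText) = "System Name:" Then
            Set valueCell = NextElementSibling(td)
            ' Guard against a label cell that has no value cell after it
            If Not valueCell Is Nothing Then Debug.Print valueCell.innerText
        End If
    Next td
End Sub
```

In the loop above, td.NextSibling.innerText could then be replaced by NextElementSibling(td).innerText, with a check for Nothing before the value is written to the sheet.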
Note that nothing in the given code changes the value of excelRow, so everything keeps being written into the same row. Also note that the HTML given has "System Name" first and "System Acronym" second, whereas the code seems to be structured to look for "System Acronym" first and "System Name" second.
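Both points can be handled by matching on the label text instead of relying on row order, and by letting the caller decide which worksheet row to fill. The following is only a sketch under assumptions: ColumnForLabel and WriteTableToRow are hypothetical names, tbl is assumed to be an HTML table object exposing Rows and Cells, and the column numbers 2 and 3 are taken from the question's code:

```vba
Option Explicit

' Hypothetical helper: maps a label cell's text to a worksheet column.
' Returns 0 for labels that should be ignored.
Function ColumnForLabel(ByVal label As String) As Long
    Select Case Trim$(label)
        Case "System Acronym:": ColumnForLabel = 2
        Case "System Name:":    ColumnForLabel = 3
        Case Else:              ColumnForLabel = 0
    End Select
End Function

' Hypothetical helper: writes the label/value pairs of one table
' into a single worksheet row, regardless of the order of the rows.
Sub WriteTableToRow(ByVal tbl As Object, ByVal ws As Worksheet, ByVal excelRow As Long)
    Dim r As Object
    Dim col As Long

    For Each r In tbl.Rows
        ' Skip rows that do not have both a label cell and a value cell
        If r.Cells.Length >= 2 Then
            col = ColumnForLabel(r.Cells(0).innerText)
            If col > 0 Then ws.Cells(excelRow, col).Value = Trim$(r.Cells(1).innerText)
        End If
    Next r
End Sub
```

A caller processing several pages would call WriteTableToRow once per table and add 1 to excelRow after each call, so that each system ends up on its own row.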
Answer by Scott Holtzman
I developed the following against a public website with a structure almost identical to yours (https://www.federalreserve.gov/releases/h3/current/).
Requires references to Microsoft Internet Controls and Microsoft HTML Object Library.
Option Explicit

Sub Test()
    Dim ie As New InternetExplorer
    Dim doc As HTMLDocument

    With ie
        .Visible = True
        .Navigate "https://www.federalreserve.gov/releases/h3/current/"

        'wait for IE to finish loading before reading the document
        Do While .Busy Or .readyState <> READYSTATE_COMPLETE
            DoEvents
        Loop

        Set doc = .Document

        Dim t As HTMLTable
        Dim r As HTMLTableRow
        Dim c As HTMLTableCell

        Set t = doc.getElementById("t1tg1")

        'loop through each row
        For Each r In t.Rows
            If r.Cells(0).innerText = "Mar. 2016" Then Debug.Print r.Cells(1).innerText
            'loop through each cell in the row
            'For Each c In r.Cells
            '    Debug.Print c.innerText
            'Next
        Next
    End With
End Sub
All that said, after setting your specific table as I have above, I suggest the following edit to your code (I have left out the cellCount check and other details):
For Each r In t.Rows
    'find out which cells hold the System Acronym label and its value, and adjust the Cells(n) indexes
    '(index 1 is the second cell, which matches the two-cell rows shown in the question)
    If r.Cells(0).innerText = "System Acronym:" Then Worksheets("Data").Cells(excelRow, 2).Value = r.Cells(1).innerText
Next