Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/43458256/

Retrieving a <TD> tag from a table on a website using VBA and putting it into Excel

html, excel, vba, excel-vba, conditional

Asked by ShanayL

I am trying to retrieve information from a <TD> tag on a website.

It works, but I can't seem to get the text from the second <td> tag in a <TR> row. Using a conditional statement to reach the second tag is the only approach I can see that works. The code extracts information fine; I just can't figure out how to access the second <td> once I have found a match in the first <td>.

The actual HTML table looks like this:

<html>
<head></head>
<body>
<table id="Table2">
<tr>
  <td class="tSystemRight">System Name: -if this matches</td>
  <td class="tSystemLeft breakword">Windows3756 -I need this</td>
</tr>
<tr>
  <td class="tSystemRight">System Acronym: -if this matches</td>
  <td class="tSystemLeft breakword">WIN37  -I need this</td>
</tr>
</table>
</body>
</html>

The VBA script I have is:

excelRow = 2

For Each tr In msxml.tableRows
cellCount = 1
   For Each TD In tr.getElementsByTagName("TD")
    If ((cellCount = 1) And (TD.innerText = "System Acronym:")) Then
       Worksheets("Data").Cells(excelRow, 2).value = Cells(1, 2)
    ElseIf ((cellCount = 1) And (TD.innerText = "System Name:")) Then
       Worksheets("Data").Cells(excelRow, 3).value = Cells(1, 2)
    cellCount = cellCount + 1
    End If
   Next
Next

This just displays "System Name:" and "System Acronym:" in the Excel sheet.

Accepted answer by barrowc

If you have a td element and you want to get the inner text of the next td in the row, use the nextSibling property, like this:

For Each tr In msxml.tableRows
    cellCount = 1
    For Each td In tr.getElementsByTagName("TD")
        ' Only the first cell of a row holds the label; the value is in the next cell
        If cellCount = 1 And td.innerText = "System Acronym:" Then
            Worksheets("Data").Cells(excelRow, 2).Value = td.NextSibling.innerText
        ElseIf cellCount = 1 And td.innerText = "System Name:" Then
            Worksheets("Data").Cells(excelRow, 3).Value = td.NextSibling.innerText
        End If
        ' Count every cell, whether or not it matched
        cellCount = cellCount + 1
    Next
Next

Note that nothing in the given code changes the value of excelRow, so everything will keep being written to the same row. Also note that the HTML given has "System Name" first and "System Acronym" second, whereas the code seems to be structured to look for "System Acronym" first and "System Name" second.
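One caveat, offered as an assumption and not something the answer states: depending on how the page is parsed, nextSibling may return a whitespace text node instead of the next td, and reading innerText from it would then fail. A minimal sketch of a helper that skips non-element nodes follows; the name NextElementSibling is hypothetical and not part of any library.

```vba
' Hypothetical helper (not from the original answer).
' Returns the next sibling that is an element node (nodeType = 1),
' skipping text and comment nodes. Returns Nothing if there is none.
Function NextElementSibling(ByVal node As Object) As Object
    Dim sib As Object
    Set sib = node.NextSibling
    Do While Not sib Is Nothing
        If sib.nodeType = 1 Then Exit Do
        Set sib = sib.NextSibling
    Loop
    Set NextElementSibling = sib
End Function
```

With this in place, td.NextSibling.innerText could be written as NextElementSibling(td).innerText, after checking that the result is not Nothing.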

Answer by Scott Holtzman

I developed the following against a public website whose structure is almost identical to yours (https://www.federalreserve.gov/releases/h3/current/).

It requires references to Microsoft Internet Controls and Microsoft HTML Object Library.

Option Explicit

Sub Test()

Dim ie As New InternetExplorer
Dim doc As New HTMLDocument

With ie

    .Visible = True
    .Navigate "https://www.federalreserve.gov/releases/h3/current/"

    'code to wait for IE to finish loading can go here; I skipped it since it is not the focus of the question

    Set doc = .Document

    Dim t As HTMLTable
    Dim r As HTMLTableRow
    Dim c As HTMLTableCell

    Set t = doc.getElementById("t1tg1")

    'loop through each row
    For Each r In t.Rows

        If r.Cells(0).innerText = "Mar. 2016" Then Debug.Print r.Cells(1).innerText

        'loop through each column in the row
        'For Each c In r.Cells

        '    Debug.Print c.innerText

        'Next

    Next

End With

End Sub
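
The load wait that the comment in the code mentions could be sketched as below. This is a common polling idiom and not something from the original answer; WaitForPage is a hypothetical name, and READYSTATE_COMPLETE comes from the Microsoft Internet Controls reference.

```vba
' Hypothetical helper: blocks until Internet Explorer reports the page as loaded.
' Assumes the Microsoft Internet Controls reference is set (for READYSTATE_COMPLETE).
' There is no timeout, so a page that never finishes loading will loop forever.
Sub WaitForPage(ByVal ie As InternetExplorer)
    Do While ie.Busy Or ie.readyState <> READYSTATE_COMPLETE
        DoEvents   ' keep Excel responsive while waiting
    Loop
End Sub
```

It would be called as WaitForPage ie after .Navigate and before reading .Document.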

All that said, after setting your specific table as I have above, I suggest the following edit to your code (I have left out the cellCount check and other details):

For Each r In t.Rows

    'Cells is zero-based: adjust the indexes to the columns that hold the label and the value
    If r.Cells(0).innerText = "System Acronym:" Then Worksheets("Data").Cells(excelRow, 2).Value = r.Cells(1).innerText

Next
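
Extending the same idea to both labels from the question might look like the sketch below. It assumes each row has the label in its first cell and the value in its second, as in the HTML from the question. WriteSystemFields is a hypothetical name, and the "Data" sheet and target columns are taken from the question's code.

```vba
' Sketch only: WriteSystemFields is a hypothetical name.
' Assumes references to Microsoft HTML Object Library (HTMLTable, HTMLTableRow).
Sub WriteSystemFields(ByVal t As HTMLTable, ByVal excelRow As Long)
    Dim r As HTMLTableRow
    For Each r In t.Rows
        ' Skip rows that do not have both a label cell and a value cell
        If r.Cells.Length >= 2 Then
            Select Case Trim$(r.Cells(0).innerText)
                Case "System Acronym:"
                    Worksheets("Data").Cells(excelRow, 2).Value = Trim$(r.Cells(1).innerText)
                Case "System Name:"
                    Worksheets("Data").Cells(excelRow, 3).Value = Trim$(r.Cells(1).innerText)
            End Select
        End If
    Next r
End Sub
```

Matching on the label text means the order of the rows in the table does not matter.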