如何使用 VBA 读取嵌套表 <td> 值?
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/13159667/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
How to read the nested table <td> values using VBA?
提问by Arup Rakshit
Anyone help me here by saying that how to read the nested html table values from a third party application using Excel VBA? as an example below i pasted a part of the HTML source where i want to read the all the value and want to store it into the excel.But in here all are nested tables they used,and the tables don't have any name also in the html source i have seen.
任何人在这里帮助我说如何使用 Excel VBA 从第三方应用程序读取嵌套的 html 表值?作为下面的示例,我粘贴了 HTML 源代码的一部分,我想在其中读取所有值并将其存储到 excel 中。在我看到的 html 源代码中。
<td>
<table cellspacing="1" cellpadding="0" class="data">
<tr class="colhead">
<th colspan="3">Expression</th>
</tr>
<tr class="colhead">
<th>Field</th>
<th>Operator</th>
<th>Answer</th>
</tr>
<tr class="rowLight">
<td width="40%">
Location Attributes: LOC - Sub Commodity
</td>
<td width="20%">
=
</td>
<td width="40%">
Abrasives
</td>
</tr>
<tr class="rowDark">
<td width="40%">
Location Attributes: LOC - Commodity Tier1
</td>
<td width="20%">
=
</td>
<td width="40%">
Advertising, Sales & Promotion
</td>
</tr>
Thanks, Arup
谢谢,阿鲁普
回答by user2216436
please find the code below :
请找到以下代码:
Option Explicit
Sub TableExample()
Dim IE As Object
Dim doc As Object
Dim strURL As String
strURL = "[URL="http://example.comwww.dectech.org/football/index.php"]http://example.com[/URL]" ' replace with URL of your choice
Set IE = CreateObject("InternetExplorer.Application")
With IE
'.Visible = True
.navigate strURL
Do Until .ReadyState = 4: DoEvents: Loop
Do While .Busy: DoEvents: Loop
Set doc = IE.Document
GetAllTables doc
.Quit
End With
End Sub
Sub GetAllTables(doc As Object)
' get all the tables from a webpage document, doc, and put them in a new worksheet
Dim ws As Worksheet
Dim rng As Range
Dim tbl As Object
Dim rw As Object
Dim cl As Object
Dim tabno As Long
Dim nextrow As Long
Dim I As Long
Set ws = Worksheets.Add
For Each tbl In doc.getElementsByTagName("TABLE")
tabno = tabno + 1
nextrow = nextrow + 1
Set rng = ws.Range("B" & nextrow)
rng.Offset(, -1) = "Table " & tabno
For Each rw In tbl.Rows
For Each cl In rw.Cells
rng.Value = cl.outerText
Set rng = rng.Offset(, 1)
I = I + 1
Next cl
nextrow = nextrow + 1
Set rng = rng.Offset(1, -I)
I = 0
Next rw
Next tbl
ws.Cells.ClearFormats
End Sub
回答by SeanC
This is how I read an HTML table:
这是我阅读 HTML 表格的方式:
Sub ReadHTMLtable()
Dim htmldb As New ADODB.Connection
Dim htmlcmd As New ADODB.Command
Dim rs As New ADODB.Recordset
With htmldb
.Provider = "Microsoft.Jet.OLEDB.4.0"
.ConnectionString = "Data Source=Z:\filename.html;Extended Properties=""HTML Import;HDR=YES;IMEX=1"""
.Open
End With
Set htmlcmd.ActiveConnection = htmldb
htmlcmd.CommandType = adCmdText
htmlcmd.CommandText = "Select * from [table]"
rs.CursorLocation = adUseClient
rs.CursorType = adOpenDynamic
rs.LockType = adLockOptimistic
rs.Open htmlcmd
'process rs here
End Sub
this uses ADO, but it should be the same for DAO
这使用 ADO,但它应该与 DAO 相同
回答by Brad
This requires references to be set to Microsoft HTML Object Library
and Microsoft Internet Controls
这需要将引用设置为Microsoft HTML Object Library
和Microsoft Internet Controls
Sub Extract_TD_text()
Dim URL As String
Dim IE As InternetExplorer
Dim HTMLdoc As HTMLDocument
Dim TDelements As IHTMLElementCollection
Dim TDelement As HTMLTableCell
Dim r As Long
'Saved from www vbaexpress com/forum/forumdisplay.php?f=17
URL = "your file pathe/URL"
Set IE = New InternetExplorer
With IE
.navigate URL
.Visible = True
'Wait for page to load
While .Busy Or .readyState <> READYSTATE_COMPLETE: DoEvents: Wend
Set HTMLdoc = .document
End With
Set TDelements = HTMLdoc.getElementsByTagName("TD")
Sheet1.Cells.ClearContents
r = 1
For Each TDelement In TDelements
'Look for required TD elements - this check is specific to VBA Express forum - modify as required
If TDelement.ParentNode.ParentNode.ParentNode.Title = "table you want's title" Then
Sheet1.Range("A1").Offset(r, 0).Formula = "=" & Chr(34) & TDelement.innerText & Chr(34)
r = r + 1
End If
Next
IE.Quit
Set IE = Nothing
End Sub
I based this off of this page
我基于此页面
回答by Rrgg
I looked all over for the answer to this question. I finally found the solution which was actually throuhg recording a macro. I know, you all think you are above this, but it is actually the best way. See the full post here: http://automatic-office.com/?p=344In short, you want to record the macro and go to data --> from web and navigate to your website and select the table you want.
I have used the above solutions "get element by id" type stuff in the past, and it is great for a few elements, but if you want a whole table, and you aren't super experienced, just record a macro.
don't tell your friends and then reformat it to look like your own work so no one knows you used the macro tool ;)
我四处寻找这个问题的答案。我终于找到了实际上是通过录制宏的解决方案。我知道,你们都认为自己在这之上,但这实际上是最好的方式。请参阅此处的完整帖子:http: //automatic-office.com/?p=344简而言之,您想要记录宏并转到数据 --> 从网络并导航到您的网站并选择您想要的表格。
我过去使用过上述解决方案“按 id 获取元素”类型的东西,它对少数元素非常有用,但是如果你想要一个完整的表格,而且你不是非常有经验,只需录制一个宏。不要告诉你的朋友,然后重新格式化它看起来像你自己的作品,所以没有人知道你使用了宏工具;)
The code looks like this (including all the superfluous setting of default properties to their default values that recording a macro does for you... figure out which are extra and delete them
代码看起来像这样(包括将所有多余的默认属性设置为其默认值,录制宏为您所做的......找出哪些是多余的并删除它们
Sub Macro1()
With ActiveSheet.QueryTables.Add(Connection:= _
“URL;http://w1.weather.gov/obhistory/KFRI.html”, Destination:=Range(“$D″))
.Name = “KFRI”
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlSpecifiedTables
.WebFormatting = xlWebFormattingNone
.WebTables = “4″
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
End Sub
Enjoy
享受
回答by ozmike
This is yet another way. The tricky thing is if you have a nested table you must get parent container whether a div, table. Use F12 developer tools in IE or in Chrome and work your way down. In the example below there is a div which contains a table. In the example given in the question there is a table containing a table so you would have to find that parent table and use code similar to this to get the child table. Hope that helps.
这是另一种方式。棘手的是,如果您有一个嵌套表,则必须获取父容器,无论是 div、表。在 IE 或 Chrome 中使用 F12 开发人员工具并按照自己的方式工作。在下面的示例中,有一个包含表格的 div。在问题中给出的示例中,有一个包含表的表,因此您必须找到该父表并使用与此类似的代码来获取子表。希望有帮助。
stabledata = ""
Set oTbl = odiv.getElementsByTagName("TABLE").Item(0)
Set oThead = odiv.getElementsByTagName("THEAD").Item(0)
Set oTRows = oThead.getElementsByTagName("TR").Item(0)
Set oTds = oTRows.getElementsByTagName("TH")
For Each oTd In oTds
DoEvents
stabledata = stabledata & oTd.innertext & Chr(9)
Next oTd
stabledata = stabledata & vbCrLf
Set oTBody = odiv.getElementsByTagName("TBODY").Item(0)
Set oTRows = oTBody.getElementsByTagName("TR")
For Each oTRow In oTRows
DoEvents
Set oTds = oTRow.getElementsByTagName("TD")
For Each oTd In oTds
DoEvents
stabledata = stabledata & oTd.innertext & Chr(9)
Next oTd
stabledata = stabledata & vbCrLf
Next oTRow