Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/44902558/

Date: 2020-09-12 12:50:31  Source: igfitidea

Accessing object in iframe using VBA

html excel vba iframe web-scraping

Asked by user1

To the point:

I have successfully used VBA to do the following:

  • Log in to a website using getElementsByName

  • Select parameters for the report that will be generated (using getelementsby...)

  • Generate the report after selecting parameters, which renders the resulting dataset into an iframe on the same page

Important to note - The website is client-side

The above was the simple part; the difficult part is below:

Clicking on a gif image within the iframe that exports the dataset to a csv.

I have tried the following:

Dim idoc As HTMLDocument
Dim iframe As HTMLFrameElement
Dim iframe2 As HTMLDocument

Set idoc = objIE.document
Set iframe = idoc.all("iframename")
Set iframe2 = iframe.contentDocument

    Do Until InStr(1, objIE.document.all("iframename").contentDocument.innerHTML, "img.gif", vbTextCompare) = 0
        DoEvents
    Loop

To give some context to the logic above -

  • I accessed the main frame
  • I accessed the iframe by its name attribute
  • I accessed the content within the iframe
  • I attempted to find the gif image that needs to be clicked to export to csv

It is at this last step that it trips up, saying "Object doesn't support this property or method".

I also tried accessing the iframe gif through the a element and its href attribute, but this totally failed. I also tried grabbing the image from its source URL, but all this does is take me to the page the image is from.

Note: the iframe does not have an ID and, strangely, the gif image does not have an "onclick" attribute/event.

Final consideration - attempted scraping the iframe using R

Accessing the HTML node of the iframe was simple; however, trying to access the attributes of the iframe and subsequently the nodes of the table proved unsuccessful. All it returned was "Character(0)".

library(rvest)
library(magrittr)

Blah <-read_html("web address redacted") %>%
  html_nodes("#iframe")%>%
  html_nodes("#img")%>%
  html_attr("#src")%>%
  #read_html()%>%
  head()
Blah

As soon as I include read_html, the script returns the following error:

Error in if (grepl("<|>", x)) { : argument is of length zero

I suspect this is referring to the Character(0)

Appreciate any guidance here!

Many Thanks,

HTML

<div align="center"> 
    <table id="table1" style="border-collapse: collapse" width="700" cellspacing="0" cellpadding="0" border="0"> 
        <tbody>
            <tr>
                <td colspan="6"> &nbsp;</td>
            </tr> 
            <tr> 
                <td colspan="6"> 
                    <a href="href redacted">
                        <img src="img.gif" width="38" height="38" border="0" align="right">
                    </a>
                    <strong>x - </strong>
                </td>
            </tr> 
        </tbody>
    </table>
</div>

Accepted answer by dee

Working with iframes is sometimes tricky. Based on the html you provided I have created this example, which works locally, but would it work for you as well?

To get to the IFrame, the frames collection can be used. Hope you know the name of the IFrame?

Dim iframeDoc As MSHTML.HTMLDocument
Set iframeDoc = doc.frames("iframename").document

Then, to get to the image, we can use the querySelector method, e.g. like this:

Dim img As MSHTML.HTMLImg
Set img = iframeDoc.querySelector("div table[id='table1'] tbody tr td a[href^='https://stackoverflow.com'] img")

The selector a[href^='https://stackoverflow.com'] selects an anchor whose href attribute starts with the given text. The ^ denotes the beginning.

Then, when we have the image, all that is left is a simple call to click on its parent, which is the desired anchor. HTH
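
In the question the href of the link is redacted, so a selector built on the link target may not be usable there. One alternative, sketched here under the assumption that the image file name img.gif from the question is stable, is to match the image by the end of its src attribute with the $= operator. ClickExportImage is a hypothetical name, and iframeDoc is assumed to be the iframe document obtained through the frames collection as described above.

```vba
Option Explicit

' Requires a reference to Microsoft HTML Object Library.
' Hypothetical helper: finds the export image inside the iframe document and
' clicks the anchor that wraps it.
Public Sub ClickExportImage(ByVal iframeDoc As MSHTML.HTMLDocument)
    Dim img As MSHTML.HTMLImg

    ' $= matches the end of an attribute value, so this finds an <img> inside
    ' a link whose src ends with "img.gif" (file name taken from the question).
    Set img = iframeDoc.querySelector("a img[src$='img.gif']")
    If img Is Nothing Then
        MsgBox "No image ending in img.gif was found inside a link."
        Exit Sub
    End If

    ' The click target is the enclosing anchor, not the image itself.
    img.parentElement.Click
End Sub
```

Matching on the image rather than on the link keeps the selector independent of the export URL, at the cost of breaking if the site renames the image.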

Complete example:

Option Explicit

' Add reference to Microsoft Internet Controls (SHDocVw)
' Add reference to Microsoft HTML Object Library

Sub Demo()

    Dim ie As SHDocVw.InternetExplorer
    Dim doc As MSHTML.HTMLDocument
    Dim url As String

    url = "file:///C:/Users/dusek/Documents/My Web Sites/mainpage.html"
    Set ie = New SHDocVw.InternetExplorer
    ie.Visible = True
    ie.navigate url

    While ie.Busy Or ie.readyState <> READYSTATE_COMPLETE
        DoEvents
    Wend

    Set doc = ie.document

    Dim iframeDoc As MSHTML.HTMLDocument
    Set iframeDoc = doc.frames("iframename").document
    If iframeDoc Is Nothing Then
        MsgBox "IFrame with name 'iframename' was not found."
        ie.Quit
        Exit Sub
    End If

    Dim img As MSHTML.HTMLImg
    Set img = iframeDoc.querySelector("div table[id='table1'] tbody tr td a[href^='https://stackoverflow.com'] img")
    If img Is Nothing Then
        MsgBox "Image element within iframe was not found."
        ie.Quit
        Exit Sub
    Else
        img.parentElement.Click
    End If

    ie.Quit
End Sub
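
The example above looks for the image as soon as the page has finished loading. In the question, the report is rendered into the iframe only after the parameters have been submitted, so the image may not exist yet when the selector runs. A minimal sketch of a polling helper follows. WaitForElement, its parameters and the 10 second default are hypothetical choices, not part of MSHTML or of the original answer.

```vba
Option Explicit

' Requires a reference to Microsoft HTML Object Library.
' Hypothetical helper: polls doc until cssSelector matches or timeoutSeconds pass.
' Returns the matched element, or Nothing on timeout.
Public Function WaitForElement(ByVal doc As MSHTML.HTMLDocument, _
                               ByVal cssSelector As String, _
                               Optional ByVal timeoutSeconds As Double = 10) As Object
    Dim started As Double
    Dim elapsed As Double
    Dim found As Object

    started = Timer
    Do
        Set found = doc.querySelector(cssSelector)
        If Not found Is Nothing Then Exit Do

        DoEvents                                        ' keep Excel responsive while waiting
        elapsed = Timer - started
        If elapsed < 0 Then elapsed = elapsed + 86400   ' Timer restarts at midnight
    Loop While elapsed < timeoutSeconds

    Set WaitForElement = found
End Function
```

It could be called as Set img = WaitForElement(iframeDoc, "a img[src$='img.gif']"), followed by the same Is Nothing check as in the example. If the site replaces the whole iframe document when the report is generated, the iframeDoc reference may go stale, in which case it would need to be fetched again from doc.frames inside the loop.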

Main page HTML used

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">

<head>
<!-- saved from url=(0016)http://localhost -->
<meta content="text/html; charset=utf-8" http-equiv="Content-Type" />
<title>x -</title>
</head>

<body>
<iframe name="iframename" src="iframe1.html">
</iframe>
</body>

</html>

IFrame HTML used (saved as file iframe1.html)

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">

<head>
<!-- saved from url=(0016)http://localhost -->
<meta content="text/html; charset=utf-8" http-equiv="Content-Type" />
<title>Untitled 2</title>
</head>

<body>
<div align="center"> 
    <table id="table1" style="border-collapse: collapse" width="700" cellspacing="0" cellpadding="0" border="0"> 
        <tbody>
            <tr>
                <td colspan="6"> &nbsp;</td>
            </tr> 
            <tr> 
                <td colspan="6"> 
                    <a href="https://stackoverflow.com/questions/44902558/accessing-object-in-iframe-using-vba">
                        <img src="img.gif" width="38" height="38" border="0" align="right">
                    </a>
                    <strong>x - </strong>
                </td>
            </tr> 
        </tbody>
    </table>
</div>

</body>

</html>

Answer by QHarr

I thought I would expand on the answer already given.

In the case of Internet Explorer you may have one of two common situations to handle regarding iframes.

1) src of iframe is subject to same origin policy restrictions:

The iframe src has a different origin from the landing page, in which case, due to the same-origin policy, attempts to access it will yield access denied.
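
To decide which of the two situations applies, the origin (scheme, host and port) of the landing page can be compared with that of the iframe src. The helpers below are a rough sketch with hypothetical names, OriginOf and IsSameOrigin. They work on plain strings only, do not normalise default ports such as :80, and treat a relative src as unknown, so the result is a hint rather than a faithful reimplementation of the browser's check.

```vba
Option Explicit

' Hypothetical helper: returns "scheme://host[:port]" in lower case,
' or an empty string when the URL has no "://" part (e.g. a relative src).
Public Function OriginOf(ByVal url As String) As String
    Dim schemeEnd As Long
    Dim i As Long
    Dim ch As String

    schemeEnd = InStr(1, url, "://")
    If schemeEnd = 0 Then Exit Function

    ' The origin ends at the first "/", "?" or "#" after the scheme separator.
    For i = schemeEnd + 3 To Len(url)
        ch = Mid$(url, i, 1)
        If ch = "/" Or ch = "?" Or ch = "#" Then Exit For
    Next i

    ' i is either the position of the delimiter or Len(url) + 1.
    OriginOf = LCase$(Left$(url, i - 1))
End Function

' Hypothetical helper: True only when both URLs have the same non-empty origin.
Public Function IsSameOrigin(ByVal pageUrl As String, ByVal frameUrl As String) As Boolean
    Dim pageOrigin As String

    pageOrigin = OriginOf(pageUrl)
    IsSameOrigin = (Len(pageOrigin) > 0) And (pageOrigin = OriginOf(frameUrl))
End Function
```

With an InternetExplorer object named ie, a call such as IsSameOrigin(ie.LocationURL, ie.document.querySelector("iframe").src) would compare the two; for example, "https://a.example/x" and "https://b.example/y" are reported as different origins.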

Resolution:

Consider using Selenium Basic to automate a different browser, such as Chrome, where CORS is allowed and you can switch to the iframe and continue working with the iframe document.

Example:

Option Explicit
'download selenium https://github.com/florentbr/SeleniumBasic/releases/tag/v2.0.9.0
'Ensure latest applicable driver e.g. ChromeDriver.exe in Selenium folder
'VBE > Tools > References > Add reference to selenium type library
Public Sub Example()
    Dim d As WebDriver
    Const URL As String = "https://www.rosterresource.com/mlb-roster-grid/"
    Set d = New ChromeDriver
    With d
        .Start "Chrome"
        .get URL
        .SwitchToFrame .FindElementByCss("iframe") '< pass the iframe element as the identifier argument
        ' .SwitchToDefaultContent ''to go back to parent document.
        Stop '<== delete me later
        .Quit
    End With
End Sub
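
Applied to the question, the same approach might continue after the switch by locating the export image inside the frame and clicking it. This is a sketch only: the URL is a placeholder, the frame name iframename and the file name img.gif are taken from the question, ClickImageInsideIframe is a hypothetical name, and the login and parameter selection steps are left out.

```vba
Option Explicit

' Requires a reference to the Selenium Type Library (SeleniumBasic)
' and a ChromeDriver.exe matching the installed Chrome version.
Public Sub ClickImageInsideIframe()
    Const URL As String = "https://example.com/report"   ' placeholder, not the real site
    Dim d As WebDriver

    Set d = New ChromeDriver
    With d
        .Start "Chrome"
        .Get URL

        ' Switch into the iframe by passing the iframe element itself.
        .SwitchToFrame .FindElementByCss("iframe[name='iframename']")

        ' Inside the frame, find the image within the link and click it.
        .FindElementByCss("a img[src$='img.gif']").Click

        ' Return to the parent document before doing anything else on the page.
        .SwitchToDefaultContent
        .Quit
    End With
End Sub
```

Once the driver has switched into the frame, element lookups are resolved against the iframe document, which is why the second FindElementByCss needs no reference to the frame.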


2) src of iframe is not subject to same origin policy restrictions:

Resolution:

Use the methods detailed in the answer already given. Additionally, you can extract the src of the iframe and .Navigate2 to that to access it:

.Navigate2 .document.querySelector("iframe").src

If you only want to work with the contents of the iframe, then simply do your initial .Navigate2 to the iframe src and don't even visit the initial landing page.

Example:

Option Explicit
Public Sub NavigateUsingSrcOfIframe()
    Dim IE As New InternetExplorer
    With IE
        .Visible = True
        .Navigate2 "http://www.bursamalaysia.com/market/listed-companies/company-announcements/5978065"

        While .Busy Or .readyState < 4: DoEvents: Wend

        .Navigate2 .document.querySelector("iframe").src

        While .Busy Or .readyState < 4: DoEvents: Wend

        Stop '<== delete me later
        .Quit
    End With
End Sub


3) iframe in ShadowRoot

An unlikely case might be an iframe in a shadowRoot. You should really have one or the other, and not one within the other.

Resolution:

In that case you need an additional accessor of

Element.shadowRoot.querySelector("iframe").contentDocument

where Element is your parent element with shadowRoot attached. This method will only work if the shadowRoot mode is set to Open.

Example:

To follow
