vba 自动化错误：800706B5、80004005、80010108 出现用于内部 SAP 站点抓取

Question

提问by derigible

I am writing a macro that will scrape my company's internal SAP site for vendor information. For several reasons I have to use VBA to do so. However, I cannot figure out why I keep getting these three errors when I attempt to scrape the page. Is it possible that this has something to do with the UAC integrity model? Or is there something wrong with my code? Is it possible for a webpage using http can be handled differently in internet explorer? I am able to go to any webpage, even other internal webpages, and can scrape each of those just fine. But when i attempt to scrape the SAP page, i get these errors. The error descriptions and when they occur are:

我正在编写一个宏，它将抓取我公司的内部 SAP 站点以获取供应商信息。出于多种原因，我必须使用 VBA 来做到这一点。但是，当我尝试抓取页面时，我无法弄清楚为什么会不断收到这三个错误。这可能与UAC 完整性模型有关吗？还是我的代码有问题？是否可以在 Internet Explorer 中以不同方式处理使用 http 的网页？我可以访问任何网页，甚至其他内部网页，并且可以很好地抓取每个网页。但是当我尝试抓取 SAP 页面时，我收到了这些错误。错误描述及其发生时间是：

800706B5 - The interface is unknown (occurs when I place breakpoints before running the offending code)

800706B5 - 接口未知（在运行违规代码之前放置断点时发生）

80004005 - Unspecified error (occurs when I don't place any errors and just let the macro run)

80004005 - 未指定的错误（当我没有放置任何错误并让宏运行时发生）

80010108 - The Object invoked has disconnected from its clients. (I can't seem to get a consistent occurrence of this error, it seems to happen around the time that something in excel is so corrupted that no page will load and i have to reinstall excel)

80010108 - 调用的对象已与其客户端断开连接。（我似乎无法始终如一地出现此错误，它似乎发生在 excel 中的某些内容已损坏以至于无法加载任何页面并且我必须重新安装 excel 的时候）

I have absolutely no idea what is going on. The Integrity page didn't make much sense to me, and all the research I found on this talked about connecting to databases and using ADO and COM references. However I am doing everything through Internet Explorer. Here is my relevant code below:

我完全不知道发生了什么。Integrity 页面对我来说没有多大意义，我在这方面找到的所有研究都谈到了连接到数据库以及使用 ADO 和 COM 引用。但是我通过 Internet Explorer 做所有事情。这是我的相关代码如下：

Private Sub runTest_Click()
   ie.visible = True
   doScrape
End Sub
'The code to run the module
Private Sub doTest()
   Dim result As String
   result = PageScraper.scrapeSAPPage("<some num>")
End Sub

PageScraper Module

PageScraper 模块

Public Function scrapeSAPPage(num As Long) As String
   'Predefined URL that appends num onto end to navigate to specific record in SAP
   Dim url As String: url = "<url here>" 
   Dim ie as InternetExplorer
   set ie = CreateObject("internetexplorer.application")
   Dim doc as HTMLDocument

   ie.navigate url 'Will always sucessfully open page, regardless of SAP or other
   'pauses the exection of the code until the webpage has loaded
   Do
     'Will always fail on next line when attempting SAP site with error
     If Not ie.Busy And ie.ReadyState = 4 Then 
        Application.Wait (Now + TimeValue("00:00:01"))
        If Not ie.Busy And ie.ReadyState = 4 Then
           Exit Do
        End If
     End If
     DoEvents
   Loop

   Set doc = ie.document 'After implementation of Tim Williams changes, breaks here
   'Scraping code here, not relevant

 End Function

I am using IE9 and Excel 2010 on a Windows 7 machine. Any help or insight you can provide would be greatly appreciated. Thank you.

我在 Windows 7 机器上使用 IE9 和 Excel 2010。您可以提供的任何帮助或见解将不胜感激。谢谢你。

Answer 1

采纳答案by Raymond

I do this type of scraping frequently and have found it very difficult to make IE automation work 100% reliably with errors like those you have found. As they are often timing issues it can be very frustrating to debug as they don't appear when you step through, only during live runs To minimize the errors I do the following:

我经常进行这种类型的抓取，并且发现很难使 IE 自动化 100% 可靠地工作，并且会出现您发现的错误。由于它们通常是计时问题，因此调试时可能会非常令人沮丧，因为它们不会在您逐步执行时出现，仅在实时运行期间为了尽量减少错误，我执行以下操作：

Introduce more delays; ie.busy and ie.ReadyState don't necessarily give valid answers IMMEDIATELY after an ie.navigate, so introduce a short delay after ie.navigate. For things I'm loading 1 to 2 seconds normally but anything over 500ms seems to work.

引入更多延迟；ie.busy 和 ie.ReadyState 在 ie.navigate 之后不一定立即给出有效答案，因此在 ie.navigate 之后引入一个短暂的延迟。对于我通常加载 1 到 2 秒的内容，但任何超过 500 毫秒的内容似乎都有效。

Make sure IE is in a clean state by going ie.navigate "about:blank" before going to the target url.

在转到目标 url 之前，通过 ie.navigate "about:blank" 确保 IE 处于干净状态。

After that you should have a valid IE object and you'll have to look at it to see what you've got inside. Generally I avoid trying to access the entire ie.document and instead use IE.document.all.tags("x") where 'x' is a suitable thing I'm looking for such as td or a.

之后，您应该拥有一个有效的 IE 对象，并且您必须查看它以了解里面的内容。通常，我避免尝试访问整个 ie.document，而是使用 IE.document.all.tags("x") ，其中 'x' 是我正在寻找的合适的东西，例如 td 或 a。

However after all these improvements although they have increased my success rate I still have errors at random.

然而，经过所有这些改进，虽然它们提高了我的成功率，但我仍然随机出现错误。

My real solution has been to abandon IE and instead do my work using xmlhttp.

我真正的解决方案是放弃 IE，而是使用 xmlhttp 来完成我的工作。

If you are parsing out your data using text operations on the document then it will be a no-brainer to swap over. The xmlhttp object is MUCH more reliable. and you just get the "responsetext" to access the entire html of the document.

如果您使用文档上的文本操作来解析您的数据，那么交换将是一件轻而易举的事。xmlhttp 对象更可靠。并且您只需获得“响应文本”即可访问文档的整个 html。

Here is a simplified version of what I'm using in production now for scraping, it's so reliable it runs overnight generating millions of rows without error.

这是我现在在生产中用于抓取的简化版本，它非常可靠，可以在一夜之间运行，生成数百万行而不会出错。

Public Sub Main()

Dim obj As MSXML2.ServerXMLHTTP
Dim strData As String
Dim errCount As Integer

' create an xmlhttp object - you will need to reference to the MS XML HTTP library, any version will do
' but I'm using Microsoft XML, v6.0 (c:\windows\system32\msxml6.dll)
Set obj = New MSXML2.ServerXMLHTTP

' Get the url - I set the last param to Async=true so that it returns right away then lets me wait in
' code rather than trust it, but on an internal network "false" might be better for you.
obj.Open "GET", "http://www.google.com", True
obj.send ' this line actually does the HTTP GET

' Wait for a completion up to 10 seconds
errCount = 0
While obj.readyState < 4 And errCount < 10
    DoEvents
    obj.waitForResponse 1 ' this is an up-to-one-second delay
    errCount = errCount + 1
Wend

If obj.readyState = 4 Then ' I do these on two
    If obj.Status = 200 Then ' different lines to avoid certain error cases
        strData = obj.responseText
    End If
End If

obj.abort  ' in real code I use some on error resume next, so at this point it is possible I have a failed
           ' get and so best to abort it before I try again

Debug.Print strData

End Sub

Hope that helps.

希望有帮助。

vba 自动化错误：800706B5、80004005、80010108 出现用于内部 SAP 站点抓取

提问by derigible

采纳答案by Raymond

相关推荐

最近更新

标签

vba 自动化错误：800706B5、80004005、80010108 出现用于内部 SAP 站点抓取

提问by derigible

采纳答案by Raymond

相关推荐

vba 在 Excel 中创建删除线宏

访问 VBA - 为声明为 long 的函数返回某种空白/空值

具有可变列宽的 Excel VBA 导入 TXT 文件

vba 循环内的 Case 语句

相关推荐

最近更新

标签