VBA: download / save a file from a web site / web page

Note: this page is a translated copy of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/15614498/

Time: 2020-09-11 20:13:43  Source: igfitidea

Download / Save file from Web Site / Web Page

Tags: vba, excel-vba, excel

Asked by maximladus

I need to download the PDF files for the first (top) five dates from the link below and save them somewhere, for instance on the Desktop. I have no clue how to start, and I couldn't find anything explicit on Google.

Do you think you can help me?

http://cetatenie.just.ro/ordine/articol-11/

Answered by user2185045

I would use Internet Explorer, and automate it using an SHDocVw.InternetExplorer object (VBA reference 'Microsoft Internet Controls', ieframe.dll).

You can either (a) create a new Internet Explorer window using Set x = New SHDocVw.InternetExplorer, or (b) acquire an existing Internet Explorer window using Set owins = CreateObject("Shell.Application").Windows (owins is a collection of open shell windows; loop through it until you find an item where Mid(TypeName(owins(i).Document), 1, 12) = "HTMLDocument").

Once you have an Internet Explorer object ie, you can call ie.Navigate(url) to go to a website.
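The two ways of getting hold of a window can be combined. The following is a minimal sketch, not a definitive implementation: the function name GetIE is made up for illustration, and it assumes the project has the 'Microsoft Internet Controls' reference set.

```vba
' Sketch: reuse an open Internet Explorer window if one exists, otherwise start a new one.
' GetIE is a hypothetical helper name; requires the "Microsoft Internet Controls" reference.
Function GetIE() As SHDocVw.InternetExplorer
    Dim owins As Object
    Dim w As Object
    Dim ie As SHDocVw.InternetExplorer

    Set owins = CreateObject("Shell.Application").Windows
    For Each w In owins
        ' File Explorer windows live in the same collection; skip anything
        ' whose Document is not an HTML document.
        If Not w Is Nothing Then
            If Mid(TypeName(w.Document), 1, 12) = "HTMLDocument" Then
                Set ie = w
                Exit For
            End If
        End If
    Next w

    ' No browser window found: create one and make it visible.
    If ie Is Nothing Then
        Set ie = New SHDocVw.InternetExplorer
        ie.Visible = True
    End If

    Set GetIE = ie
End Function
```

A caller would then write Set ie = GetIE() followed by ie.Navigate "http://cetatenie.just.ro/ordine/articol-11/".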

To wait for Internet Explorer to finish navigating before you interrogate it, you can run something like:

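' Poll once per second until IE reports that the page has finished loading
Do While ie.Busy Or ie.ReadyState <> 4   ' 4 = READYSTATE_COMPLETE
    Application.Wait DateAdd("s", 1, Now)
    DoEvents
Loop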

To get the URLs for the first five PDFs on that page, you'd need to examine the HTML of the page. There are two ways, depending on how well-formed the HTML is. If the HTML is well-written, then you can navigate the Document Object Model (the tags, like XML) with ie.Document.all(). But if the HTML is not well-formed, you may have to resort to reading the HTML from ie.Document.all(0).innerHTML.
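For the DOM route, one possible sketch is to walk the page's <a> elements and keep the first few whose link ends in ".pdf". The function name FirstPdfLinks and the maxCount parameter are hypothetical, and the sketch assumes that page order matches the "top five dates" and that ie has already finished navigating.

```vba
' Sketch: collect up to maxCount links ending in ".pdf", in page order.
' FirstPdfLinks and maxCount are illustrative names, not part of any library.
Function FirstPdfLinks(ie As Object, Optional maxCount As Long = 5) As Collection
    Dim result As New Collection
    Dim a As Object

    For Each a In ie.Document.getElementsByTagName("a")
        ' Anchors without an href yield an empty string, which fails this test harmlessly.
        If LCase$(Right$(a.href, 4)) = ".pdf" Then
            result.Add a.href
            If result.Count >= maxCount Then Exit For
        End If
    Next a

    Set FirstPdfLinks = result
End Function
```

If a date on the page lists more than one PDF, the count would have to be tracked per date instead of per link.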

By the looks of the link you gave, you will be looking for things like:

<li>Data de <strong>22.03.2013</strong>, numarul: <a href="/wp-content/uploads/Ordin-149P-din-22.03.2013.pdf">149P</a></li>
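If you end up working on the raw HTML string instead, a small helper keeps the InStr/Mid arithmetic in one place. This is only a sketch: NextHref is a made-up name, and it assumes attribute values are wrapped in double quotes, as in the line above.

```vba
' Sketch: return the value of the next href="..." at or after position startAt,
' and advance startAt past it so the function can be called in a loop.
' Returns "" when no further link is found. startAt must be 1 or greater.
Function NextHref(html As String, ByRef startAt As Long) As String
    Const MARKER As String = "href="""
    Dim p As Long
    Dim q As Long

    p = InStr(startAt, html, MARKER, vbTextCompare)
    If p = 0 Then Exit Function          ' no more links

    p = p + Len(MARKER)                  ' first character of the URL
    q = InStr(p, html, """")             ' closing quote
    If q = 0 Then Exit Function          ' malformed attribute

    NextHref = Mid$(html, p, q - p)
    startAt = q + 1
End Function
```

Called with startAt = 1 on the sample line, it returns /wp-content/uploads/Ordin-149P-din-22.03.2013.pdf. Note that this is a root-relative path, so the host still has to be prepended before downloading.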

Once you have isolated each PDF URL (using either the href attribute of the <a> tag in the DOM or lots of Mid() calls on the HTML), you can download it using:

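' Declare at the top of a standard module. On 64-bit Office (VBA7) the declaration
' needs "Declare PtrSafe Function", with LongPtr for pCaller and lpfnCB.
Private Declare Function URLDownloadToFile _
Lib "urlmon" _
Alias "URLDownloadToFileA" _
( _
    ByVal pCaller As Long, _
    ByVal szURL As String, _
    ByVal szFileName As String, _
    ByVal dwReserved As Long, _
    ByVal lpfnCB As Long _
) As Long

Sub DownloadExample()
    Dim ss As String    ' source URL
    Dim ts As String    ' target file path (the folder must already exist)
    ss = "http://blah/blah/blah.pdf"
    ts = "c:\meh\blah.pdf"
    ' URLDownloadToFile returns 0 (S_OK) on success
    If URLDownloadToFile(0, ss, ts, 0, 0) <> 0 Then Debug.Print "Download failed: " & ss
End Sub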
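
To save onto the Desktop, as the question asks, the target path can be built from the current user's Desktop folder and the file name at the end of the URL. A minimal sketch follows, assuming the URLDownloadToFile declaration above is in the same module. SavePdfToDesktop and the SITE constant are illustrative names, and the host is taken from the link in the question.

```vba
' Sketch: turn one extracted link into an absolute URL and download it to the Desktop.
' Relies on the URLDownloadToFile declaration shown above being in this module.
Sub SavePdfToDesktop(href As String)
    Const SITE As String = "http://cetatenie.just.ro"   ' host from the question's link
    Dim url As String
    Dim target As String

    ' Root-relative links such as "/wp-content/..." need the host prepended.
    If Left$(href, 1) = "/" Then
        url = SITE & href
    Else
        url = href
    End If

    ' Desktop folder of the current user, plus the text after the last "/" as file name.
    target = CreateObject("WScript.Shell").SpecialFolders("Desktop") & _
             "\" & Mid$(url, InStrRev(url, "/") + 1)

    If URLDownloadToFile(0, url, target, 0, 0) <> 0 Then
        Debug.Print "Download failed: " & url
    End If
End Sub
```

Calling this once for each of the first five extracted links would cover the case described in the question.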