Can I use wget to download multiple files from the Linux terminal?

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you reuse or share it, you must follow the same license and attribute the original authors (not me). Original source: http://stackoverflow.com/questions/6827459/



Tags: linux, download, centos

Asked by Mirage

Suppose I have a directory accessible via HTTP, e.g.

http://www.abc.com/pdf/books

Inside the folder I have many PDF files.

Can I use something like

wget http://www.abc.com/pdf/books/*

Accepted answer by merryprankster

wget -r -l1 -A.pdf http://www.abc.com/pdf/books
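For reference, here is what each flag in that command does (these are standard wget options; the -nd variant at the end is an optional tweak, not part of the original answer):

# -r      turn on recursive retrieval
# -l1     limit recursion depth to 1 (only links found on the starting page)
# -A.pdf  accept only files whose names end in .pdf; non-matching files
#         fetched during the crawl are deleted after download
wget -r -l1 -A.pdf http://www.abc.com/pdf/books

# Optional variant: -nd (--no-directories) saves the PDFs into the current
# directory instead of recreating the remote directory tree:
wget -r -l1 -nd -A.pdf http://www.abc.com/pdf/books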

Answered by Jerry Tian

From the wget man page:

   Wget can follow links in HTML and XHTML pages and create local versions of remote web sites, fully recreating the directory structure of the original site.  This is
   sometimes referred to as ``recursive downloading.''  While doing that, Wget respects the Robot Exclusion Standard (/robots.txt).  Wget can be instructed to convert the
   links in downloaded HTML files to the local files for offline viewing.

and

   Recursive Retrieval Options
   -r
   --recursive
       Turn on recursive retrieving.

   -l depth
   --level=depth
       Specify recursion maximum depth level depth.  The default maximum depth is 5.
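Tying the quoted options together, a hypothetical invocation might look like this (the URL is the example from the question; -k is wget's --convert-links flag, which does the link rewriting for offline viewing mentioned in the excerpt above):

# Fetch the page and everything it links to, up to two levels deep,
# rewriting links in the saved HTML so the copy browses offline:
wget -r -l2 -k http://www.abc.com/pdf/books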

Answered by Soren

It depends on the web server and its configuration. Strictly speaking, a URL is not a directory path, so http://something/books/* is meaningless.

However, if the web server maps the path http://something/books to an index page listing all the books on the site, then you can play around with the recursive and spider options, and wget will happily follow any links on that index page, as in the sketch below.
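For example (a sketch, assuming http://something/books really does serve an HTML listing; the URL is the placeholder from the answer):

# --spider walks the links without saving anything, so you can first
# check whether an index page exists and what it links to:
wget --spider -r -l1 http://something/books

# Then download for real; --no-parent keeps wget from ascending
# above /books while following links:
wget -r -l1 -A.pdf --no-parent http://something/books/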