Linux wget 如何仅保存从目标页面链接的页面链接到的某些文件类型?
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/6643475/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
How can wget save only certain file types linked to from pages linked to by the target page?
提问by Nomen
How can wget save only certain file types linked to from pages linked to by the target page, regardless of the domain in which the certain files are?
wget 如何只保存从目标页面链接的页面所链接到的某些文件类型,而不管这些文件位于哪个域?
Trying to speed up a task I have to do often.
试图加快我必须经常做的任务。
I've been rooting through the wget docs and googling, but nothing seems to work. I keep on either getting just the target page or the subpages without the files (even using -H), so I'm obviously doing badly at this.
我一直在翻阅 wget 文档并用谷歌搜索,但似乎没有任何办法有效。我总是要么只得到目标页面,要么得到不含文件的子页面(即使用了 -H 也一样),所以我显然做得很差。
So, essentially, example.com/index1/ contains links to example.com/subpage1/ and example.com/subpage2/, while the subpages contain links to example2.com/file.ext and example2.com/file2.ext, etc. However, example.com/index1.html may link to example.com/index2/ which has links to more subpages I don't want.
因此,本质上,example.com/index1/ 包含指向 example.com/subpage1/ 和 example.com/subpage2/ 的链接,而这些子页面又包含指向 example2.com/file.ext、example2.com/file2.ext 等文件的链接。但是,example.com/index1.html 可能会链接到 example.com/index2/,后者又链接到更多我不想要的子页面。
Can wget even do this, and if not then what do you suggest I use? Thanks.
wget 到底能不能做到这一点?如果不能,您建议我用什么?谢谢。
回答by ssapkota
Something like this should work:
像这样的命令应该可行:
wget --accept "*.ext" --level 2 "example.com/index1/"
回答by TheKojuEffect
The following command worked for me.
下面的命令对我有用。
wget -r --accept "*.ext" --level 2 "example.com/index1/"
This needs to be done recursively, so -r should be added.
这需要递归执行,所以应该加上 -r。
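Note that the question asks for files hosted on a different domain (example2.com), which neither answer's command handles: by default wget never follows links off the starting host. A hedged sketch (the domain names, extensions, and URL here are the question's placeholders, not a tested real site) combining the accepted flags with host spanning:

```shell
# Recurse 2 levels from the index page (-r -l 2), allow following links onto
# other hosts (-H) but restrict which hosts (--domains), keep only the wanted
# extensions (-A), and avoid climbing to parent directories like
# example.com/index2/ reached via ".." links (--no-parent).
wget -r -l 2 -H \
     --domains=example.com,example2.com \
     -A "*.ext,*.ext2" \
     --no-parent \
     "http://example.com/index1/"
```

With -A, wget still downloads the intermediate HTML pages in order to extract their links, then deletes any that do not match the accept list, so only the target file types remain on disk.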