bash: multiple cURL downloads with 404 detection

Disclaimer: the content below is from StackOverflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/24966975/

Date: 2020-09-18 10:58:02  Source: igfitidea

multiple cURL downloading with 404 detection

Tags: bash, shell, curl, http-status-code-404

Asked by natehome

I'm trying to write a shell script that downloads files with cURL and detects whether a URL returns a 404 error. If a URL returns 404, I want to save the URL or filename to a text file.


The URL format is http://server.com/somefile[00-31].txt

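For context (not part of the original question): the [00-31] part is curl's own URL globbing syntax, so a single curl invocation expands it into 32 requests. The same zero-padded list can be generated in plain sh, which is what the first answer below effectively does with printf:

```shell
# The [00-31] range is curl's URL globbing: one curl call expands it into
# 32 requests. The equivalent zero-padded list can be built with seq -f:
for num in $(seq -f "%02g" 0 31); do
    echo "http://server.com/somefile${num}.txt"
done
# first line printed: http://server.com/somefile00.txt
```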

I've been experimenting with what I've found on Google and currently have the following code:


#!/bin/bash

if curl --fail -O "http://server.com/somefile[00-31].mp4"
then
  echo "$var" >> "files.txt"    
else
  echo "Saved!"
fi

Answered by denisvm

#!/bin/bash

URLFORMAT="http://server.com/somefile%02d.txt"

for num in {0..31}; do
    # build the URL and try to download it
    url=$(printf "$URLFORMAT" "$num")
    CURL=$(curl --fail -O "$url" 2>&1)

    # on error, check for a 404 and record the URL in files.txt
    if [ $? -ne 0 ]; then
        echo "$CURL" | grep --quiet 'The requested URL returned error: 404'
        [ $? -eq 0 ] && echo "$url" >> "files.txt"
    fi

done

Here is a version that takes the URLs as command-line arguments:


#!/bin/sh

for url in "$@"; do
    CURL=$(curl --fail -O "$url" 2>&1)
    # on error, check for a 404 and record the URL in files.txt
    if [ $? -ne 0 ]; then
        echo "$CURL" | grep --quiet 'The requested URL returned error: 404'
        [ $? -eq 0 ] && echo "$url" >> "files.txt"
    fi
done

You can use this like: sh download-script.sh http://server.com/files{00..21}.png http://server.com/otherfiles{00..12}.gif


The brace range expansion ({00..21}) works in bash, but not in plain sh.

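An alternative sketch (mine, not from the original answer): read the status code directly with curl's -w '%{http_code}' write-out variable instead of pattern-matching the error text printed by --fail. The logging step is split into a function so the bookkeeping can be exercised without a network:

```shell
#!/bin/sh
# Sketch: use curl's -w '%{http_code}' to obtain the status code directly,
# instead of grepping the error message that --fail prints.

log_if_404() {
    # $1 = status code, $2 = URL; record the URL when the server returned 404
    if [ "$1" = "404" ]; then
        echo "$2" >> files.txt
    fi
}

for url in "$@"; do
    # -o saves the body under the remote file name; -w prints only the status
    status=$(curl -s -o "$(basename "$url")" -w '%{http_code}' "$url")
    log_if_404 "$status" "$url"
done
```

Usage is the same as the answer's second script, e.g. sh download-script.sh http://server.com/files{00..21}.png (with the brace expansion done by bash).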

Answered by Stephen Quan

You can use the -D, --dump-header <file> option to capture all of the headers, including Content-Type and the HTTP status code. Note that the header lines may be terminated with DOS line endings (CR LF), so you may want to strip the CR character.


curl -s -D headers.txt -o out.dat "http://server.com/somefile[00-31].mp4"
httpStatus=$(head -1 headers.txt | awk '{print $2}')
contentType=$(grep "Content-Type:" headers.txt | tr -d '\r')
contentType=${contentType#*: }
if [ "$httpStatus" != "200" ]; then
    echo "FAILED - HTTP STATUS $httpStatus"
else
    echo "SAVED"
fi