Original source: http://stackoverflow.com/questions/15040132/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
How to wget the more recent file of a directory
Asked by ECII
I would like to write a bash script that downloads and installs the latest daily build of a program (RStudio). Is it possible to make wget download only the most recent file in the directory http://www.rstudio.org/download/daily/desktop/?
Answered by Richard Pump
The files seem to be sorted by the release date, with each new release being a new entry with a new name reflecting the version number change, so checking timestamps of a certain file seems unnecessary.
Also, the link you provided points to a "directory", which is essentially a web page. AFAIK, there is no such thing as a directory in HTTP (which is a communication protocol that serves you data at the given address). What you see is a listing generated by the server that resembles Windows folders for ease of use, but it is still a web page.
That said, you can scrape that web page. The following code downloads the file in the first position on the listing (assuming the first one is the most recent one):
#!/bin/bash
wget -q -O tmp.html http://www.rstudio.org/download/daily/desktop/ubuntu64/
# Take the first https link that ends in amd64.deb.
RELEASE_URL=$(grep -m 1 -o -E 'https[^<>]*amd64\.deb' tmp.html | head -n 1)
rm tmp.html
# TODO Check if the old package name is the same as in RELEASE_URL.
# If not, then get the new version.
wget -q "$RELEASE_URL"
Now you can check it against your local most-recent version, and install if necessary.
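The link-extraction step can be tried offline before pointing it at the real server. The following is only a sketch: the function name first_deb_url and the sample listing (markup and example.com URLs) are invented for illustration, and the real page may be laid out differently.

```shell
#!/bin/bash
# Print the first https link ending in .deb found in HTML read from stdin.
# (first_deb_url is a hypothetical helper name, not part of the original answer.)
first_deb_url() {
    grep -o -E 'https://[^"<> ]*\.deb' | head -n 1
}

# Sample listing; the markup and URLs are made up for this example.
sample_listing='<a href="https://example.com/rstudio-0.98.2-amd64.deb">rstudio-0.98.2-amd64.deb</a>
<a href="https://example.com/rstudio-0.98.1-amd64.deb">rstudio-0.98.1-amd64.deb</a>'

# Prints the link from the first row of the listing.
printf '%s\n' "$sample_listing" | first_deb_url
```

As in the answer, this relies on the newest build being listed first; if the server ever changes the sort order, the result will no longer be the latest file.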
EDIT: Updated version, which does simple version checking and installs the package.
#!/bin/bash
MY_PATH=$(dirname "$0")
RES_DIR="$MY_PATH/res"
# Piping from stdout suggested by Chirlo.
RELEASE_URL=$(wget -q -O - http://www.rstudio.org/download/daily/desktop/ubuntu64/ | grep -m 1 -o "https[^\']*")
if [ -z "$RELEASE_URL" ]; then
    echo "Package index not found. Maybe the server is down?"
    exit 1
fi
mkdir -p "$RES_DIR"
# Strip everything up to the last slash to get the package file name.
NEW_PACKAGE=${RELEASE_URL##https*/}
OLD_PACKAGE=$(ls "$RES_DIR")
if [ -z "$OLD_PACKAGE" ] || [ "$OLD_PACKAGE" != "$NEW_PACKAGE" ]; then
    cd "$RES_DIR" || exit 1
    # Remove the previously downloaded package, if there is one.
    [ -n "$OLD_PACKAGE" ] && rm -f -- "$OLD_PACKAGE"
    echo "New version found. Downloading..."
    wget -q "$RELEASE_URL"
    if [ ! -e "$NEW_PACKAGE" ]; then
        echo "Package not found."
        exit 1
    fi
    echo "Installing..."
    sudo dpkg -i "$NEW_PACKAGE"
else
    echo "rstudio up to date."
fi
And a couple of comments:
- The script keeps a local res/ dir with the latest version (exactly one file) and compares its name with the newly scraped package name. This is dirty (having a file doesn't mean that it has been successfully installed in the past). It would be better to parse the output of dpkg -l, but the name of the package might differ slightly from the scraped one.
- You will still need to enter the password for sudo, so it won't be 100% automatic. There are a few ways around this, though without supervision you might encounter the previously stated problem.
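The dpkg-based check mentioned in the first comment could look roughly like the sketch below. It rests on assumptions that are not confirmed by the answer: that the file is named rstudio-<version>-amd64.deb, that the installed package is called rstudio, and that the version in the file name is comparable to the one dpkg reports. The helper names version_from_name and needs_install are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: extract "<version>" from "rstudio-<version>-amd64.deb".
# The file name layout is an assumption, not something the server guarantees.
version_from_name() {
    local name=${1##*/}        # drop any leading path or URL
    name=${name#rstudio-}      # drop the assumed prefix
    name=${name%-amd64.deb}    # drop the assumed suffix
    printf '%s\n' "$name"
}

# Succeeds (exit status 0) when the given version is newer than the installed
# one, or when no "rstudio" package (assumed name) is installed at all.
needs_install() {
    local new_version=$1 installed
    installed=$(dpkg-query -W -f='${Version}' rstudio 2>/dev/null) || return 0
    [ -z "$installed" ] && return 0
    dpkg --compare-versions "$new_version" gt "$installed"
}
```

A caller might then run something like: if needs_install "$(version_from_name "$RELEASE_URL")"; then ... download and install ...; fi. This asks dpkg what is actually installed, so it no longer depends on a leftover file in res/.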
Answered by Chirlo
A slightly cleaner variation of @Richard Pump's:
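RELEASE_URL=$(wget -q -O - http://www.rstudio.org/download/daily/desktop/ubuntu64 | grep -o -m 1 "https[^\']*" )
# check version from name ...
wget "${RELEASE_URL}"
This avoids creating a tmp file by writing the HTML to stdout and filtering it.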
Answered by L0j1k
The -N option tells wget to fetch a file only if the remote copy is newer than the local one. However, using wget alone, you cannot do something as broad as downloading the newest of all the files in some remote directory. You'll need to write a bash script, or something similar, that does the checking and then calls wget to grab the file.
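For a file whose URL stays the same between releases, the -N behaviour can be wrapped as in the sketch below. The function name fetch_if_newer and the WGET override variable are made up for illustration (the override only exists so the command line can be inspected without network access); -N (timestamping) and -P (directory prefix) are standard wget options.

```shell
#!/bin/bash
# Hypothetical wrapper: download URL into DIR only when the remote copy is
# newer than the local one (wget -N compares timestamps).
# Setting WGET to another command (e.g. echo) shows what would be run.
fetch_if_newer() {
    local url=$1 dir=$2
    "${WGET:-wget}" -N -P "$dir" "$url"
}

# Example call (the URL is a placeholder, not a real download):
# fetch_if_newer http://example.com/builds/latest.deb ./res
```

This does not help with the daily builds in the question, because each build gets a new file name, so there is never an older local copy of the same file to compare against.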

