Shell script in bash on Linux to download a file from an FTP server

Notice: this page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/10099540/

Time: 2020-08-06 05:42:27  Source: igfitidea

shell script in bash to download file from ftp server

linux bash shell ftp download

Asked by puneet

I have to write a bash shell script to transfer a file from a given FTP server:
ftp server -- [email protected]
user user1
password pass1

Now, in /dir1/dir2 on the FTP server, I have folders named like this:
0.7.1.70
0.7.1.71
0.7.1.72

I have to copy the file "file1.iso" from the latest folder, i.e. 0.7.1.72 in this case. I also have to check the integrity of the file while copying: if the file is still being uploaded to the server when I start copying, the copy will be incomplete.

I have to do this every 4 hours, which can be done by making it a cron job. Please help.

This is what I have done so far. I mounted the FTP server folder on my local machine. To check whether the file has been completely uploaded, I check its size every 50 seconds, 5 times; if the size stays the same I copy the file, otherwise the script runs again after 4 hours. I also maintain a text file, "foldernames.txt", which holds the names of all the folders from which I have already copied the required file, so I can tell whether a new folder has been added on the server by looking for its name in foldernames.txt.

Everything is working fine. The only problem now is this: suppose the file is being downloaded and there is a network failure at that moment. How can I make sure that I have downloaded the file completely? I tried md5sum and cksum, but they took too long to compute on the mounted folder. Please help.

Here is my script:

#!/bin/bash
#
# changing the directory to source location 
echo " ########### " >> /tempdir/pvmscript/scriptlog.log
echo `date`>> /tempdir/pvmscript/scriptlog.log
echo " script is starting " >> /tempdir/pvmscript/scriptlog.log
cd /var/mountpt/pvm-vmware
#
# array to hold the name of last five folders of the source location
declare -a arr
i=0
for folder in `ls -1 | tail -5 `; do
arr[i]=$folder
#echo $folder
i=$((i+1))
done
echo " array initialised " >> /tempdir/pvmscript/scriptlog.log
#
#now for these 5 folders we will check if their name is present in the list of copied         
#  folder names
#
echo " checking for the folder name in list " >> /tempdir/pvmscript/scriptlog.log
## $(seq $((i-1)) -1 0 
for j in $(seq $((i-1)) -1 0  ) ; do
var3=${arr[$j]}
#var4=${var3//./}
echo " ----------------------------------------" >>  /tempdir/pvmscript/scriptlog.log
echo " the folder name is $var3" >> /tempdir/pvmscript/scriptlog.log
#
# checking if the folder name is present in the stored list of folder names or not
#
#
foldercheck=$(grep $var3 /tempdir/pvmscript/foldernames.txt | wc -l)
#
if test $foldercheck -eq 1
then 
echo " the folder $var3 is present in the list so will not copy it " >>  /tempdir/pvmscript/scriptlog.log
foldercheck=" "
continue
else
#
echo " folder $var3 is not present in the list so checking if it has the debug.iso file ">> /tempdir/pvmscript/scriptlog.log
#enter inside  the new folder in source
#
cd  /var/mountpt/pvm-vmware/$var3
#
# writing the names of content of folder to a temporary text file
#
ls -1 > /var/temporary.txt
#checking if the debug.iso is present in the given folder
var5=$(grep debug.iso /var/temporary.txt | wc -l)
var6=$(grep debug.iso /var/temporary.txt)
#
check1="true"
#
# if the file is present then checking if it is completely uploaded or not  
#
rm -f /var/temporary.txt
if test $var5 -eq 1 
then 
echo " it has the debug.iso checking if upload is complete   ">>/tempdir/pvmscript/scriptlog.log
#
# getting the size of the file; we are checking whether the size of the file
# is constant or changing after a regular interval
#
var7=$(du -s ./$var6 |cut -f 1 -d '.')
#echo " size of the file is $var7"
sleep 50s
#
# checking for 5 times at a regular interval of 50 sec if size changing or not 
#
#
for x in 1 2 3 4 5 ;do
var8=$(du -s ./$var6 |cut -f 1 -d '.')
#
#if size is changing exit and check it after 4 hrs when the script will rerun
#echo " size of the file $x is $var7"
if test $var7 -ne $var8
then
check1="false"
echo " file is still in the process of being uploaded so exiting, will check after 4 hr  " >> /tempdir/pvmscript/scriptlog.log
break
fi
sleep 50s
done
#
#if the size was constant copy the file to destination
#
if test $check1 = "true" 
then
echo " upload was complete so copying the debug.iso file  " >>  /tempdir/pvmscript/scriptlog.log
cp $var6 /tempdir/PVM_Builds/ 
echo " writing the folder name to the list of folders which we have copied " >>  /tempdir/pvmscript/scriptlog.log
echo $var3 >> /tempdir/pvmscript/foldernames.txt
echo " copying is complete  " >> /tempdir/pvmscript/scriptlog.log
fi
#else 
#echo $foldercheck >> /vmfs/volumes/Storage1/PVM_Builds/foldernames.txt
else
echo " it does not have the debug.iso file so leaving the directory "  >>/tempdir/pvmscript/scriptlog.log
echo $var3 >> /tempdir/pvmscript/foldernames.txt
echo 
fi
#rm -f /var/temporary.txt
fi
done

Answered by Jim Garrison

The FTP protocol is not robust enough. It does not deal with atomicity, and there is no way to know whether a file is still being uploaded while you download it. If you need this functionality, you should investigate using rsync for both downloading AND uploading.
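
As a rough illustration of the download side, here is a minimal sketch. It assumes the server is reachable by rsync (for example over SSH), which the question does not state; the function name `fetch_with_rsync` and the example paths are hypothetical. By default rsync writes incoming data to a temporary file and renames it into place only once the transfer has finished, so an interrupted download should not leave a truncated file under the final name.

```shell
#!/bin/bash
# fetch_with_rsync SOURCE DEST_DIR
# Hypothetical helper: copy SOURCE into DEST_DIR with rsync.
# --partial-dir keeps the data of an interrupted transfer in a separate
# directory, so the final file name only appears after a complete transfer.
fetch_with_rsync() {
    local src=$1 dest_dir=$2
    rsync -a --partial-dir=.rsync-partial "$src" "$dest_dir"
}

# Assumed usage (host, user and paths are taken from the question and may
# not match a real setup):
# fetch_with_rsync user1@hostname:/dir1/dir2/0.7.1.72/file1.iso /tempdir/PVM_Builds/
```

This only covers the download; for the upload to be equally safe, the uploader would have to use rsync as well, as the answer says.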

Answered by glenn Hymanman

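#!/bin/sh
# Use a lock directory so that two runs cannot overlap: mkdir is atomic
# and fails if the directory already exists.
if ! mkdir /tmp/download_in_process 2>/dev/null; then
    echo "cannot start, download in process"
    exit 1
fi

# -n disables auto-login so that the "user" command below is honoured.
# nlist prints bare names, so tail -1 yields just the last folder name.
latest=$(ftp -n hostname << END1 | tail -1
user user1 pass1
cd /dir1/dir2
nlist
END1
)

# binary avoids line-ending translation, which would corrupt an ISO image.
ftp -n hostname << END2
user user1 pass1
binary
cd /dir1/dir2/$latest
get file1.iso
END2

rmdir /tmp/download_in_process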
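
The script above takes the last line of the listing, which relies on the server returning the folders in ascending order. A minimal sketch of sorting the names explicitly follows; it assumes every folder name has exactly four dot-separated numeric parts, as in the question, and the function name `latest_version` is made up for illustration.

```shell
#!/bin/bash
# latest_version: read folder names on stdin, one per line, and print the
# highest one, comparing each of the four dot-separated fields numerically.
latest_version() {
    sort -t . -k1,1n -k2,2n -k3,3n -k4,4n | tail -n 1
}

# A plain lexical sort would put 0.7.1.9 after 0.7.1.100; this does not.
printf '0.7.1.9\n0.7.1.100\n0.7.1.72\n' | latest_version   # prints 0.7.1.100
```

In the script, the listing could be piped through such a function instead of straight into tail.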

Answered by shellter

Some comments and requests for clarification here; see below the break for one possible answer.

(Nice job updating your question.)

How big are these files?

Do you have any control over the start time of these files' creation (database backups, for example)?

It would also help to have a few more details about these files, i.e. their size (MB, GB, TB, PB?) and the source that creates them (db-backup, or ???).

Are your concerns theoretical, proactive explorations of worst-case scenarios? Or, if you have real problems, how often do they occur and what are the consequences?

Is your SLA an unrealistic/unattainable management pipe dream? If so, then you have to start creating documentation to show that the current system will require X amount of additional resources (people, hardware, programming, etc.) to correct deficiencies in your system.



If the files being transferred are data files created by a source system, one technique is to have the source system create a 'flag' file that is sent after the main file is sent.

It could contain details like

  filename : TradeData_2012-04-13.dat
  recCount : 777777
  fileSize : 37604730291
  workOfDate: 2012-04-12
  md5sum    : ....

So now your system waits until it finds that the flag file has been delivered. It can find it because you are using a standard naming convention for each file that you receive, with a standard date-stamp embedded in the file name. When the file arrives, your script calculates each relevant detail and compares it to the value stored in the flag file.
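
The comparison step might be sketched as follows. This is only an illustration: it assumes the flag file uses the "key : value" layout shown above, checks just the fileSize and md5sum entries, and relies on the `md5sum` tool being available. The function names `flag_value` and `verify_against_flag` are hypothetical.

```shell
#!/bin/bash
# flag_value KEY FLAG_FILE
# Print the value stored for KEY in a "key : value" flag file.
flag_value() {
    awk -F':' -v key="$1" '
        { k = $1; gsub(/[ \t]/, "", k) }
        k == key { v = $2; gsub(/^[ \t]+|[ \t]+$/, "", v); print v; exit }
    ' "$2"
}

# verify_against_flag DATA_FILE FLAG_FILE
# Succeed only if the size and MD5 checksum of DATA_FILE match the
# fileSize and md5sum entries recorded in FLAG_FILE.
verify_against_flag() {
    local data=$1 flag=$2
    local want_size want_md5 got_size got_md5

    want_size=$(flag_value fileSize "$flag")
    want_md5=$(flag_value md5sum "$flag")
    got_size=$(wc -c < "$data" | tr -d ' ')
    got_md5=$(md5sum < "$data" | cut -d ' ' -f 1)

    [ "$got_size" = "$want_size" ] && [ "$got_md5" = "$want_md5" ]
}
```

Because the checksum is computed on the local copy after the download, it does not have to be calculated over the mounted remote folder.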

If you can't arrange this level of detail, then at least a generic flag file, per day per file OR per daily batch of files (sent when all files are done), could be followed by tests that check the new files against a set of criteria that make sense for your particular situation, such as some of the following:

  • file must be at least X big
  • file must be at least N records
  • file can never be smaller than yesterday's file
  • etc

Then your defense is "we don't have complete control over the files, but we checked them for X,Y,Z and it passed those tests, that is why we loaded them".
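
The checks listed above might be sketched like this. It is a rough example under stated assumptions: a "record" is taken to be one line, the thresholds are supplied by the caller, and the function name `passes_sanity_checks` is invented for illustration.

```shell
#!/bin/bash
# passes_sanity_checks FILE MIN_BYTES MIN_RECORDS [PREVIOUS_FILE]
# Succeed only if FILE exists, has at least MIN_BYTES bytes and MIN_RECORDS
# lines, and (when PREVIOUS_FILE is given) is not smaller than that file.
passes_sanity_checks() {
    local file=$1 min_bytes=$2 min_records=$3 previous=${4:-}
    local bytes records prev_bytes

    [ -f "$file" ] || return 1
    bytes=$(wc -c < "$file" | tr -d ' ')
    records=$(wc -l < "$file" | tr -d ' ')

    [ "$bytes" -ge "$min_bytes" ] || return 1
    [ "$records" -ge "$min_records" ] || return 1

    if [ -n "$previous" ] && [ -f "$previous" ]; then
        prev_bytes=$(wc -c < "$previous" | tr -d ' ')
        [ "$bytes" -ge "$prev_bytes" ] || return 1
    fi
    return 0
}
```

A loader script could call this before processing a file and log which files were rejected.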



While rsync could be good, I don't see how, given some of the scenarios mentioned, you'd ever be sure that it was safe to start loading the file, as rsync might start adding more data to the file.



Reading through your script, if you can't get a detailed flag file from your source, you're on the right track. Glenn Hymanman's solution looks to accomplish the same goal with less code. You could put that inside a script file, 'getRemotedata.sh' or similar, and put it in a while loop that only exits when 'getRemotedata.sh' exits with success. I guess I would want some type of notification that it has spent 3*normalTime running. But it can get very complex when you try to cover all conditions. There are 3rd party tools that can manage file downloads, but we never had the budget to buy them, so I can't recommend any.
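
The retry loop could look roughly like the sketch below. It assumes bash (for the SECONDS variable); the function name `retry_until_success`, the pause between attempts, and the "normal" run time are all placeholders to be chosen by whoever runs it, and the notification is just a message on stderr.

```shell
#!/bin/bash
# retry_until_success NORMAL_SECS PAUSE_SECS COMMAND [ARGS...]
# Run COMMAND repeatedly until it exits with success. Once the total elapsed
# time reaches 3 * NORMAL_SECS, print a single warning to stderr.
retry_until_success() {
    local normal_secs=$1 pause_secs=$2
    shift 2
    local start=$SECONDS warned=0

    until "$@"; do
        if [ "$warned" -eq 0 ] && [ $((SECONDS - start)) -ge $((3 * normal_secs)) ]; then
            echo "warning: '$*' has been retrying for over 3x its normal run time" >&2
            warned=1
        fi
        sleep "$pause_secs"
    done
}

# Assumed usage, with a 10-minute normal run time and a 5-minute pause:
# retry_until_success 600 300 ./getRemotedata.sh
```

In practice the warning would probably be replaced by a mail or pager notification, and an upper limit on attempts may be worth adding so that the loop cannot run forever.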

whew

I hope this helps.



P.S. Welcome to StackOverflow (S.O.). Please remember to read the FAQs, http://tinyurl.com/2vycnvr, vote for good Q/A by using the gray triangles, http://i.imgur.com/kygEP.png, and accept the answer that best solves your problem, if any, by pressing the checkmark sign, http://i.imgur.com/uqJeW.png
