bash script to watch a folder

Note: this page is a translation of a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/1769034/

Time: 2020-09-17 21:25:07  Source: igfitidea

linux, bash

Asked by prabhu

I have the following situation:

A Windows folder has been mounted on a Linux machine. There could be multiple folders (set up beforehand) in this Windows mount. I have to do something (preferably a script to start with) to watch these folders.

These are the steps:

1. Watch for any incoming file(s).
2. Make sure they are transferred completely.
3. Move them to another folder.

I do not have any control over the file transfer program on the Windows machine. I believe it is a secure FTP, so I cannot ask that process to send me a trailer file to signal that the transfer is complete.

I have written a bash script. I would like to know about any potential pitfalls with this approach. The reason is that multiple copies of this script may be running at once, one for each of several directories like this.

At the moment, there could be up to 100 directories that may have to be monitored.

Following is the script. I'm sorry for pasting a very long one here. Please take your time to review it and comment / criticize it. :-)

It takes 3 parameters: the folder that has to be watched, the folder the file has to be moved to, and a time interval, which is explained below.

I'm sorry, there seems to be a problem with the alignment. Markdown doesn't seem to like it. I tried to organize it properly, but was not able to do so.

Linux servername 2.6.9-42.ELsmp #1 SMP Wed Jul 12 23:27:17 EDT 2006 i686 i686 i386 GNU/Linux

#!/bin/bash
log_this()
{
    message="$1"
    now=`date "+%D-%T"`
    echo $$": "$now ": " $message
}
usage()
{
    cat << EOF
Usage: $0 <Directory to be watched> <Directory to transfer> <time interval>
Time interval is the amount of time after which the modification time of a file will be monitored.
EOF
    exit 1
}

if [ $# -lt 2 ]
then
    usage
fi

WATCH_DIR=$1
APP_DIR=$2

if [ ! -d "$WATCH_DIR" ]
then
    log_this "FATAL: WATCH_DIR, $WATCH_DIR does not exist. Exiting"
    exit 1
fi

if [ ! -d "$APP_DIR" ]
then
    log_this "APP_DIR: $APP_DIR does not exist. Exiting"
    exit 1
fi

# This needs to be set after considering the rate of file transfer.
# Represents the seconds elapsed after the last modification to the file.
# If not supplied as parameter, defaults to 3.
seconds_between_mods=$3

if ! [[ "$seconds_between_mods" =~ ^[0-9]+$ ]]; then
    if [ ${#seconds_between_mods} -eq 0 ]; then
        log_this "No value supplied for elapse time. Defaulting to 3."
        seconds_between_mods=3
    else
        log_this "Invalid value provided for elapse time"
        exit 1
    fi
fi

log_this "Start Monitor."

while true
do
    ls -1 "$WATCH_DIR" | while read file_name
    do
        log_this "Start Monitoring for $file_name"

        # Refer only the modification with reference to the mount folder.
        # If there is a diff in time between servers, we are in trouble.
        token_file="$WATCH_DIR/foo.$$"
        current_time=`touch "$token_file" && stat -c "%Y" "$token_file"`
        rm -f "$token_file" 2>/dev/null

        log_this "Current Time: $current_time"
        last_mod_time=`stat -c "%Y" "$WATCH_DIR/$file_name"`

        elapsed_time=`expr $current_time - $last_mod_time`
        log_this "Elapsed time ==> $elapsed_time"

        if [ $elapsed_time -ge $seconds_between_mods ]
        then
            log_this "Moving $file_name to $APP_DIR"

            # In case if there is no space left on the target mount, hide the file
            # in the mount itself and remove the incomplete file from APP_DIR.
            mv "$WATCH_DIR/$file_name" "$APP_DIR"
            if [ $? -ne 0 ]
            then
                log_this "FATAL: mv failed!! Hiding $file_name"
                rm "$APP_DIR/$file_name"
                mv "$WATCH_DIR/$file_name" "$WATCH_DIR/.$file_name"
                log_this "Removed $APP_DIR/$file_name. Look for $WATCH_DIR/.$file_name and submit later."
            fi

            log_this "End Monitoring for $file_name"
        else
            log_this "$file_name: Transfer seems to be in progress"
        fi
    done
    log_this "Nothing more to monitor."
    echo
    sleep 5
done

Answered by Aaron Digulla

This isn't going to work for any length of time. In production, you will have network problems and other errors which can leave a partial file in the upload directory. I also don't like the idea of a "trailer" file. The usual approach is to upload the file under a temporary name and then rename it after the upload completes.

This way, you just have to list the directory, filter out the temporary names, and if there is anything left, use it.
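
The listing step described above could look like the following sketch. It assumes, purely for illustration, that uploads in progress carry a `.part` suffix and are renamed when they complete; the suffix and the function name `list_completed` are hypothetical and not part of the original answer.

```bash
#!/bin/bash
# Sketch only: assumes the uploader writes "name.part" and renames it to
# "name" when the transfer completes. The ".part" suffix is an assumption.

# Print every regular file in directory $1 that does not end in ".part".
list_completed() {
    local dir="$1" f
    for f in "$dir"/*; do
        [ -f "$f" ] || continue      # skip directories and an unmatched glob
        case "$f" in
            *.part) continue ;;      # still uploading under its temporary name
        esac
        printf '%s\n' "$f"
    done
}
```

Something like `list_completed /mnt/incoming | while IFS= read -r f; do mv -- "$f" /data/app/; done` would then move whatever is left; both paths are placeholders.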

If you can't make this change, then ask your boss for written permission to implement something that can lead to arbitrary data corruption. This serves two purposes: 1) to make them understand that this is a real problem and not something you made up, and 2) to protect yourself when it breaks... because it will, and guess who'll get all the blame?

Answered by lorenzog

I believe a much saner approach would be to use a kernel-level filesystem notification mechanism, such as inotify. Also get the tools here.
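
As a rough sketch of what an inotify-based watcher might look like, the functions below use `inotifywait` from inotify-tools to react when a file is closed after writing or renamed into the watched directory. The function names are hypothetical. One caveat worth checking first: inotify only reports events triggered through the local filesystem API, so writes made by the remote side of a network mount, as in the question, may not produce any events.

```bash
#!/bin/bash
# Sketch only: requires inotifywait from inotify-tools.

# Move one finished file into the target directory; report failures on stderr.
move_finished() {
    local file="$1" target="$2"
    mv -- "$file" "$target"/ || echo "mv failed for $file" >&2
}

# Watch $1 and move each file to $2 once its writer closes it (close_write)
# or it is renamed into the directory (moved_to).
watch_and_move() {
    local watch_dir="$1" target="$2" file
    inotifywait -m -q -e close_write -e moved_to --format '%w%f' "$watch_dir" |
    while IFS= read -r file; do
        move_finished "$file" "$target"
    done
}
```

Calling `watch_and_move "$WATCH_DIR" "$APP_DIR"` would block and process events until interrupted.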

Answered by Janus Troelsen

incron is an "inotify cron" system. It consists of a daemon and a table manipulator. You can use it in a similar way to the regular cron. The difference is that the inotify cron handles filesystem events rather than time periods.
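
To make this concrete, here is a sketch of a handler script that incron could call, with a matching table entry shown in the leading comment. All paths and the script name are hypothetical. Skipping dot-files is an assumption borrowed from the question's script, which hides failed files under a leading dot.

```bash
#!/bin/bash
# Sketch of a handler for incron. A matching incrontab line (edit with
# `incrontab -e`) could be, with hypothetical paths:
#
#   /mnt/incoming IN_CLOSE_WRITE,IN_MOVED_TO /usr/local/bin/incron-move.sh $@/$# /data/app
#
# incron expands $@ to the watched directory and $# to the file name.

# Move the file named by the event into the target directory.
# Hidden files (leading dot) are left alone.
handle_incron_event() {
    local file="$1" target="$2"
    case "${file##*/}" in
        .*) return 0 ;;
    esac
    [ -f "$file" ] || return 0       # already gone or not a regular file
    mv -- "$file" "$target"/
}

# Run only when called with both arguments, as incron would.
if [ "$#" -ge 2 ]; then
    handle_incron_event "$1" "$2"
fi
```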

Answered by Evi1M4chine

First make sure inotify-tools is installed.

Then use them like this:

logOfChanges="/tmp/changes.log.csv" # Set your file name here.

# Lock and load
inotifywait -mrcq "$DIR" > "$logOfChanges" & # monitor, recursively, output CSV, be quiet.
IN_PID=$! # PID of the background inotifywait

# Do your stuff here
...

# Kill and analyze
kill $IN_PID
while IFS= read -r entry; do
   # Split your CSV, but beware that file names may contain spaces too.
   # Just look up how to parse CSV with bash. :)
   path=...
   event=...
   ...  # Other stuff like time stamps
   # Depending on the event…
   case "$event" in
     SOME_EVENT) myHandlingCode "$path" ;;
     ...
     *) myDefaultHandlingCode "$path" ;;
   esac
done < "$logOfChanges"

Alternatively, using --format instead of -c on inotifywait would be an idea.
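
A sketch of the `--format` idea: emitting `EVENTS|path` lines sidesteps CSV parsing, since everything after the first separator can be taken as the path, spaces included. The `|` separator and the function names `parse_entry` and `monitor` are assumptions made for this example.

```bash
#!/bin/bash
# Sketch only: a custom --format avoids CSV parsing. '|' is used as the
# separator on the assumption that it does not occur in the event names.

# Split one "EVENTS|path" line into the variables event and path.
parse_entry() {
    local entry="$1"
    event="${entry%%|*}"     # text before the first '|'
    path="${entry#*|}"       # everything after it, spaces included
}

# Print one "EVENTS|path" line per event under directory $1
# (requires inotifywait from inotify-tools).
monitor() {
    inotifywait -mrq --format '%e|%w%f' "$1"
}
```

One way to use it is `monitor "$DIR" > "$logOfChanges" &`, followed later by a `while IFS= read -r entry` loop that calls `parse_entry "$entry"` before the `case "$event"` dispatch.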

Just run man inotifywait and man inotifywatch for more info.

Answered by Recursion

To be honest, a Python app set up to run at start-up will do this quickly and efficiently. Python has amazing OS support and it's rather complete.

Running the script will likely work, but it will be troublesome to maintain and manage. I take it you will run these as frequent cron jobs?

Answered by Recursion

To get you on your feet, here is a small app I wrote which takes a path and looks at the binary output of JPEG files. I never quite finished it, but it will get you started and let you see the structure of Python as well as some use of os.

I wouldn't spend too much time worrying about my code.

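import os
import shutil

# analyze() takes a path, looks in its output_files folder, and reads each file there.

def analyze(path):
    output_dir = os.path.join(path, "output_files")
    list_outputfiles = os.listdir(output_dir)
    print(list_outputfiles)
    for name in list_outputfiles:
        # Open in binary mode: the files hold raw media bytes, not text.
        with open(os.path.join(output_dir, name), 'rb') as f:
            f.readlines()

# txtmaker() copies the media file's binary contents to output_files/<name>.txt.

def txtmaker(c_file):
    print(c_file)
    shutil.copyfile(c_file, os.path.join("output_files", c_file + ".txt"))

# parser() takes the entered path, creates the output directory, lists all files, then calls txtmaker().

def parser(path):
    os.chdir(path)
    os.makedirs("output_files", mode=0o777, exist_ok=True)
    for name in os.listdir(path):
        if os.path.isdir(name):
            print(name, "is a directory")
        else:
            txtmaker(name)
    analyze(path)

def main():
    path = input("Enter the full path to the media: ")
    parser(path)


if __name__ == '__main__':
    main()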