bash Linux: Removing files that don't contain all the words specified

Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/614654/

Time: 2020-09-17 20:43:16  Source: igfitidea

Linux: Removing files that don't contain all the words specified

Tags: linux, bash, file, shell

Asked by Daniel

Inside a directory, how can I delete the files that lack any of the specified words, so that only the files that contain ALL the words are left? I tried to write a simple bash shell script using the grep and rm commands, but I got lost. I am totally new to Linux; any help would be appreciated.

Answered by toolkit

How about:

grep -L foo *.txt | xargs rm
grep -L bar *.txt | xargs rm

If a file does not contain foo, then the first line will remove it.

If a file does not contain bar, then the second line will remove it.

Only files containing both foo and bar should be left.

-L, --files-without-match
     Suppress normal output; instead print the  name  of  each  input
     file from which no output would normally have been printed.  The
     scanning will stop on the first match.
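
A quick way to check the two commands is to run them in a scratch directory. The following is only a sketch: the directory and the file names (both.txt, only_foo.txt, only_bar.txt) are made up for illustration.

```shell
#!/bin/bash
# Scratch directory with three hypothetical files.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
printf 'foo\nbar\n' > both.txt
printf 'foo\n' > only_foo.txt
printf 'bar\n' > only_bar.txt

grep -L foo *.txt | xargs rm   # only_bar.txt lacks foo
grep -L bar *.txt | xargs rm   # only_foo.txt lacks bar
ls                             # prints: both.txt
```

If no file lacks a given word, grep -L prints nothing and GNU xargs still runs rm once with no arguments, which only produces an error message.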

See also @Mykola Golubyev's post for placing this in a loop.

Answered by Mykola Golubyev

list="Word1 Word2 Word3 Word4 Word5"
for word in $list
do
    # Remove every .txt file that does not contain the current word
    grep -L "$word" *.txt | xargs rm
done
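
If file names may contain spaces, or some word may be present in every file, the loop can be made more defensive. The following is only a sketch, assuming GNU grep and GNU xargs (-Z, -0 and -r are GNU options); the function name keep_files_with_all_words and the sample files are made up for illustration. Note that -F treats each word as a fixed string rather than a regular expression.

```shell
#!/bin/bash
# Hypothetical helper: delete every *.txt file in the current directory
# that lacks at least one of the words given as arguments.
keep_files_with_all_words() {
    local word
    for word in "$@"; do
        # -L: list files without a match, -F: fixed string,
        # -Z: terminate each file name with a NUL byte (GNU grep).
        # xargs -0 reads NUL-separated names; -r skips rm on empty input.
        grep -LFZ -- "$word" *.txt | xargs -0 -r rm --
    done
}

# Demo in a scratch directory with made-up file names.
loop_dir=$(mktemp -d)
cd "$loop_dir" || exit 1
printf 'Word1 Word2\n' > 'has both.txt'
printf 'Word1\n' > 'only one.txt'
keep_files_with_all_words Word1 Word2
ls   # prints: has both.txt
```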

Answered by soulmerge

In addition to the answers above: use the newline character as the delimiter to handle file names with spaces!

grep -L $word $file | xargs -d '\n' rm
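
To see the effect of the delimiter, the command can be tried on file names that contain a space. This is a sketch with invented file names; -d is a GNU xargs option, and names that themselves contain a newline would still be split.

```shell
#!/bin/bash
# Hypothetical file names, both containing a space.
space_dir=$(mktemp -d)
cd "$space_dir" || exit 1
printf 'foo\n' > 'keep me.txt'
printf 'bar\n' > 'drop me.txt'

# Plain "xargs rm" would split 'drop me.txt' into "drop" and "me.txt".
# With -d '\n' each whole line is passed to rm as one argument.
grep -L foo *.txt | xargs -d '\n' rm
ls   # prints: keep me.txt
```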

Answered by user65636

grep -L word * | xargs rm

Answered by Andy

To do the same by matching file names (rather than the contents of the files, as most of the solutions above do), you can use the following:

for file in `ls --color=never | grep -ve "\(foo\|bar\)"`
do
   rm $file
done

As per the comments:

for file in `ls`

shouldn't be used. The following does the same thing without using ls:

for file in *
do
  # grep -q succeeds when the name contains test1 or test3; remove the file otherwise
  if ! echo "$file" | grep -qe "\(test1\|test3\)"; then
    rm "$file"
  fi
done

In the first loop, -v inverts the match and -e introduces the pattern, so grep prints only the file names that contain neither foo nor bar. Any further words to be added to the pattern need to be separated by \|, e.g. one\|two\|three
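
For matching on file names, the shell's own pattern matching can replace both ls and grep. A minimal sketch, assuming bash and the same foo/bar words; the scratch directory and file names are invented for the demo.

```shell
#!/bin/bash
# Scratch directory with three hypothetical file names.
name_dir=$(mktemp -d)
cd "$name_dir" || exit 1
touch foo_notes bar_log other_file

for file in *; do
    case $file in
        *foo*|*bar*) ;;          # name contains foo or bar: keep it
        *) rm -- "$file" ;;      # any other name is removed
    esac
done
ls   # prints bar_log and foo_notes, one per line
```

Further words are added as extra alternatives in the case pattern, e.g. *foo*|*bar*|*baz*.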

Answered by Dimitre Radoulov

You could try something like this, but it may break if the patterns contain shell or grep meta characters:

(in this example, one, two and three are the patterns)

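for f in *; do
  cmd=true   # neutral tail, so the generated chain ends in "&& true"
  for p in one two three; do
    # grep -F is the non-deprecated spelling of fgrep
    cmd="grep -F \"$p\" \"$f\" && $cmd"
  done
  eval "$cmd" >/dev/null || rm "$f"
done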
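
The same check can be written without eval, which sidesteps the shell meta character problem. This is only a sketch: the helper name contains_all and the sample files are made up, and -F makes grep treat each pattern as a fixed string.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if file $1 contains every remaining argument.
contains_all() {
    local file=$1 p
    shift
    for p in "$@"; do
        grep -qF -- "$p" "$file" || return 1   # first missing pattern decides
    done
    return 0
}

# Demo in a scratch directory with invented files.
eval_dir=$(mktemp -d)
cd "$eval_dir" || exit 1
printf 'one two three\n' > all_three
printf 'one two\n' > missing_three

for f in *; do
    contains_all "$f" one two three || rm -- "$f"
done
ls   # prints: all_three
```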

Answered by paxdiablo

First, remove the file-list:

rm flist

Then, for each of the words, add the file to the filelist if it contains that word:

grep -l WORD * >>flist

Then sort, uniqify and get a count:

sort flist | uniq -c >flist_with_count

All files in flist_with_count whose count is lower than the number of words should be deleted. The format will be:

2 file1
7 file2
8 file3
8 file4

If there were 8 words, then file1 and file2 should be deleted. I'll leave the writing/testing of the script to you.

Okay, you convinced me, here's my script:

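#!/bin/bash
rm -rf flist
# Record each .c file once for every word it contains
for word in fopen fclose main ; do
    grep -l ${word} *.c >>flist
done
# A file listed fewer than 3 times lacks at least one word, so remove it
# (files containing none of the words never reach flist and are left alone)
rm $(sort flist | uniq -c | awk '$1 != 3 {print $2}')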

This removes the files in the directory that didn't have all three words.
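
Because grep -l only lists files that match, a file containing none of the words never appears in the list at all. One possible variation, sketched here with invented sample files, seeds the list with every file once so that such files get a count too. It assumes file names without whitespace and at least one file to delete.

```shell
#!/bin/bash
# Scratch directory with three hypothetical C files.
count_dir=$(mktemp -d)
cd "$count_dir" || exit 1
printf 'fopen fclose main\n' > full.c
printf 'fopen main\n' > partial.c
printf 'nothing here\n' > none.c

# Seed the list with every file once, so zero-match files still appear.
printf '%s\n' *.c > flist
for word in fopen fclose main; do
    grep -l -- "$word" *.c >> flist
done
# Each file now has a count of 1 + (number of words found); 4 means all three.
doomed=$(sort flist | uniq -c | awk '$1 != 4 {print $2}')
rm -f flist
rm -- $doomed   # unquoted on purpose: one name per word
ls              # prints: full.c
```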

Answered by Eugene Morozov

This will remove all files that contain neither the word Ping nor the word Sent:

grep -L 'Ping\|Sent' * | xargs rm
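
Note that the alternation matches either word, so a file that contains only one of them is kept. The sketch below illustrates this with made-up log files; \| inside a basic regular expression is a GNU grep extension.

```shell
#!/bin/bash
# Scratch directory with three hypothetical log files.
alt_dir=$(mktemp -d)
cd "$alt_dir" || exit 1
printf 'Ping\nSent\n' > both.log
printf 'Ping\n' > ping_only.log
printf 'idle\n' > neither.log

# Only neither.log has no line matching Ping or Sent, so only it is removed.
grep -L 'Ping\|Sent' * | xargs rm
ls   # prints both.log and ping_only.log, one per line
```

To keep only the files that contain both words, one grep -L pass per word is still needed.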