bash: Better way to rename files based on multiple patterns
Original question: http://stackoverflow.com/questions/20629302/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Asked by user3100854
A lot of the files I download have crap/spam in their filenames, e.g.
[ www.crap.com ] file.name.ext
www.crap.com - file.name.ext
I've come up with two ways of dealing with them, but they both seem pretty clunky:
With parameter expansion:
# Note: the +(...) pattern below requires 'shopt -s extglob'
if [[ ${base_name} != "${base_name//\[+([^\]])\]}" ]]
then
    mv -v "${dir_name}/${base_name}" "${dir_name}/${base_name//\[+([^\]])\]}" &&
    base_name="${base_name//\[+([^\]])\]}"
fi
if [[ ${base_name} != "${base_name//www.*.com - /}" ]]
then
    mv -v "${dir_name}/${base_name}" "${dir_name}/${base_name//www.*.com - /}" &&
    base_name="${base_name//www.*.com - /}"
fi
# more of these type of statements; one for each type of frequently-encountered pattern
And then with echo/sed:
tmp=$(echo "${base_name}" | sed -e 's/\[[^][]*\]//g' -e 's/\s-\s//g')
mv "${base_name}" "${tmp}"
I feel like the parameter expansion is the worse of the two but I like it because I'm able to keep the same variable assigned to the file for further processing after the rename (the above code is used in a script that's called for each file after the file download is complete).
So anyway I was hoping there's a better/cleaner way to do the above that someone more knowledgeable than myself could show me, preferably in a way that would allow me to easily reassign the old/original variable to the new/renamed file.
Thanks
Answered by F. Hauri
Two answers: using perl's rename or using pure bash
As some people dislike perl, I also wrote a bash-only version.
Renaming files by using the rename command
Introduction
Yes, this is a typical job for the rename command, which was designed precisely for this:
man rename | sed -ne '/example/,/^[^ ]/p'
For example, to rename all files matching "*.bak" to strip the
extension, you might say
rename 's/\.bak$//' *.bak
To translate uppercase names to lower, you'd use
rename 'y/A-Z/a-z/' *
More targeted samples
Simply drop all spaces and square brackets:
rename 's/[ \[\]]*//g;' *.ext
Rename all .jpg files by numbering them from 1:
rename 's/^.*$/sprintf "IMG_%05d.JPG",++$./e' *.jpg
Demo:
touch {a..e}.jpg
ls -ltr
total 0
-rw-r--r-- 1 user user 0 sep 6 16:35 e.jpg
-rw-r--r-- 1 user user 0 sep 6 16:35 d.jpg
-rw-r--r-- 1 user user 0 sep 6 16:35 c.jpg
-rw-r--r-- 1 user user 0 sep 6 16:35 b.jpg
-rw-r--r-- 1 user user 0 sep 6 16:35 a.jpg
rename 's/^.*$/sprintf "IMG_%05d.JPG",++$./e' *.jpg
ls -ltr
total 0
-rw-r--r-- 1 user user 0 sep 6 16:35 IMG_00005.JPG
-rw-r--r-- 1 user user 0 sep 6 16:35 IMG_00004.JPG
-rw-r--r-- 1 user user 0 sep 6 16:35 IMG_00003.JPG
-rw-r--r-- 1 user user 0 sep 6 16:35 IMG_00002.JPG
-rw-r--r-- 1 user user 0 sep 6 16:35 IMG_00001.JPG
Full syntax matching the SO question, in a safe way
There is a robust and safe way using the rename utility.
As this is a common perl tool, we have to use perl syntax:
rename 'my $o=$_;
s/[ \[\]]+/-/g;
s/-+/-/g;
s/^-//g;
s/-\(\..*\|\)$//g;
s/(.*[^\d])(|-(\d+))(\.[a-z0-9]{2,6})$/
my $i=$3;
$i=0 unless $i;
sprintf("%s-%d%s", $1, $i+1, $4)
/eg while
$o ne $_ &&
-f $_;
' *
Testing rule:
touch '[ www.crap.com ] file.name.ext' 'www.crap.com - file.name.ext'
ls -1
[ www.crap.com ] file.name.ext
www.crap.com - file.name.ext
rename 'my $o=$_; ...
...
...' *
ls -1
www.crap.com-file.name-1.ext
www.crap.com-file.name.ext
touch '[ www.crap.com ] file.name.ext' 'www.crap.com - file.name.ext'
ls -1
www.crap.com-file.name-1.ext
[ www.crap.com ] file.name.ext
www.crap.com - file.name.ext
www.crap.com-file.name.ext
rename 'my $o=$_; ...
...
...' *
ls -1
www.crap.com-file.name-1.ext
www.crap.com-file.name-2.ext
www.crap.com-file.name-3.ext
www.crap.com-file.name.ext
... and so on...
... and it's safe as long as you don't use the -f flag with the rename command: files won't be overwritten, and you will get an error message if something goes wrong.
Renaming files by using bash and so-called bashisms
I prefer doing this with a dedicated utility, but it can even be done in pure bash (i.e. without any fork).
No binary other than bash is used (no sed, awk, tr or other):
#!/bin/bash
for file; do
    # Replace spaces and square brackets with dots
    newname=${file//[ \]\[]/.}
    # Strip leading dots
    while [ "$newname" != "${newname#.}" ]; do
        newname=${newname#.}
    done
    # Collapse .-, -., -- and .. into a single dash
    while [ "$newname" != "${newname//[.-][.-]/-}" ]; do
        newname=${newname//[.-][.-]/-}
    done
    if [ "$file" != "$newname" ]; then
        if [ -f "$newname" ]; then
            # Target exists: build an indexed name such as name-1.ext
            ext=${newname##*.}
            basename=${newname%.$ext}
            partname=${basename%%-[0-9]}
            count=${basename#${partname}-}
            [ "$partname" = "$count" ] && count=0
            while printf -v newname "%s-%d.%s" "$partname" "$((++count))" "$ext" &&
                [ -f "$newname" ]; do
                :
            done
        fi
        mv "$file" "$newname"
    fi
done
To be run with files as arguments, for example:
/path/to/my/script.sh \[*
- Replace spaces and square brackets with dots.
- Replace sequences of .-, -., -- or .. with a single -.
- If the filename doesn't differ, there is nothing to do.
- Test whether a file already exists with newname...
- If so, split it into filename, counter and extension to build an indexed newname,
- and loop while a file exists with newname.
- Finally, rename the file.
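To see what the first steps do to a name before any file is touched, they can be pulled out into a small function. This is only a sketch: the function name clean_name is made up for illustration and is not part of the answer's script, and it leaves out the collision handling.

```shell
#!/bin/bash
# clean_name NAME (hypothetical helper): print NAME after the substitution
# steps listed above, without renaming anything.
clean_name() {
    local name=$1
    # Spaces and square brackets become dots.
    name=${name//[ \]\[]/.}
    # Strip leading dots one at a time.
    while [ "$name" != "${name#.}" ]; do
        name=${name#.}
    done
    # Collapse pairs of dots/dashes into a single dash until none remain.
    while [ "$name" != "${name//[.-][.-]/-}" ]; do
        name=${name//[.-][.-]/-}
    done
    printf '%s\n' "$name"
}

clean_name '[ www.crap.com ] file.name.ext'   # prints www.crap.com-file.name.ext
clean_name 'www.crap.com - file.name.ext'     # prints www.crap.com-file.name.ext
```

Both sample names collapse to the same result, which is why the script needs the counter loop.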
Answered by Michael Le Barbier Grünewald
Take advantage of the following classical pattern:
job_select /path/to/directory| job_strategy | job_process
where job_select is responsible for selecting the objects of your job, job_strategy prepares a processing plan for these objects and job_process eventually executes the plan.
This assumes that filenames contain neither a vertical bar | nor a newline character.
The job_select function
# job_select PATH
# Produce the list of files to process
job_select()
{
    # The brackets are escaped so that they match literally
    find "$1" -name 'www.*.com - *' -o -name '\[*\] *'
}
The find command can examine all the properties of a file maintained by the file system, such as creation time, access time and modification time. It is also possible to control how the filesystem is explored, by telling find not to descend into mounted filesystems or by limiting how many levels of recursion are allowed. It is common to append pipes to the find command to perform more complicated selections based on the filename.
Avoid the common pitfall of including the contents of hidden directories in the output of the job_select function. For instance, the directories CVS, .svn, .svk and .git are used by the corresponding source control management tools, and it is almost always wrong to include their contents in the output of the job_select function. By inadvertently batch processing these files, one can easily make the affected working copy unusable.
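One possible way to keep those directories out of the selection is find's -prune action. The sketch below is an assumption about how one might adapt job_select, not something from the answer itself; the name job_select_pruned is made up, and the second -name pattern assumes the bracketed prefix is followed by a space.

```shell
#!/bin/bash
# job_select_pruned PATH (hypothetical variant of job_select)
# List the files to process, skipping source control directories.
job_select_pruned()
{
    find "$1" \
        \( -name CVS -o -name .svn -o -name .svk -o -name .git \) -prune \
        -o \( -name 'www.*.com - *' -o -name '\[*\] *' \) -print
}
```

The explicit -print matters here: without it, find would also print the pruned directory names themselves.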
The job_strategy function
# job_strategy
# Prepare a plan for renaming files
job_strategy()
{
    sed -e '
        h
        s@/www\..*\.com - *@/@
        s@/\[[^]]*\] *@/@
        x
        G
        s/\n/|/
    '
}
This command reads the output of job_select and makes a plan for our renaming job. The plan is represented by text lines having two fields separated by the character |, the first field being the old name of the file and the second being its newly computed name. It looks like:
[ www.crap.com ] file.name.1.ext|file.name.1.ext
www.crap.com - file.name.2.ext|file.name.2.ext
The particular program used to produce the plan is essentially irrelevant, but it is common to use sed, as in the example, or awk or perl for this. Let us walk through the sed script used here:
h Replace the contents of the hold space with the contents of the pattern space.
… Edit the contents of the pattern space.
x Swap the contents of the pattern and hold spaces.
G Append a newline character followed by the contents of the hold space to the pattern space.
s/\n/|/ Replace the newline character in the pattern space by a vertical bar.
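The same hold-space idiom can be tried on a deliberately trivial edit to watch the steps in isolation. In this sketch the edit (spaces to underscores) and the function name plan_underscores are invented for illustration only:

```shell
#!/bin/bash
# plan_underscores (hypothetical): read names on stdin and print "old|new"
# lines, where new is old with spaces replaced by underscores.
plan_underscores()
{
    sed -e '
        h
        s/ /_/g
        x
        G
        s/\n/|/
    '
}

printf '%s\n' 'a b' | plan_underscores   # prints a b|a_b
```

After h the original line is saved, the s command edits the working copy, x swaps the two so the original comes first, and G plus the final substitution join them with a vertical bar.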
It can be easier to use several filters to prepare the plan. Another common case is the use of the stat command to add creation times to file names.
The job_process function
# job_process
# Rename files according to a plan
job_process()
{
    local oldname
    local newname
    # -r keeps backslashes in file names intact
    while IFS='|' read -r oldname newname; do
        mv "$oldname" "$newname"
    done
}
The input field separator IFS is adjusted to let the function read the output of job_strategy. Declaring oldname and newname as local is useful in large programs but can be omitted in very simple scripts. The job_process function can be adjusted to avoid overwriting existing files and to report the problematic items.
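As a rough sketch of such an adjustment (the name job_process_safe and the wording of the message are assumptions, not part of the answer), the loop can test the target before moving and report skipped items on standard error:

```shell
#!/bin/bash
# job_process_safe (hypothetical variant of job_process)
# Rename files according to a plan, never overwriting an existing target.
job_process_safe()
{
    local oldname
    local newname
    while IFS='|' read -r oldname newname; do
        if [ -e "$newname" ]; then
            # Report the problematic item instead of clobbering the target
            printf 'skipped (target exists): %s\n' "$oldname" >&2
        else
            mv -- "$oldname" "$newname"
        fi
    done
}
```

It reads the same old|new plan lines, so it can take the place of job_process at the end of the pipeline.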
About data structures in shell programs: note the use of pipes to transfer data from one stage to the other. Beginners often rely on variables to represent such information, but this turns out to be a clumsy choice. Instead, it is preferable to represent data as tabular files, or as tabular data streams moving from one process to the next. In this form, data can easily be processed by powerful tools like sed, awk, join, paste and sort, to cite only the most common ones.
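For instance, because the plan is a plain old|new table, a check for two files that would end up with the same new name takes only a few standard tools. This is a sketch; the function name plan_collisions is made up:

```shell
#!/bin/bash
# plan_collisions (hypothetical): read a plan of "old|new" lines on stdin
# and print every new name that appears more than once.
plan_collisions()
{
    cut -d '|' -f 2 | sort | uniq -d
}

printf '%s\n' 'a|x' 'b|x' 'c|y' | plan_collisions   # prints x
```

Such a filter could be run on the output of job_strategy before handing the plan to job_process.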
Answered by Jahid
You can use rnm:
rnm -rs '/\[crap\]|\[spam\]//g' *.ext
The above will remove [crap] or [spam] from the filename.
You can pass multiple regex patterns by terminating them with ; or by giving the -rs option more than once.
rnm -rs '/[\[\]]//g;/\s*\[crap\]//g' -rs '/crap2//' *.ext
The general format of this replace string is /search_part/replace_part/modifier
- search_part: regex to search for.
- replace_part: string to replace with.
- modifier: i (case insensitive), g (global replace).
Uppercase/lowercase:
A replace string of the form /search_part/\c/modifier will make the part of the filename selected by the regex search_part lowercase, while \C (capital C) as the replace part will make it uppercase.
rnm -rs '/[abcd]/\C/g' *.ext
## this will capitalize all a,b,c,d in the filenames
If you have many regex patterns to deal with, put them in a file and pass that file with the -rs/f option:
rnm -rs/f /path/to/regex/pattern/file *.ext
You can find some other examples here.
Note:
- rnm uses PCRE2 (revised PCRE) regex.
- You can undo an unwanted rename operation by running rnm -u
P.S.: I am the author of this tool.
Answered by Stefano Falsetto
If you want to use something that does not depend on perl, you can use the following code (let's call it sanitizeNames.sh). It only shows a few cases, but it is easily extensible using string substitution, tr (and sed too).
#!/bin/bash
ls | while IFS= read -r f; do
    # A comment cannot follow a line-continuation backslash, so each
    # pipe is left at the end of its line instead.
    newfname=$(echo "$f" |
        tr -d '\[ ' |    # Remove opening square brackets and spaces
        tr ' \]' '-' |   # Translate closing square brackets to dashes
        tr -s '-' |      # Squeeze repeated dashes
        tr -s '.'        # Squeeze repeated dots
    )
    newfname=${newfname//-./.}
    if [ -f "$newfname" ]; then
        # Some string magic...
        extension=${newfname##*.}
        basename=${newfname%.*}
        basename=${basename%-[1-9]*}
        lastNum=$(( $(ls "$basename"* | wc -l) ))
        mv "$f" "$basename-$lastNum.$extension"
    else
        mv "$f" "$newfname"
    fi
done
And use it:
$ touch '[ www.crap.com ] file.name.ext' 'www.crap.com - file.name.ext' '[ www.crap.com ] - file.name.ext' '[www.crap.com ].file.anothername.ext2' '[www.crap.com ].file.name.ext'
$ ls -1 *crap*
[ www.crap.com ] - file.name.ext
[ www.crap.com ] file.name.ext
[www.crap.com ].file.anothername.ext2
[www.crap.com ].file.name.ext
www.crap.com - file.name.ext
$ ./sanitizeNames.sh *crap*
$ ls -1 *crap*
www.crap.com-file.anothername.ext2
www.crap.com-file.name-1.ext
www.crap.com-file.name-2.ext
www.crap.com-file.name-3.ext
www.crap.com-file.name.ext
Answered by Sandeep
If you are using Ubuntu/Debian, use the rename command to rename multiple files at a time.