Extract filename and extension in Bash
Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/965053/
Asked by ibz
I want to get the filename (without extension) and the extension separately.
The best solution I found so far is:
NAME=`echo "$FILE" | cut -d'.' -f1`
EXTENSION=`echo "$FILE" | cut -d'.' -f2`
This is wrong because it doesn't work if the file name contains multiple . characters. If, let's say, I have a.b.js, it will consider a and b.js, instead of a.b and js.
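To make the failure concrete, here is a small sketch that runs the two cut commands from the question on a.b.js. Note that cut -f2 returns only the second dot-separated field, so in this run neither variable holds the desired value.

```shell
FILE=a.b.js
NAME=$(echo "$FILE" | cut -d'.' -f1)        # first field only
EXTENSION=$(echo "$FILE" | cut -d'.' -f2)   # second field only
printf 'NAME=%s EXTENSION=%s\n' "$NAME" "$EXTENSION"   # NAME=a EXTENSION=b
```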
It can be easily done in Python with
file, ext = os.path.splitext(path)
but I'd prefer not to fire up a Python interpreter just for this, if possible.
Any better ideas?
Answer by Petesh
First, get file name without the path:
filename=$(basename -- "$fullfile")
extension="${filename##*.}"
filename="${filename%.*}"
Alternatively, you can focus on the last '/' of the path instead of the '.' which should work even if you have unpredictable file extensions:
filename="${fullfile##*/}"
You may want to check the documentation:
- On the web at section "3.5.3 Shell Parameter Expansion"
- In the bash manpage at section called "Parameter Expansion"
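As a minimal sketch of the steps above, the function below combines the two expansions. The name split_name and the variables stem and extension are illustrative, not part of the original answer. It also adds one assumption the answer does not cover: a name with no dot, or a dotfile such as .bashrc, is treated as having no extension. Other odd names such as .. are not special-cased.

```shell
# split_name: hypothetical helper; sets the globals `stem` and `extension`.
split_name() {
    filename=${1##*/}            # drop everything up to the last '/'
    case $filename in
        ?*.*)                    # a dot that is not the first character
            extension=${filename##*.}
            stem=${filename%.*}
            ;;
        *)                       # no dot, or only a leading dot
            extension=
            stem=$filename
            ;;
    esac
}

split_name "/tmp/archive.tar.gz"
printf '%s | %s\n' "$stem" "$extension"   # archive.tar | gz
```

Only parameter expansion and case are used, so no external process is started.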
Answer by Juliano
~% FILE="example.tar.gz"
~% echo "${FILE%%.*}"
example
~% echo "${FILE%.*}"
example.tar
~% echo "${FILE#*.}"
tar.gz
~% echo "${FILE##*.}"
gz
For more details, see shell parameter expansion in the Bash manual.
Answer by Tomi Po
Usually you already know the extension, so you might wish to use:
basename filename .extension
for example:
basename /path/to/dir/filename.txt .txt
and we get
filename
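A short sketch of the same idea applied to several paths. The wrapper name strip_suffix and the sample paths are made up for illustration. A name that does not end in the given suffix is printed unchanged, apart from having its directory part removed.

```shell
# strip_suffix PATH SUFFIX: hypothetical wrapper around basename.
strip_suffix() {
    basename -- "$1" "$2"
}

for path in /path/to/dir/report.txt notes.txt /tmp/data.csv; do
    strip_suffix "$path" .txt
done
# prints report, notes and data.csv, one per line
```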
Answer by sotapme
You can use the magic of POSIX parameter expansion:
bash-3.2$ FILENAME=somefile.tar.gz
bash-3.2$ echo "${FILENAME%%.*}"
somefile
bash-3.2$ echo "${FILENAME%.*}"
somefile.tar
There's a caveat in that if your filename was of the form ./somefile.tar.gz then echo ${FILENAME%%.*} would greedily remove the longest match to the . and you'd have the empty string.
(You can work around that with a temporary variable:
FULL_FILENAME=$FILENAME
FILENAME=${FULL_FILENAME##*/}
echo "${FILENAME%%.*}"
)
This site explains more.
${variable%pattern}
Trim the shortest match from the end
${variable##pattern}
Trim the longest match from the beginning
${variable%%pattern}
Trim the longest match from the end
${variable#pattern}
Trim the shortest match from the beginning
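The four forms can be compared side by side on the ./somefile.tar.gz case from the caveat. This is only an illustrative sketch, and the show helper is a made-up name used for printing.

```shell
FILENAME=./somefile.tar.gz

# show LABEL VALUE: hypothetical helper that prints one labelled result.
show() { printf '%-18s -> "%s"\n' "$1" "$2"; }

show '${FILENAME%.*}'  "${FILENAME%.*}"    # "./somefile.tar"
show '${FILENAME%%.*}' "${FILENAME%%.*}"   # "" (the leading . matches too)
show '${FILENAME#*.}'  "${FILENAME#*.}"    # "/somefile.tar.gz"
show '${FILENAME##*.}' "${FILENAME##*.}"   # "gz"
```

Only the two greedy-from-the-end and shortest-from-the-start forms are affected by the leading ./ prefix, which is why stripping the directory part first is a common precaution.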
Answer by Doctor J
That doesn't seem to work if the file has no extension, or no filename. Here is what I'm using; it only uses builtins and handles more (but not all) pathological filenames.
#!/bin/bash
for fullpath in "$@"
do
    filename="${fullpath##*/}" # Strip longest match of */ from start
    dir="${fullpath:0:${#fullpath} - ${#filename}}" # Substring from 0 thru pos of filename
    base="${filename%.[^.]*}" # Strip shortest match of . plus at least one non-dot char from end
    ext="${filename:${#base} + 1}" # Substring from len of base thru end
    if [[ -z "$base" && -n "$ext" ]]; then # If we have an extension and no base, it's really the base
        base=".$ext"
        ext=""
    fi
    echo -e "$fullpath:\n\tdir = \"$dir\"\n\tbase = \"$base\"\n\text = \"$ext\""
done
And here are some testcases:
$ basename-and-extension.sh / /home/me/ /home/me/file /home/me/file.tar /home/me/file.tar.gz /home/me/.hidden /home/me/.hidden.tar /home/me/.. .
/:
    dir = "/"
    base = ""
    ext = ""
/home/me/:
    dir = "/home/me/"
    base = ""
    ext = ""
/home/me/file:
    dir = "/home/me/"
    base = "file"
    ext = ""
/home/me/file.tar:
    dir = "/home/me/"
    base = "file"
    ext = "tar"
/home/me/file.tar.gz:
    dir = "/home/me/"
    base = "file.tar"
    ext = "gz"
/home/me/.hidden:
    dir = "/home/me/"
    base = ".hidden"
    ext = ""
/home/me/.hidden.tar:
    dir = "/home/me/"
    base = ".hidden"
    ext = "tar"
/home/me/..:
    dir = "/home/me/"
    base = ".."
    ext = ""
.:
    dir = ""
    base = "."
    ext = ""
Answer by Bjarke Freund-Hansen
You can use basename.
Example:
$ basename foo-bar.tar.gz .tar.gz
foo-bar
You do need to provide basename with the extension that shall be removed, however if you are always executing tar with -z then you know the extension will be .tar.gz.
This should do what you want:
tar -zxvf
cd $(basename .tar.gz)
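The same pattern can be sketched with an explicit variable. Here archive is a hypothetical name and path chosen for illustration, and the sketch assumes the tarball unpacks into a directory named after the archive, which not every tarball does.

```shell
# archive is a hypothetical variable holding the tarball path.
archive=/downloads/foo-bar.tar.gz
dir=$(basename -- "$archive" .tar.gz)
printf '%s\n' "$dir"   # foo-bar

# With a real archive one could then run, for example:
#   tar -zxf "$archive" && cd -- "$dir"
```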
Answer by paxdiablo
pax> echo a.b.js | sed 's/\.[^.]*$//'
a.b
pax> echo a.b.js | sed 's/^.*\.//'
js
works fine, so you can just use:
pax> FILE=a.b.js
pax> NAME=$(echo "$FILE" | sed 's/\.[^.]*$//')
pax> EXTENSION=$(echo "$FILE" | sed 's/^.*\.//')
pax> echo $NAME
a.b
pax> echo $EXTENSION
js
The commands, by the way, work as follows.
The command for NAME substitutes a "." character followed by any number of non-"." characters up to the end of the line, with nothing (i.e., it removes everything from the final "." to the end of the line, inclusive). This is basically a non-greedy substitution using regex trickery.
The command for EXTENSION substitutes any number of characters followed by a "." character at the start of the line, with nothing (i.e., it removes everything from the start of the line to the final dot, inclusive). This is a greedy substitution which is the default action.
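One case the two sed commands do not distinguish is a name with no dot at all: neither pattern matches, so both NAME and EXTENSION would come back as the whole name. A minimal sketch of a guard follows. The function name split_with_sed is hypothetical, and reporting an empty extension for such names is an assumption about the desired behavior.

```shell
# split_with_sed: hypothetical helper; sets the globals NAME and EXTENSION.
split_with_sed() {
    NAME=$(printf '%s\n' "$1" | sed 's/\.[^.]*$//')
    case $1 in
        *.*) EXTENSION=$(printf '%s\n' "$1" | sed 's/^.*\.//') ;;
        *)   EXTENSION= ;;   # no dot: report an empty extension
    esac
}

split_with_sed a.b.js
printf '%s | %s\n' "$NAME" "$EXTENSION"   # a.b | js
split_with_sed README
printf '%s | %s\n' "$NAME" "$EXTENSION"   # README | (empty extension)
```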
Answer by Kebabbert
Mellen writes in a comment on a blog post:
Using Bash, there's also ${file%.*} to get the filename without the extension and ${file##*.} to get the extension alone. That is,
file="thisfile.txt"
echo "filename: ${file%.*}"
echo "extension: ${file##*.}"
Outputs:
filename: thisfile
extension: txt
Answer by Cyker
No need to bother with awk or sed or even perl for this simple task. There is a pure-Bash, os.path.splitext()-compatible solution which only uses parameter expansions.
Reference Implementation
Documentation of os.path.splitext(path):
Split the pathname path into a pair (root, ext) such that root + ext == path, and ext is empty or begins with a period and contains at most one period. Leading periods on the basename are ignored; splitext('.cshrc') returns ('.cshrc', '').
Python code:
root, ext = os.path.splitext(path)
Bash Implementation
Honoring leading periods
root="${path%.*}"
ext="${path#"$root"}"
Ignoring leading periods
root="${path#.}";root="${path%"$root"}${root%.*}"
ext="${path#"$root"}"
Tests
Here are test cases for the "Ignoring leading periods" implementation, which should match the Python reference implementation on every input.
|---------------|-----------|-------|
|path |root |ext |
|---------------|-----------|-------|
|' .txt' |' ' |'.txt' |
|' .txt.txt' |' .txt' |'.txt' |
|' txt' |' txt' |'' |
|'*.txt.txt' |'*.txt' |'.txt' |
|'.cshrc' |'.cshrc' |'' |
|'.txt' |'.txt' |'' |
|'?.txt.txt' |'?.txt' |'.txt' |
|'\n.txt.txt' |'\n.txt' |'.txt' |
|'\t.txt.txt' |'\t.txt' |'.txt' |
|'a b.txt.txt' |'a b.txt' |'.txt' |
|'a*b.txt.txt' |'a*b.txt' |'.txt' |
|'a?b.txt.txt' |'a?b.txt' |'.txt' |
|'a\nb.txt.txt' |'a\nb.txt' |'.txt' |
|'a\tb.txt.txt' |'a\tb.txt' |'.txt' |
|'txt' |'txt' |'' |
|'txt.pdf' |'txt' |'.pdf' |
|'txt.tar.gz' |'txt.tar' |'.gz' |
|'txt.txt' |'txt' |'.txt' |
|---------------|-----------|-------|
Test Results
All tests passed.
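The test inputs above are bare file names. For a path with directory components, a dot in a directory name (as in a.b/c) would be taken for the extension separator. The following is a hedged sketch of one way to apply the same expansions to the last path component only. The function name splitext and its helper variables are hypothetical, and it has not been checked against Python for every input.

```shell
# splitext: hypothetical wrapper; sets the globals `root` and `ext`.
splitext() {
    path=$1
    name=${path##*/}               # last path component
    dir=${path%"$name"}            # directory part, including trailing '/'
    stem=${name#.}                 # set one leading period aside
    stem=${name%"$stem"}${stem%.*} # restore it, then drop the last extension
    root=$dir$stem
    ext=${path#"$root"}            # whatever remains, including its period
}

splitext "a.b/c"
printf '%s | %s\n' "$root" "$ext"   # a.b/c | (empty extension)
splitext "/home/me/file.tar.gz"
printf '%s | %s\n' "$root" "$ext"   # /home/me/file.tar | .gz
```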
Answer by Some programmer dude
You could use the cut command to remove the last two extensions (the ".tar.gz" part):
$ echo "foo.tar.gz" | cut -d'.' --complement -f2-
foo
As noted by Clayton Hughes in a comment, this will not work for the actual example in the question. So as an alternative I propose using sed with extended regular expressions, like this:
$ echo "mpc-1.0.1.tar.gz" | sed -r 's/\.[[:alnum:]]+\.[[:alnum:]]+$//'
mpc-1.0.1
It works by removing the last two (alpha-numeric) extensions unconditionally.
[Updated again after comment from Anders Lindahl]
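Because removing two extensions unconditionally also mangles names such as a.b.js, one possible refinement is to strip a compound suffix only when it is a known one. The sketch below does this with parameter expansion instead of sed. The function name split_compound and the list of compound suffixes (.tar.gz, .tar.bz2, .tar.xz) are assumptions chosen for illustration.

```shell
# split_compound: hypothetical helper; sets the globals `stem` and `ext`.
split_compound() {
    case $1 in
        *.tar.gz|*.tar.bz2|*.tar.xz)
                 stem=${1%.tar.*} ;;   # known compound suffix
        *.*)     stem=${1%.*} ;;       # ordinary single extension
        *)       stem=$1 ;;            # no extension at all
    esac
    ext=${1#"$stem"}   # remainder after the stem
    ext=${ext#.}       # without its leading period
}

split_compound mpc-1.0.1.tar.gz
printf '%s | %s\n' "$stem" "$ext"   # mpc-1.0.1 | tar.gz
split_compound a.b.js
printf '%s | %s\n' "$stem" "$ext"   # a.b | js
```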