bash parse filename

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license and attribute it to the original authors (not me), citing the original StackOverflow question: http://stackoverflow.com/questions/10618015/

Time: 2020-09-18 02:17:09  Source: igfitidea

Tags: bash, parsing, tokenize

Asked by pufos

Is there any way in bash to parse this filename:

$file = dos1-20120514104538.csv.3310686

into variables like $date = 2012-05-14 10:45:38 and $id = 3310686?

Thank you

Answered by kojiro

All of this can be done with Parameter Expansion. Please read about it in the bash manpage.

$ file='dos1-20120514104538.csv.3310686'
$ date="${file#*-}" # Use Parameter Expansion to strip off the part before '-'
$ date="${date%%.*}" # Use PE again to strip after the first '.'
$ id="${file##*.}" # Use PE to get the id as the part after the last '.'
$ echo "$date"
20120514104538
$ echo "$id"
3310686

Combine PEs to put date back together in a new format. You could also parse the date with GNU date, but that would still require rearranging the date so it can be parsed. In its current format, this is how I would approach it:

$ date="${date:0:4}-${date:4:2}-${date:6:2} ${date:8:2}:${date:10:2}:${date:12:2}"
$ echo "$date"
2012-05-14 10:45:38
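
The parameter expansion steps above can be collected into one reusable function. The following is only a sketch: the function name parse_name and the globals parsed_date and parsed_id are hypothetical, and it assumes the timestamp is always exactly 14 digits and the id is purely numeric.

```shell
#!/usr/bin/env bash
# parse_name is a hypothetical helper, not part of the original answer.
# It sets the globals parsed_date and parsed_id, or returns 1 on bad input.
parse_name() {
    local name=$1 stamp
    stamp=${name#*-}        # drop everything up to the first '-'
    stamp=${stamp%%.*}      # drop everything from the first '.' onward
    parsed_id=${name##*.}   # keep what follows the last '.'

    # Assumption: the timestamp is exactly 14 digits and the id is numeric.
    [[ $stamp =~ ^[0-9]{14}$ && $parsed_id =~ ^[0-9]+$ ]] || return 1

    parsed_date="${stamp:0:4}-${stamp:4:2}-${stamp:6:2} ${stamp:8:2}:${stamp:10:2}:${stamp:12:2}"
}

if parse_name 'dos1-20120514104538.csv.3310686'; then
    echo "$parsed_date"
    echo "$parsed_id"
fi
```

The validation step is there because the substring expansions would otherwise silently produce a malformed date for a filename that does not follow the expected layout.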

Answered by user unknown

Extract id:

f='dos1-20120514104538.csv.3310686'
echo ${f/*./}
# 3310686
id=${f/*./}

Remove prefix, and extract core date numbers:

noprefix=${f/*-/}
echo ${noprefix/.csv*/}
# 20120514104538
ds=${noprefix/.csv*/}

Format the date like this (only partially done):

echo $ds | sed -r 's/(.{4})(.{2})(.{2})/\1.\2.\3/'
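
One possible way to finish the formatting is to capture all six fields in a single substitution. This is a sketch rather than the answerer's own completion: it assumes a 14-digit timestamp, and it uses sed -E instead of -r because -E is accepted by both GNU and BSD sed.

```shell
ds='20120514104538'
# Six capture groups: year, month, day, hour, minute, second.
formatted=$(printf '%s\n' "$ds" |
    sed -E 's/^(.{4})(.{2})(.{2})(.{2})(.{2})(.{2})$/\1-\2-\3 \4:\5:\6/')
echo "$formatted"
```

Anchoring the pattern with ^ and $ means an input of the wrong length is passed through unchanged rather than being partially rewritten.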


You can alternatively split the initial variable into an array,

echo $f
# dos1-20120514104538.csv.3310686

after replacing - and . with spaces, like this:

echo ${f//[-.]/ }
# dos1 20120514104538 csv 3310686

ar=(${f//[-.]/ })
echo ${ar[1]}
# 20120514104538

echo ${ar[3]}
# 3310686
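
The unquoted expansion in ar=(...) is subject to word splitting and pathname expansion, so a filename containing spaces or glob characters could split unexpectedly. A sketch of an alternative that splits on the same two delimiters with read -a follows; the array name parts is only illustrative.

```shell
f='dos1-20120514104538.csv.3310686'
# IFS applies only to this read, so the shell's global IFS is untouched.
IFS='-.' read -r -a parts <<< "$f"
echo "${parts[1]}"
echo "${parts[3]}"
```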


The date transformation can be done via an array similarly:

dp=($(echo 20120514104538 | sed -r 's/(.{2})/\1 /g'))
echo ${dp[0]}${dp[1]}-${dp[2]}-${dp[3]} ${dp[4]}:${dp[5]}:${dp[6]}

It splits everything into groups of 2 characters:

echo ${dp[@]}
# 20 12 05 14 10 45 38

and rejoins the first two groups as 2012 in the output.

Answered by Paused until further notice.

Using Bash's regular expression feature:

file='dos1-20120514104538.csv.3310686'
pattern='^[^-]+-([[:digit:]]{4})'
for i in {1..5}
do
    pattern+='([[:digit:]]{2})'
done
pattern+='\.[^.]+\.([[:digit:]]+)$'
[[ $file =~ $pattern ]]
read -r _ Y m d H M S id <<< "${BASH_REMATCH[@]}"
date="$Y-$m-$d $H:$M:$S"
echo "$date"
echo "$id"
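
The snippet above does not check whether the match succeeded, so a filename with a different layout would leave the variables empty. A sketch of the same idea with an explicit check might look like this; the array name m and the error message are illustrative only, and the pattern is written out in full instead of being built in a loop.

```shell
file='dos1-20120514104538.csv.3310686'
# Same shape as the pattern built by the loop above, written out in full.
pattern='^[^-]+-([0-9]{4})([0-9]{2})([0-9]{2})([0-9]{2})([0-9]{2})([0-9]{2})\.[^.]+\.([0-9]+)$'
if [[ $file =~ $pattern ]]; then
    m=("${BASH_REMATCH[@]}")   # m[0] is the whole match, m[1..7] the groups
    date="${m[1]}-${m[2]}-${m[3]} ${m[4]}:${m[5]}:${m[6]}"
    id=${m[7]}
    echo "$date"
    echo "$id"
else
    echo "unexpected filename: $file" >&2
fi
```

Indexing the groups directly also avoids relying on word splitting of the joined BASH_REMATCH array.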

Answered by Ozair Kafray

You can tokenize the string first on - and then on the . character. There are various threads on SO on how to do this:

  1. How do I split a string on a delimiter in Bash?
  2. Bash: How to tokenize a string variable?

To transform 20120514104538 into 2012-05-14 10:45:38:

Since we know that the first 4 characters are the year, the next 2 are the month, and so on, you will first need to break this token into substrings and then recombine them into a single string. You can start with the following answer:

  1. https://stackoverflow.com/a/428580/365188
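
Putting the two tokenizing steps and the substring step together, a minimal sketch could look like the following. It assumes the filename always has the form prefix-timestamp.extension.id with a 14-digit timestamp; the variable names prefix, rest, stamp, and ext are illustrative and not taken from the linked answers.

```shell
file='dos1-20120514104538.csv.3310686'
# Step 1: split on '-' into the prefix and the remainder.
IFS='-' read -r prefix rest <<< "$file"
# Step 2: split the remainder on '.' into timestamp, extension, and id.
IFS='.' read -r stamp ext id <<< "$rest"
# Step 3: cut the timestamp into fixed-width substrings and recombine them.
date="${stamp:0:4}-${stamp:4:2}-${stamp:6:2} ${stamp:8:2}:${stamp:10:2}:${stamp:12:2}"
echo "$date"
echo "$id"
```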