Linux: How to extract numbers from a string?
Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/17883661/
How to extract numbers from a string?
Asked by MOHAMED
I have a string that contains a path:
string="toto.titi.12.tata.2.abc.def"
I want to extract only the numbers from this string.
To extract the first number:
tmp="${string#toto.titi.}"
num1="${tmp%.tata*}"
To extract the second number:
tmp="${string#toto.titi.*.tata.}"
num2="${tmp%.abc.def}"
So to extract each number I have to do it in two steps. How can I extract a number in one step?
Answered by drldcsta
This would be easier to answer if you provided exactly the output you're looking to get. If you mean you want to get just the digits out of the string, and remove everything else, you can do this:
d@AirBox:~$ string="toto.titi.12.tata.2.abc.def"
d@AirBox:~$ echo "${string//[a-z,.]/}"
122
If you clarify a bit I may be able to help more.
Answered by mti2935
You can use tr to delete all of the non-digit characters, like so:
echo toto.titi.12.tata.2.abc.def | tr -d -c 0-9
Answered by chepner
Use regular expression matching:
string="toto.titi.12.tata.2.abc.def"
[[ $string =~ toto\.titi\.([0-9]+)\.tata\.([0-9]+)\. ]]
# BASH_REMATCH[0] would be "toto.titi.12.tata.2.", the entire match
# Successive elements of the array correspond to the parenthesized
# subexpressions, in left-to-right order. (If there are nested parentheses,
# they are numbered in depth-first order.)
first_number=${BASH_REMATCH[1]}
second_number=${BASH_REMATCH[2]}
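BASH_REMATCH only holds meaningful values when the match succeeds, so it can help to guard the lookup. The following is a minimal sketch of that idea, not part of the original answer; the function name extract_pair is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints the two numbers separated by a space, or
# returns non-zero if the string lacks the toto.titi.N.tata.M. layout.
extract_pair() {
    local re='toto\.titi\.([0-9]+)\.tata\.([0-9]+)\.'
    if [[ $1 =~ $re ]]; then
        printf '%s %s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}"
    else
        return 1
    fi
}

pair=$(extract_pair "toto.titi.12.tata.2.abc.def")
echo "$pair"
```

Keeping the pattern in a variable and leaving it unquoted on the right-hand side of =~ makes bash treat it as a regular expression rather than a literal string.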
Answered by anubhava
Using awk:
arr=( $(echo "$string" | awk -F "." '{print $3, $5}') )
num1=${arr[0]}
num2=${arr[1]}
Answered by jderefinko
You can also use sed:
echo "toto.titi.12.tata.2.abc.def" | sed 's/[0-9]*//g'
Here, sed replaces
- any digits (class [0-9])
- repeated any number of times (*)
- with nothing (there is nothing between the second and third /),
- and g stands for globally.
Output will be:
toto.titi..tata..abc.def
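Note that the command above deletes the digits and keeps the rest. As a sketch of the complementary substitution (an addition here, not from the original answer), negating the class with [^0-9] deletes everything except the digits, which leaves the numbers run together; the variable name digits is only illustrative.

```bash
string="toto.titi.12.tata.2.abc.def"
# [^0-9] matches any single character that is not a digit;
# replacing each one with nothing leaves only the digits.
digits=$(echo "$string" | sed 's/[^0-9]//g')
echo "$digits"   # prints 122
```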
Answered by ghoti
Parameter expansion would seem to be the order of the day.
$ string="toto.titi.12.tata.2.abc.def"
$ read num1 num2 <<<${string//[^0-9]/ }
$ echo "$num1 / $num2"
12 / 2
This of course depends on the format of $string. But at least for the example you've provided, it seems to work.
This may be superior to anubhava's awk solution, which requires a subshell. I also like chepner's solution, but regular expressions are "heavier" than parameter expansion (though obviously way more precise). (Note that in the expression above, [^0-9] may look like a regex atom, but it is not; it is a shell glob pattern.)
You can read about this form of Parameter Expansion in the bash man page. Note that ${string//this/that} (as well as the <<<) is a bashism, and is not compatible with traditional Bourne or POSIX shells.
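For shells without ${string//this/that} or <<<, one possible workaround is to let the shell split the string on dots through IFS and pick fields by position. This is only a sketch, not part of the original answer: it assumes the numbers are always the third and fifth dot-separated fields, and that IFS is set when the snippet starts.

```bash
string="toto.titi.12.tata.2.abc.def"

old_ifs=$IFS      # assumes IFS is currently set (the usual case)
IFS=.
set -f            # disable globbing while the unquoted expansion is split
set -- $string    # positional parameters now hold the dot-separated fields
set +f
IFS=$old_ifs

num1=$3
num2=$5
echo "$num1 / $num2"
```

Like the parameter expansion above, this runs no external command and no subshell, but it is tied to field positions in the same way as the awk and cut answers.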
Answered by cchamberlain
To extract all the individual numbers and print one number per line, pipe the input through:
tr '\n' ' ' | sed -e 's/[^0-9]/ /g' -e 's/^ *//g' -e 's/ *$//g' | tr -s ' ' | sed 's/ /\n/g'
Breakdown:
- Replaces all line breaks with spaces: tr '\n' ' '
- Replaces all non-numbers with spaces: sed -e 's/[^0-9]/ /g'
- Removes leading white space: -e 's/^ *//g'
- Removes trailing white space: -e 's/ *$//g'
- Squeezes runs of spaces into one space: tr -s ' '
- Replaces the remaining space separators with line breaks: sed 's/ /\n/g'
Example:
echo -e " this 20 is 2sen\nten324ce 2 sort of" | tr '\n' ' ' | sed -e 's/[^0-9]/ /g' -e 's/^ *//g' -e 's/ *$//g' | tr -s ' ' | sed 's/ /\n/g'
Will print out:
20
2
324
2
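The same result can probably be reached with fewer stages. The sketch below is a variation, not the original answer's pipeline: tr turns every run of non-digits into a single newline, and sed drops the empty line that appears when the input starts with a non-digit. The function name numbers_per_line is hypothetical.

```bash
# Hypothetical helper: reads stdin, prints one number per line.
numbers_per_line() {
    # -c complements the set 0-9, -s squeezes each translated run to one newline
    tr -cs '0-9' '\n' | sed '/^$/d'
}

out=$(printf ' this 20 is 2sen\nten324ce 2 sort of\n' | numbers_per_line)
echo "$out"
```

Because this variant does not rely on sed interpreting \n in the replacement text, it should behave the same with GNU and BSD tools.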
Answered by Vivek-Ananth
Here is yet another way to do this, using cut:
echo $string | cut -d'.' -f3,5 | tr '.' ' '
This gives you the following output: 12 2
Answered by Adi Azarya
Here is a short one:
string="toto.titi.12.tata.2.abc.def"
id=$(echo "$string" | grep -o -E '[0-9]+')
echo $id  # => output: 12 2
grep -o prints each number on its own line; because $id is unquoted in the echo, the numbers come out separated by a space. Hope it helps...
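If the numbers are needed individually rather than as one string, the grep -o output can be read line by line into an array. This is a minimal bash sketch, not part of the original answer; the array name nums is illustrative.

```bash
#!/usr/bin/env bash
string="toto.titi.12.tata.2.abc.def"

nums=()
# grep -o emits one match per line; collect each line as an array element
while IFS= read -r n; do
    nums+=("$n")
done < <(grep -o -E '[0-9]+' <<<"$string")

echo "${nums[0]} ${nums[1]}"
```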
Answered by placidnick
Fixing the newline issue (for the Mac terminal):
cat temp.txt | tr '\n' ' ' | sed -e 's/[^0-9]/ /g' -e 's/^ *//g' -e 's/ *$//g' | tr -s ' ' | sed $'s/ /\\\n/g'