bash: Use awk to extract value from a line
Original URL: http://stackoverflow.com/questions/25175047/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
Use awk to extract value from a line
Asked by qu1x0tc
I have these two lines within a file:
<first-value system-property="unique.setting.limit">3</first-value>
<second-value-limit>50000</second-value-limit>
where I'd like to get the following as output using awk or sed:
3
50000
Using this sed command does not work as I had hoped, and I suspect this is due to the presence of the quotes and delimiters in my line entry.
sed -n '/WORD1/,/WORD2/p' /path/to/file
How can I extract the values I want from the file?
Answered by Ashkan
awk -F'[<>]' '{print $3}' input.txt
input.txt:
<first-value system-property="unique.setting.limit">3</first-value>
<second-value-limit>50000</second-value-limit>
Output:
3
50000
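To see why the third field is the one to print, a small sketch can dump every field awk produces when it splits a line on < and >. The function name show_fields is a hypothetical helper introduced for illustration, not part of the answer.

```shell
#!/bin/bash
# Hypothetical helper: print each field awk sees when the field
# separator is the bracket expression [<>] (either '<' or '>').
show_fields() {
    printf '%s\n' "$1" |
        awk -F'[<>]' '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }'
}

# Sample line taken from the question.
show_fields '<second-value-limit>50000</second-value-limit>'
```

Field 1 is the empty text before the first <, field 2 is the contents of the opening tag, and field 3 is the value between the tags.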
Answered by Tom Fenech
Looks like XML to me, so assuming it forms part of some valid XML, e.g.
<root>
<first-value system-property="unique.setting.limit">3</first-value>
<second-value-limit>50000</second-value-limit>
</root>
You can use Perl's XML::Simple and do something like this:
perl -MXML::Simple -E '$xml = XMLin("file"); say $xml->{"first-value"}->{"content"}; say $xml->{"second-value-limit"}'
Output:
3
50000
If the XML structure is more complicated, then you may have to drill down a bit deeper to get to the values you want. If that's the case, you should edit the question to show the bigger picture.
Answered by jaybee
Ashkan's awk solution is straightforward, but let me suggest a sed solution that accepts non-integer numbers:
sed -n 's/[^>]*>\([.[:digit:]]*\)<.*/\1/p' input.txt
This extracts the number between the first > character of the line and the following <. In my RE this "number" can be the empty string. If you don't want to accept an empty string, add the -r option to sed and replace \([.[:digit:]]*\) by ([.[:digit:]]+).
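A minimal sketch of the stricter variant follows, assuming a sed that supports -E for extended regular expressions (GNU sed also accepts the older -r spelling). The function name extract_number and the two sample lines are hypothetical, chosen only to show that an empty element produces no output.

```shell
#!/bin/bash
# Hypothetical helper: print the number between the first '>' and the next '<'.
# With '+' instead of '*', a line whose element is empty does not match,
# so -n together with the p flag prints nothing for it.
extract_number() {
    sed -n -E 's/[^>]*>([.[:digit:]]+)<.*/\1/p'
}

# The first line carries a non-integer value, the second an empty element.
printf '%s\n' '<price>12.5</price>' '<empty></empty>' | extract_number
```

Only the first sample line yields a value; the empty element is skipped.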
Answered by Technext
Using sed:
sed -E 's/.*limit"*>([0-9]+)<.*/\1/' file
Explanation:
.* takes care of everything that comes before the string limit.
limit"* takes care of both lines: the one with limit" and the one with just limit.
([0-9]+) takes care of matching numbers and only numbers, as stated in your requirement.
\1 is a back-reference to the captured group. When a pattern groups all or part of its content in a pair of parentheses, it captures that content and stores it temporarily in memory. For more details, please refer to https://www.inkling.com/read/introducing-regular-expressions-michael-fitzgerald-1st/chapter-4/capturing-groups-and
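As a small illustration of captured groups, the sketch below uses two of them and refers back to each in the replacement. The function name tag_and_value and the name=value output format are assumptions made for this example; they are not part of the answer.

```shell
#!/bin/bash
# Hypothetical helper: capture the tag name as group 1 and the digits as
# group 2, then rebuild the line as name=value using \1 and \2.
tag_and_value() {
    sed -E 's/<([^ >]+)[^>]*>([0-9]+)<.*/\1=\2/'
}

# Sample lines taken from the question.
printf '%s\n' \
    '<first-value system-property="unique.setting.limit">3</first-value>' \
    '<second-value-limit>50000</second-value-limit>' | tag_and_value
```

Each pair of parentheses is numbered from left to right, so \1 holds the tag name and \2 holds the number.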
Answered by vks
sed -e 's/[a-zA-Z.<\/>=" \-]//g' file
Answered by David C. Rankin
The script solution with parameter expansion:
#!/bin/bash
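# Read the file named by the first argument; the test also handles
# a last line that has no trailing newline.
while read -r line || test -n "$line" ; do
    value="${line%<*}"              # drop the closing tag
    printf "%s\n" "${value##*\>}"   # drop everything up to the last >
done <"$1"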
Output:
$ ./ltags.sh dat/ltags.txt
3
50000
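The two parameter expansions can be followed step by step on a single line. This is a minimal sketch using one sample line from the question; the variable names mirror the script, and nothing here goes beyond standard bash expansion.

```shell
#!/bin/bash
# Sample line taken from the question.
line='<second-value-limit>50000</second-value-limit>'

# ${var%pattern} removes the shortest match of pattern from the end:
# here, the final '<' and everything after it (the closing tag).
value="${line%<*}"
printf '%s\n' "$value"          # <second-value-limit>50000

# ${var##pattern} removes the longest match of pattern from the start:
# here, everything up to and including the last remaining '>'.
printf '%s\n' "${value##*>}"    # 50000
```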