bash: use awk to extract a value from a line

Notice: this page is based on a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/25175047/

Time: 2020-09-18 11:04:11  Source: igfitidea

Use awk to extract value from a line

bash awk sed

Asked by qu1x0tc

I have these two lines within a file:

<first-value system-property="unique.setting.limit">3</first-value>
<second-value-limit>50000</second-value-limit>

where I'd like to get the following as output using awk or sed:

3    
50000

Using this sed command does not work as I had hoped, and I suspect this is due to the presence of the quotes and delimiters in my line entry.

sed -n '/WORD1/,/WORD2/p' /path/to/file

How can I extract the values I want from the file?

Answered by Ashkan

awk -F'[<>]' '{print $3}' input.txt

input.txt:

<first-value system-property="unique.setting.limit">3</first-value>
<second-value-limit>50000</second-value-limit>

Output:

3
50000
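
To see why the value ends up in the third field, it can help to print every field awk produces when both < and > act as separators. This is only an illustrative sketch; the function name show_fields is hypothetical and not part of the answer.

```shell
# Print each field of every input line, numbered, using '<' and '>' as separators.
show_fields() {
    awk -F'[<>]' '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }'
}

# The line starts with '<', so field 1 is empty, field 2 is the opening tag,
# and field 3 is the text between the tags.
printf '%s\n' '<second-value-limit>50000</second-value-limit>' | show_fields
```

If a line carried more than one element, the wanted value would no longer be in field 3, so this approach assumes one element per line as in the question.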

Answered by Tom Fenech

Looks like XML to me, so assuming it forms part of some valid XML, e.g.

<root>
<first-value system-property="unique.setting.limit">3</first-value>
<second-value-limit>50000</second-value-limit>
</root>

You can use Perl's XML::Simple and do something like this:

perl -MXML::Simple -E '$xml = XMLin("file"); say $xml->{"first-value"}->{"content"}; say $xml->{"second-value-limit"}'

Output:

3
50000

If the XML structure is more complicated, then you may have to drill down a bit deeper to get to the values you want. If that's the case, you should edit the question to show the bigger picture.

Answered by jaybee

Ashkan's awk solution is straightforward, but let me suggest a sed solution that accepts non-integer numbers:

sed -n 's/[^>]*>\([.[:digit:]]*\)<.*/\1/p' input.txt

This extracts the number between the first > character of the line and the following <. In my RE this "number" can be the empty string; if you don't want to accept an empty string, add the -r option to sed and replace \([.[:digit:]]*\) with ([.[:digit:]]+).
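
A minimal sketch of the stricter variant, assuming a sed that supports extended regular expressions through -E (GNU sed also accepts -r for the same purpose). The function name extract_nonempty is hypothetical, and the sample lines are made up for illustration.

```shell
# Print only the number between the first '>' and the following '<'.
# With '+' instead of '*', an element with no digits produces no output line.
extract_nonempty() {
    sed -n -E 's/[^>]*>([.[:digit:]]+)<.*/\1/p' "$@"
}

# The second sample line has an empty value and is skipped.
printf '%s\n' '<ratio>1.5</ratio>' '<empty></empty>' '<limit>50000</limit>' | extract_nonempty
```

Note that the character class also accepts strings such as "1.2.3" or a lone ".", so it is a loose notion of "number".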

Answered by Technext

Using sed:

sed -E 's/.*limit"*>([0-9]+)<.*/\1/' file


Explanation:

.* takes care of everything that comes before the string limit.

limit"* takes care of both lines: the one with limit" and the one with just limit.

([0-9]+) matches numbers, and only numbers, as stated in your requirement.

\1 is a backreference to the first captured group. When a pattern groups all or part of its content in a pair of parentheses, it captures that content and stores it temporarily in memory, and \1 in the replacement inserts it. For more details, please refer to https://www.inkling.com/read/introducing-regular-expressions-michael-fitzgerald-1st/chapter-4/capturing-groups-and
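
The pieces described above can be tried on the two lines from the question. This is a sketch rather than the answer's exact command: it adds -n and the p flag (my assumption, not in the answer) so that lines which do not match are dropped instead of being printed unchanged, and the function name extract_limit is hypothetical.

```shell
# Replace each matching line with its captured digits (\1) and print only those lines.
extract_limit() {
    sed -n -E 's/.*limit"*>([0-9]+)<.*/\1/p' "$@"
}

printf '%s\n' \
    '<first-value system-property="unique.setting.limit">3</first-value>' \
    '<second-value-limit>50000</second-value-limit>' \
    '<unrelated>text</unrelated>' | extract_limit
```

Because the pattern is anchored on the word limit, it only works for elements whose opening tag ends in limit or limit".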

Answered by vks

sed -e 's/[a-zA-Z.<\/>=" -]//g' file

Answered by David C. Rankin

The script solution with parameter expansion:

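#!/bin/bash

# Read the file named by the first argument line by line; the extra test
# handles a final line that lacks a trailing newline.
while IFS= read -r line || test -n "$line" ; do
    value="${line%<*}"              # drop the closing tag (from the last '<' on)
    printf "%s\n" "${value##*\>}"   # drop everything up to and including the last '>'
done <"$1"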

output:

$ ./ltags.sh dat/ltags.txt
3
50000
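
The two parameter expansions can also be looked at in isolation. The sketch below wraps them in a function named extract_value (a hypothetical name, not part of the answer) that takes one line as its argument instead of reading a file.

```shell
# Extract the text between the last '>' that precedes the closing tag and that closing tag.
extract_value() {
    # ${1%<*} removes the shortest suffix starting at a '<', i.e. the closing tag.
    value="${1%<*}"
    # ${value##*>} removes the longest prefix ending in '>', i.e. the opening tag.
    printf '%s\n' "${value##*>}"
}

extract_value '<first-value system-property="unique.setting.limit">3</first-value>'
extract_value '<second-value-limit>50000</second-value-limit>'
```

This avoids spawning sed or awk for each line, at the cost of assuming exactly one element per line.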