Simplest way to do basic XML parsing from the Unix command line
Original question: http://stackoverflow.com/questions/9200462/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow (not to me).
Asked by jonderry
I'm searching for xml files that have certain properties. For example, files that contain the following pattern:
<param-value>
<name>Hosts</name>
<description>some description</description>
<value></value>
</param-value>
For such files, I'd like to parse the value of another tag, such as:
<param-value>
<name>Roles</name>
<description>some description</description>
<value>asdf</value>
</param-value>
And print out the file name along with "asdf". What's the simplest way to accomplish this from the command line?
One approach I was thinking of was just using grep with the -l option to filter the matching files out, and then using xargs grep to extract the value of Roles. However, grep doesn't work well with multi-line regexes. I saw another question that showed it could be done with the -Pzo options, but didn't have any luck getting it to work in my case. Is there a simpler approach?
Answered by Mark O'Connor
The following Linux command uses XPath to access specified values within each XML file:
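# Read one path per line instead of word-splitting the output of find
find . -name "*.xml" | while IFS= read -r xml
do
  # Print the file name and the extracted value; awk keeps only lines with more than one field
  echo "$xml" "$(xmllint --xpath "/param-value/value/text()" "$xml")" | awk 'NF>1'
done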
Example output for matching XML files:
./test1.xml asdf
./test4.xml 1234
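The XPath above selects every value under a root param-value element, so it does not by itself tell Hosts apart from Roles. Here is a minimal sketch of one way to add that condition with XPath predicates. It assumes the param-value elements may sit anywhere in the document and that the installed xmllint supports --xpath. The function name print_roles is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print "<file> <Roles value>" for every XML file
# given as an argument that also contains a param-value named Hosts.
print_roles() {
    for xml in "$@"; do
        # boolean(...) makes xmllint print "true" or "false"
        has_hosts=$(xmllint --xpath 'boolean(//param-value[name="Hosts"])' "$xml" 2>/dev/null)
        [ "$has_hosts" = "true" ] || continue

        # string(...) yields the text of the first matching value element
        role=$(xmllint --xpath 'string(//param-value[name="Roles"]/value)' "$xml" 2>/dev/null)
        if [ -n "$role" ]; then
            printf '%s %s\n' "$xml" "$role"
        fi
    done
}
```

It could be called as print_roles ./*.xml, or fed from find with -exec if the files are nested in subdirectories. Files that fail to parse are silently skipped because stderr is discarded.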
Answered by jonderry
I worked out a couple of solutions using basic perl/awk functionality (basically a poor man's parsing of the tags). If you see any improvements using only basic perl/awk functionality, let me know. I avoided dealing with multiline regular expressions by setting a flag when I see a particular tag. Kind of clumsy, but it works.
perl:
perl -ne '$h = 1 if m/Host/; $r = 1 if m/Role/; if ($h && m/<value>/) { $h = 0; print "hosts: ", $_ =~ /<value>(.*)</, "\n"}; if ($r && m/<value>/) { $r = 0; print "\nrole: ", $_ =~ /<value>(.*)</, "\n" }'
awk:
awk '/Host/ {h = 1} /Role/ {r = 1} h && /<value>/ {h = 0; match($0, "<value>(.*)<", a); print "hosts: " a[1]} r && /<value>/ {r = 0; match($0, "<value>(.*)<", a); print "\nrole: " a[1]}'
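The same flag-based idea can be pushed a little further so that it prints the file name next to the Roles value, and only for files that also contain a Hosts entry, as the question asks. This is a rough sketch, not a real XML parser. It assumes each tag sits on its own line as in the samples, and it sticks to POSIX awk features (sub instead of the three-argument match). The function name roles_for_hosts_files is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: for each file argument, print "<file> <Roles value>"
# when the file contains both a Hosts entry and a non-empty Roles value.
roles_for_hosts_files() {
    for f in "$@"; do
        awk '
            /<name>Hosts<\/name>/ { has_hosts = 1 }   # remember that Hosts was seen
            /<name>Roles<\/name>/ { in_roles = 1 }    # the next <value> belongs to Roles
            in_roles && /<value>/ {
                in_roles = 0
                role = $0
                sub(/.*<value>/, "", role)     # drop everything up to the opening tag
                sub(/<\/value>.*/, "", role)   # drop the closing tag and anything after it
            }
            END { if (has_hosts && role != "") print FILENAME, role }
        ' "$f"
    done
}
```

Calling roles_for_hosts_files ./*.xml prints one line per matching file. Running awk once per file keeps the flags from leaking between files, at the cost of one process per file.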

