Original source: http://stackoverflow.com/questions/12012352/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Listing the XPath to an XML node from the command line
Asked by Troy Harvey
Given this snippet of a large, deeply nested XML document (bookstore.xml), I want to know the full path to the amazon node. How can I print that path from the command line?
<bookstore>
  <book>
    <title lang="eng">Learning XML</title>
    <price>
      <retail>39.95</retail>
      <discounts>
        <amazon>29.99</amazon>
      </discounts>
      <currency>USD</currency>
    </price>
  </book>
  ...
</bookstore>
Ideally it would look like this:
old-gregg$ magic bookstore.xml amazon
/bookstore/book/price/discounts/amazon
Answered by Troy Harvey
I found XMLStarlet, and it does exactly what I'm looking for here. To install it using Homebrew and list the path:
$ brew update
$ brew install xmlstarlet
$ xml el bookstore.xml | grep amazon
/bookstore/book/price/discounts/amazon
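A small wrapper can make this lookup a little more robust. The sketch below rests on two assumptions that are not in the answer: the executable may be installed as `xmlstarlet` rather than `xml` (common on Linux distributions), and an exact match on the last path step is wanted, since `grep amazon` also matches names that merely contain "amazon". `find_xpath` is a hypothetical helper name; `el -u` is XMLStarlet's option for printing each distinct element path once.

```shell
#!/bin/sh
# find_xpath FILE NAME: print the path of every element called NAME in FILE.
# find_xpath is a hypothetical helper, not part of XMLStarlet.
find_xpath() {
    file=$1 name=$2
    # Assumption: the binary is called either xmlstarlet or xml.
    if command -v xmlstarlet >/dev/null 2>&1; then
        tool=xmlstarlet
    elif command -v xml >/dev/null 2>&1; then
        tool=xml
    else
        echo "XMLStarlet not found" >&2
        return 1
    fi
    # el -u lists each distinct element path once; split on "/" and keep
    # only the paths whose last step is exactly NAME.
    "$tool" el -u "$file" | awk -F/ -v n="$name" '$NF == n'
}
```

Usage would then be `find_xpath bookstore.xml amazon`.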
Answered by Theodros Zelleke
Use xmllint, a command-line tool bundled with libxml2. It is very likely already available on your system.
Based on your example data (with the ellipsis removed), I experimented and got as far as the following:
echo -e "du\nbye\n" | \
xmllint --shell data
which returns
/ > du
/
bookstore
  book
    title
    price
      retail
      discounts
        amazon
      currency
/ > bye
This uses the tool's interactive shell mode. du asks it to print the whole subtree starting from the current node (here, the root).
bye just exits the program.
The next step is to parse this output.
UPDATED: (assuming that the XML is in a file named data)
Note that the node in question is currently hardcoded!
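#!/bin/bash
# Print the path to the (hardcoded) amazon node of the XML file "data".
echo -e "du\nbye\n" | \
xmllint --shell data | \
sed 's/  /: /g' | \
awk '
    BEGIN {depth = 0}
    # Target node found: print the stacked ancestors, then the node itself.
    $NF == "amazon" {
        for(i=1; i<NF; i++) {printf("/%s", STACK[i])}
        print "/" $NF
    }
    # Skip the shell prompt lines and the "/" root line.
    /^\// {next}
    # The field count is the depth: one level deeper, a sibling, or back up.
    NF == depth + 1 {depth = NF; STACK[depth] = $NF; next}
    NF == depth {STACK[depth] = $NF; next}
    NF < depth {depth = NF; STACK[depth] = $NF; next}
    1 {print "something went horribly wrong!"}
'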
gives
/bookstore/book/price/discounts/amazon
To explain this, look at the output after the sed command:
/ > du
/
bookstore
: book
: : title
: : price
: : : retail
: : : discounts
: : : : amazon
: : : currency
/ > bye
sed substitutes every occurrence of [two spaces] with [colon, space].
After that, the depth is simple to detect with awk: the number of fields on a line equals the depth of the node on that line.
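To address the hardcoded node name, the same sed-plus-awk idea can be wrapped in a function that takes the name as an argument. This is a minimal sketch, not the answer's script: `node_paths` is a hypothetical name, the two-spaces-per-level indentation is assumed from the sed explanation above, and the demo feeds a hand-written tree so that it runs without xmllint.

```shell
#!/bin/sh
# node_paths NAME: read an indented element tree on stdin (two spaces per
# level, as the sed step assumes) and print the path of every element
# called NAME. node_paths is a hypothetical helper name.
node_paths() {
    sed 's/  /: /g' |
    awk -v target="$1" '
        /^\//   { next }        # skip shell prompts and the "/" root line
        NF == 0 { next }        # skip blank lines
        {
            stack[NF] = $NF     # field count = depth; remember the name at this depth
            if ($NF == target) {
                path = ""
                for (i = 1; i <= NF; i++) path = path "/" stack[i]
                print path
            }
        }
    '
}

# Demo on a hand-written tree; with xmllint installed, the input would
# instead come from:  printf 'du\nbye\n' | xmllint --shell data
printf '%s\n' '/ > du' '/' 'bookstore' '  book' '    title' '    price' \
    '      retail' '      discounts' '        amazon' '      currency' '/ > bye' |
    node_paths amazon
# prints /bookstore/book/price/discounts/amazon
```

Because the stack entry for a given depth is overwritten whenever a new node appears at that depth, no separate depth counter is needed, and every matching node gets its own line of output.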
Answered by BeniBela
In XPath 2.0 you can use //amazon to select the element, /ancestor-or-self::*/node-name(.) to get the names of the element and its ancestors, and string-join(..., "/") to build a path from them.
So finally the XPath 2.0 expression
string-join(("",//amazon/ancestor-or-self::*/node-name(.)),"/")
will return exactly the path you want (although it won't add [] attribute tests, in case you need those too).
I don't know if there is any other XPath 2.0 command-line tool, but I made my own a few days ago. If you happen to have fpc (the Free Pascal compiler), you can download the source and compile it. (Edit: binaries are now available as well, linked from http://videlibri.sourceforge.net/xidel.html.) With it, you can just run:
xidel /tmp/so2.xml --extract 'string-join(("",//amazon/ancestor-or-self::*/node-name(.)),"/")'
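If the element name should not be typed into the expression by hand each time, the command can be wrapped in a small function. This is only a sketch: `xidel_path` is a hypothetical helper name, it reuses the exact expression and the `--extract` option shown above, and it assumes the name is a plain element name containing no quotes or XPath syntax.

```shell
#!/bin/sh
# xidel_path FILE NAME: print the path to the element called NAME in FILE,
# using the XPath 2.0 expression from the answer above.
# xidel_path is a hypothetical helper; NAME is assumed to be a plain
# element name (no quotes or XPath syntax).
xidel_path() {
    file=$1 name=$2
    # Build the expression with NAME substituted for the element test.
    query="string-join((\"\",//${name}/ancestor-or-self::*/node-name(.)),\"/\")"
    xidel "$file" --extract "$query"
}
```

Usage would then be `xidel_path /tmp/so2.xml amazon`.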
I also made a CGI service you could try:
wget -qO - 'http://videlibri.sourceforge.net/cgi-bin/xidelcgi?extract=string-join(("",//amazon/ancestor-or-self::*/node-name(.)),"/")&data=<bookstore><book> <title lang="eng">Learning XML</title> <price> <retail>39.95</retail> <discounts> <amazon>29.99</amazon> </discounts> <currency>USD</currency> </price></book></bookstore>'

