Grep and Sed Equivalent for XML Command Line Processing
Source: http://stackoverflow.com/questions/91791/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow (not to this page).
Asked by Joseph Holsten
When doing shell scripting, data will typically be in files of single-line records, like CSV. It's really simple to handle this data with grep and sed. But I often have to deal with XML, so I'd really like a way to script access to that XML data from the command line. What are the best tools?
Accepted answer by Russ
I've found xmlstarlet to be pretty good at this sort of thing.
http://xmlstar.sourceforge.net/
It should be available in most distro repositories, too. An introductory tutorial is here:
Answer by Joseph Holsten
Some promising tools:
nokogiri: parsing HTML/XML DOMs in Ruby using XPath and CSS selectors
hpricot: deprecated
fxgrep: Uses its own XPath-like syntax to query documents. Written in SML, so installation may be difficult.
LT XML: XML toolkit derived from SGML tools, including sggrep, sgsort, xmlnorm and others. Uses its own query syntax. The documentation is very formal. Written in C. LT XML 2 claims support of XPath, XInclude and other W3C standards.
xmlgrep2: simple and powerful searching with XPath. Written in Perl using XML::LibXML and libxml2.
XQSharp: Supports XQuery, the extension to XPath. Written for the .NET Framework.
xml-coreutils: Laird Breyer's toolkit equivalent to GNU coreutils. Discussed in an interesting essay on what the ideal toolkit should include.
xmldiff: Simple tool for comparing two XML files.
xmltk: doesn't seem to have a package in Debian, Ubuntu, Fedora, or MacPorts, hasn't had a release since 2007, and uses non-portable build automation.
xml-coreutils seems the best documented and most UNIX-oriented.
Answer by bortzmeyer
To Joseph Holsten's excellent list, I add the xpath command-line script that comes with the Perl library XML::XPath. It is a great way to extract information from XML files:
xpath -q -e '/entry[@xml:lang="fr"]' *xml
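For readers without the Perl script at hand, the same kind of attribute filter can be sketched with Python's standard library. This is only an illustration of the idea, not what the xpath script does internally. The sample document and the name `entries_in_language` are assumptions made up for the example, and unlike the XPath expression above it matches `entry` elements at any depth rather than only at the root.

```python
import xml.etree.ElementTree as ET

# The xml: prefix is always bound to this namespace, and ElementTree
# exposes xml:lang under the expanded attribute name below.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical sample data; the original answer does not show its input files.
SAMPLE = """<entries>
  <entry xml:lang="fr">bonjour</entry>
  <entry xml:lang="en">hello</entry>
  <entry xml:lang="fr">salut</entry>
</entries>"""


def entries_in_language(xml_text, lang):
    """Return the text of every <entry> whose xml:lang equals lang."""
    root = ET.fromstring(xml_text)
    return [e.text for e in root.iter("entry") if e.get(XML_LANG) == lang]


if __name__ == "__main__":
    print(entries_in_language(SAMPLE, "fr"))
```

Filtering on the attribute in plain Python avoids relying on how much XPath a given library supports.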
Answer by Vi.
There is also the xml2 and 2xml pair. It allows the usual string editing tools to process XML.
Example. q.xml:
<?xml version="1.0"?>
<foo>
text
more text
<textnode>ddd</textnode><textnode a="bv">dsss</textnode>
<![CDATA[ asfdasdsa <foo> sdfsdfdsf <bar> ]]>
</foo>
xml2 < q.xml
/foo=
/foo= text
/foo= more text
/foo=
/foo/textnode=ddd
/foo/textnode
/foo/textnode/@a=bv
/foo/textnode=dsss
/foo=
/foo= asfdasdsa <foo> sdfsdfdsf <bar>
/foo=
xml2 < q.xml | grep textnode | sed 's!/foo!/bar/baz!' | 2xml
<bar><baz><textnode>ddd</textnode><textnode a="bv">dsss</textnode></baz></bar>
P.S. There are also html2 / 2html.
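To see why this flattening makes grep and sed usable, here is a rough Python sketch of the same idea: each attribute and text node becomes one `path=value` line. It is a simplified illustration, not a reimplementation of xml2. It drops whitespace-only text and does not emit the bare path line that xml2 uses to separate sibling elements, so its output is not meant to be piped into 2xml. The function name `flatten` is an assumption.

```python
import xml.etree.ElementTree as ET


def flatten(element, prefix=""):
    """Yield one 'path=value' line per attribute and per non-empty text node."""
    path = f"{prefix}/{element.tag}"
    for name, value in element.attrib.items():
        yield f"{path}/@{name}={value}"
    text = (element.text or "").strip()
    if text:
        yield f"{path}={text}"
    for child in element:
        yield from flatten(child, path)
        tail = (child.tail or "").strip()
        if tail:
            # Text that follows a child still belongs to the parent element.
            yield f"{path}={tail}"


if __name__ == "__main__":
    doc = ET.fromstring(
        '<foo><textnode>ddd</textnode><textnode a="bv">dsss</textnode></foo>'
    )
    for line in flatten(doc):
        print(line)
```

Once every value sits on its own line with its full path, a filter such as `[l for l in flatten(doc) if "textnode" in l]` plays the role of the grep step in the pipeline above.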
Answer by Dave Jarvis
You can use xmllint:
xmllint --xpath //title books.xml
It should be bundled with most distros, and is also bundled with Cygwin.
$ xmllint --version
xmllint: using libxml version 20900
See:
$ xmllint
Usage : xmllint [options] XMLfiles ...
Parse the XML files and output the result of the parsing
--version : display the version of the XML library used
--debug : dump a debug tree of the in-memory document
...
--schematron schema : do validation against a schematron
--sax1: use the old SAX1 interfaces for processing
--sax: do not build a tree but work just at the SAX level
--oldxml10: use XML-1.0 parsing rules before the 5th edition
--xpath expr: evaluate the XPath expression, inply --noout
Answer by Clay
If you're looking for a solution on Windows, PowerShell has built-in functionality for reading and writing XML.
test.xml:
<root>
<one>I like applesauce</one>
<two>You sure bet I do!</two>
</root>
PowerShell script:
# load XML file into local variable and cast as XML type.
$doc = [xml](Get-Content ./test.xml)
$doc.root.one #echoes "I like applesauce"
$doc.root.one = "Who doesn't like applesauce?" #replace inner text of <one> node
# create new node...
$newNode = $doc.CreateElement("three")
$newNode.set_InnerText("And don't you forget it!")
# ...and position it in the hierarchy
$doc.root.AppendChild($newNode)
# write results to disk
$doc.save("./testNew.xml")
testNew.xml:
<root>
<one>Who doesn't like applesauce?</one>
<two>You sure bet I do!</two>
<three>And don't you forget it!</three>
</root>
Source: https://serverfault.com/questions/26976/update-xml-from-the-command-line-windows
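For comparison outside Windows, the same three edits can be sketched with Python's standard library. This is a rough equivalent under stated assumptions, not a translation of the PowerShell object model. The function name `update` is made up, and the input is held in a string here rather than read from test.xml.

```python
import xml.etree.ElementTree as ET

# Same content as test.xml above, kept inline so the sketch is self-contained.
SOURCE = """<root>
<one>I like applesauce</one>
<two>You sure bet I do!</two>
</root>"""


def update(xml_text):
    """Apply the same edits as the PowerShell script and return the new XML."""
    root = ET.fromstring(xml_text)
    # Replace the inner text of the <one> node.
    root.find("one").text = "Who doesn't like applesauce?"
    # Create <three> and append it as the last child of <root>.
    three = ET.SubElement(root, "three")
    three.text = "And don't you forget it!"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(update(SOURCE))
```

To write the result to disk instead of printing it, `ET.ElementTree(root).write("testNew.xml")` would be the usual call.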
Answer by taggo
There are also xmlsed and xmlgrep from the NetBSD xmltools!
Answer by Adrian Mouat
Answer by Gilles Quenot
There's also saxon-lint for the command line, with the ability to use XPath 3.0/XQuery 3.0. (Other command-line tools use XPath 1.0.)
Examples:
http/html:
$ saxon-lint --html --xpath 'count(//a)' http://stackoverflow.com/q/91791
328
xml:
$ saxon-lint --xpath '//a[@class="x"]' file.xml
Answer by Devy
D. Bohdan maintains an open source GitHub repo that keeps a list of command-line tools for structured text, with a section for XML/HTML tools:

