Merge multiple XML files from the command line

Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/9004135/

Date: 2020-09-06 15:07:51  Source: igfitidea

Tags: xml, command-line, merge

Asked by TutanRamon

I have several XML files. They all have the same structure, but were split because of file size. So, let's say I have A.xml, B.xml, C.xml and D.xml and want to merge them into combined.xml using a command-line tool.

A.xml

<products>
    <product id="1234"></product>
    ...
</products>

B.xml

<products>
  <product id="5678"></product>
  ...
</products>

etc.

Accepted answer by berk

xml_grep

http://search.cpan.org/dist/XML-Twig/tools/xml_grep/xml_grep

xml_grep --pretty_print indented --wrap products --descr '' --cond "product" *.xml > combined.xml

  • --wrap : wraps the XML result in the given tag (here: products)
  • --cond : the XML subtree to grep for (here: product)
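
For readers who do not have xml_grep installed, the combined effect of --cond and --wrap can be approximated in Python. This is only a rough sketch and not how xml_grep itself works: the function name grep_and_wrap is hypothetical, it matches elements purely by tag name rather than the richer conditions xml_grep accepts, and it assumes matching elements are not nested inside one another.

```python
import xml.etree.ElementTree as ET

def grep_and_wrap(xml_strings, cond="product", wrap="products"):
    """Collect every element named `cond` and place them under a new `wrap` root."""
    root = ET.Element(wrap)
    for text in xml_strings:
        doc = ET.fromstring(text)
        # iter() walks the whole tree (root included) and yields matching elements.
        root.extend(list(doc.iter(cond)))
    return root

a = '<products><product id="1234"></product></products>'
b = '<products><product id="5678"></product></products>'
merged = grep_and_wrap([a, b])
print([p.get("id") for p in merged])  # ['1234', '5678']
```

The sketch works on strings to stay self-contained; reading the strings from A.xml, B.xml and so on is left to the caller.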

Answer by eswald

High-tech answer:

Save this Python script as xmlcombine.py:

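#!/usr/bin/env python3
import sys
from xml.etree import ElementTree

def run(files):
    first = None
    for filename in files:
        # Parse each file and take its root element (e.g. <products>).
        data = ElementTree.parse(filename).getroot()
        if first is None:
            # The first file's root becomes the root of the combined document.
            first = data
        else:
            # Append the children of every later root to the first root.
            first.extend(data)
    if first is not None:
        # encoding="unicode" returns a str, so print() emits plain XML text.
        print(ElementTree.tostring(first, encoding="unicode"))

if __name__ == "__main__":
    run(sys.argv[1:])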

To combine files, run:

python xmlcombine.py ?.xml > combined.xml

For further enhancement, consider using:

  • chmod +x xmlcombine.py: Allows you to omit python in the command line

  • xmlcombine.py !(combined).xml > combined.xml: Collects all XML files except the output, but requires bash's extglob option

  • xmlcombine.py *.xml | sponge combined.xml: Collects everything in combined.xml as well, but requires the sponge program

  • import lxml.etree as ElementTree: Uses a potentially faster XML parser

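
If neither extglob nor sponge is available, the same "skip the output file" behaviour could be moved into Python. The following is a sketch under assumptions, not part of the original script: the function name combine_in_dir and its parameters are made up for illustration, and it assumes all inputs fit in memory. Because every input is parsed before the output is opened, a stale combined.xml is never read or truncated midway.

```python
import glob
import os
import xml.etree.ElementTree as ET

def combine_in_dir(directory, output_name="combined.xml"):
    """Merge every *.xml file in `directory` except the output file itself."""
    output_path = os.path.join(directory, output_name)
    merged = None
    for path in sorted(glob.glob(os.path.join(directory, "*.xml"))):
        if os.path.abspath(path) == os.path.abspath(output_path):
            continue  # skip an existing combined.xml, like !(combined).xml does
        root = ET.parse(path).getroot()
        if merged is None:
            merged = root
        else:
            merged.extend(list(root))
    if merged is not None:
        # All inputs are fully parsed before the output file is written.
        ET.ElementTree(merged).write(output_path, encoding="utf-8", xml_declaration=True)
    return output_path
```

Calling combine_in_dir(".") would then play the role of the shell redirection in the commands above.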

Answer by eswald

Low-tech simple answer:

echo '<products>' > combined.xml
grep -vh '</\?products>\|<?xml' *.xml >> combined.xml
echo '</products>' >> combined.xml

Limitations:

  • The opening and closing outer tags each need to be on their own line.
  • The files all need to have the same outer tags.
  • The outer tags must not have attributes.
  • The files must not have inner tags that match the outer tags.
  • Any current contents of combined.xml will be wiped out instead of being included.

Each of these limitations can be worked around, but not all of them easily.
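
One way around most of these limitations is to let a real XML parser do the work instead of grep. The sketch below is only an illustration under assumptions: merge_documents is a hypothetical name, the whole input is assumed to fit in memory, and a mismatched outer tag is reported as an error rather than silently merged. The sample inputs deliberately put tags on one line, give the outer tag an attribute, and nest a tag that matches the outer tag.

```python
import xml.etree.ElementTree as ET

def merge_documents(xml_strings):
    """Merge documents that share a root tag; raise ValueError otherwise."""
    merged = None
    for text in xml_strings:
        root = ET.fromstring(text)
        if merged is None:
            # Keep the first root, including any attributes it carries.
            merged = root
        elif root.tag != merged.tag:
            raise ValueError(f"root <{root.tag}> does not match <{merged.tag}>")
        else:
            merged.extend(list(root))
    return merged

docs = [
    # Outer tag has an attribute and shares a line with its child.
    '<products source="a"><product id="1234"/></products>',
    # An inner tag has the same name as the outer tag.
    '<products><product id="5678"><products/></product></products>',
]
result = merge_documents(docs)
print(ET.tostring(result, encoding="unicode"))
```

Only the attributes of the first outer tag survive in this sketch; merging attributes from later files would need a policy of its own.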
