如何使用 bash 脚本编辑 XML?

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/6873070/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-06 14:52:05  来源:igfitidea点击:

how to edit XML using bash script?

xmlparsingbashsh

提问by Roman

<root>
<tag>1</tag>
<tag1>2</tag1>
</root>

Need to change values 1 and 2 from bash

需要从 bash 更改值 1 和 2

回答by Charles Duffy

To change tag's value to 2and tag1's value to 3, using XMLStarlet:

要将tag的值更改为2tag1的值更改为3,请使用XMLStarlet

xmlstarlet ed \
  -u '/root/tag' -v 2 \
  -u '/root/tag1' -v 3 \
  <old.xml >new.xml


Using your sample input:

使用您的样本输入:

xmlstarlet ed \
  -u '/root/tag' -v 2 \
  -u '/root/tag1' -v 3 \
  <<<'<root><tag>1</tag><tag1>2</tag1></root>'

...emits as output:

...作为输出发出:

<?xml version="1.0"?>
<root>
  <tag>2</tag>
  <tag1>3</tag1>
</root>

回答by Mithfindel

You can use the xsltproccommand (from package xsltprocon Debian-based distros) with the following XSLT sheet:

您可以使用带有以下 XSLT 表的xsltproc命令(来自xsltproc基于 Debian 的发行版上的包):

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:param name="tagReplacement"/>
  <xsl:param name="tag1Replacement"/>

  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>

  </xsl:template>
  <xsl:template match="tag">
    <xsl:copy>
      <xsl:value-of select="$tagReplacement"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="tag1">
    <xsl:copy>
      <xsl:value-of select="$tag1Replacement"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>

Then use the command:

然后使用命令:

xsltproc --stringparam tagReplacement polop \
         --stringparam tag1Replacement palap \
         transform.xsl input.xml

Or you could also use regexes, but modifying XML through regexes is pure evil :)

或者你也可以使用正则表达式,但通过正则表达式修改 XML 是完全邪恶的 :)

回答by stringy05

my $0.02 in python because its on every server you will ever log in to

我在 python 中 0.02 美元,因为它在你登录的每台服务器上

import sys, xml.etree.ElementTree as ET

data = ""
for line in sys.stdin:
    data += line

tree = ET.fromstring(data)

nodeA = tree.find('.//tag')
nodeB = tree.find('.//tag1')

tmp = nodeA.text
nodeA.text = nodeB.text
nodeB.text = tmp 

print ET.tostring(tree)

this reads from stdin so you can use it like this:

这从 stdin 读取,因此您可以像这样使用它:

$ echo '<node><tag1>hi!</tag1><tag>this</tag></node>' | python xml_process.py 
<node><tag1>this</tag1><tag>hi!</tag></node>

EDIT - challenge accepted

编辑 - 接受挑战

Here's a working xmllib implementation (should work back to python 1.6). As I thought it would be more fun to stab my eyes with a fork. The only think I will say about this is it works for the given use case.

这是一个有效的 xmllib 实现(应该可以返回到 python 1.6)。因为我认为用叉子刺我的眼睛会更有趣。我唯一想说的是它适用于给定的用例。

import sys, xmllib

class Bag:
    pass

class NodeSwapper(xmllib.XMLParser):
    def __init__(self):
    print 'making a NodeSwapper'
    xmllib.XMLParser.__init__(self)
    self.result = ''
    self.data_tags = {}
    self.current_tag = ''
    self.finished = False

    def handle_data(self, data):
    print 'data: ' + data

    self.data_tags[self.current_tag] = data
    if self.finished:
       return

    if 'tag1' in self.data_tags.keys() and 'tag' in self.data_tags.keys():
        b = Bag()
        b.tag1 = self.data_tags['tag1']
        b.tag = self.data_tags['tag']
        b.t1_start_idx = self.rawdata.find(b.tag1)
        b.t1_end_idx = len(b.tag1) + b.t1_start_idx
        b.t_start_idx = self.rawdata.find(b.tag)
        b.t_end_idx = len(b.tag) +  b.t_start_idx 
        # swap
        if b.t1_start_idx < b.t_start_idx:
           self.result = self.rawdata[:b.t_start_idx] + b.tag + self.rawdata[b.t_end_idx:]
           self.result = self.result[:b.t1_start_idx] + b.tag1 + self.result[b.t1_end_idx:]
        else:
           self.result = self.rawdata[:b.t1_start_idx] + b.tag1 + self.rawdata[t1_end_idx:]
           self.result = self.result[:b.t_start_idx] + b.tag + self.rresult[t_end_idx:]
        self.finished = True

    def unknown_starttag(self, tag, attrs):
    print 'starttag is: ' + tag
    self.current_tag = tag

data = ""
for line in sys.stdin:
    data += line

print 'data is: ' + data

parser = NodeSwapper()
parser.feed(data)
print parser.result
parser.close()

回答by tripleee

Since you give a sedexample in one of the comments, I imagine you want a pure bash solution?

既然您sed在其中一条评论中举了一个例子,我想您想要一个纯 bash 解决方案?

while read input; do
  for field in tag tag1; do
    case $input in
      *"<$field>"*"</$field>"* )
        pre=${input#*"<$field>"}
        suf=${input%"</$field>"*}
        # Where are we supposed to be getting the replacement text from?
        input="${input%$pre}SOMETHING${input#$suf}"
        ;;
    esac
  done
  echo "$input"
done

This is completely unintelligent, and obviously only works on well-formed input with the start tag and the end tag on the same line, you can't have multiple instances of the same tag on the same line, the list of tags to substitute is hard-coded, etc.

这是完全不智能的,显然只适用于在同一行上具有开始标记和结束标记的格式良好的输入,同一行上不能有多个相同标记的实例,要替换的标记列表是硬编码等

I cannot imagine a situation where this would be actually useful, and preferable to either a script or a proper XML approach.

我无法想象这种情况会实际有用,并且比脚本或适当的 XML 方法更可取。