使用 Python 编辑 XML 文件中的 XML 文本
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/1926512/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Editing the XML texts from a XML file using Python
提问by manoj1123
I have an XML file which contains some data as given.
我有一个 XML 文件,其中包含给定的一些数据。
<?xml version="1.0" encoding="UTF-8" ?>
- <ParameterData>
<CreationInfo date="10/28/2009 03:05:14 PM" user="manoj" />
- <ParameterList count="85">
- <Parameter name="Spec 2 Included" type="boolean" mode="both">
<Value>n/a</Value>
<Result>n/a</Result>
</Parameter>
- <Parameter name="Spec 2 Label" type="string" mode="both">
<Value>n/a</Value>
<Result>n/a</Result>
</Parameter>
- <Parameter name="Spec 3 Included" type="boolean" mode="both">
<Value>n/a</Value>
<Result>n/a</Result>
</Parameter>
- <Parameter name="Spec 3 Label" type="string" mode="both">
<Value>n/a</Value>
<Result>n/a</Result>
</Parameter>
</ParameterList>
</ParameterData>
I have one text file with lines as
我有一个带有行的文本文件
Spec 2 Included : TRUE
Spec 2 Label: 19-Flat2-HS3
Spec 3 Included : FALSE
Spec 3 Label: 4-1-Bead1-HS3
Now I want to edit XML texts; i,e. I want to replace the field (n/a) with the corresponding values from the text file. Like I want the file to looks like
现在我想编辑 XML 文本;IE。我想用文本文件中的相应值替换字段 (n/a)。就像我希望文件看起来像
<?xml version="1.0" encoding="UTF-8" ?>
- <ParameterData>
<CreationInfo date="10/28/2009 03:05:14 PM" user="manoj" />
- <ParameterList count="85">
- <Parameter name="Spec 2 Included" type="boolean" mode="both">
<Value>TRUE</Value>
<Result>TRUE</Result>
</Parameter>
- <Parameter name="Spec 2 Label" type="string" mode="both">
<Value>19-Flat2-HS3</Value>
<Result>19-Flat2-HS3</Result>
</Parameter>
- <Parameter name="Spec 3 Included" type="boolean" mode="both">
<Value>FALSE</Value>
<Result>FALSE</Result>
</Parameter>
- <Parameter name="Spec 3 Label" type="string" mode="both">
<Value>4-1-Bead1-HS3</Value>
<Result>4-1-Bead1-HS3</Result>
</Parameter>
</ParameterList>
</ParameterData>
I am new to this Python-XML coding. I dont have idea about how to edit the text fields in a XML file. I am trying to Use elementtree.ElementTree module. but to read the lines in XML file and extract the attributes I dont know which modules need to be imported.
我是这个 Python-XML 编码的新手。我不知道如何编辑 XML 文件中的文本字段。我正在尝试使用 elementtree.ElementTree 模块。但是要读取 XML 文件中的行并提取属性,我不知道需要导入哪些模块。
Please help.
请帮忙。
Thanks and Regards.
谢谢并恭祝安康。
采纳答案by YOU
You can convert your data text into python dictionary by regular expression
您可以通过正则表达式将数据文本转换为 python 字典
data="""Spec 2 Included : TRUE
Spec 2 Label: 19-Flat2-HS3
Spec 3 Included : FALSE
Spec 3 Label: 4-1-Bead1-HS3"""
#data=open("data.txt").read()
import re
data=dict(re.findall('(Spec \d+ (?:Included|Label))\s*:\s*(\S+)',data))
data
will be as follows
data
将如下
{'Spec 3 Included': 'FALSE', 'Spec 2 Included': 'TRUE', 'Spec 3 Label': '4-1-Bead1-HS3', 'Spec 2 Label': '19-Flat2-HS3'}
Then you can convert it by using any of your favoriate xml parser, I will use minidom here.
然后你可以使用任何你喜欢的 xml 解析器来转换它,我将在这里使用 minidom。
from xml.dom import minidom
dom = minidom.parseString(xml_text)
params=dom.getElementsByTagName("Parameter")
for param in params:
name=param.getAttribute("name")
if name in data:
for item in param.getElementsByTagName("*"): # You may change to "Result" or "Value" only
item.firstChild.replaceWholeText(data[name])
print dom.toxml()
#write to file
open("output.xml","wb").write(dom.toxml())
Results
结果
<?xml version="1.0" ?><ParameterData>
<CreationInfo date="10/28/2009 03:05:14 PM" user="manoj"/>
<ParameterList count="85">
<Parameter mode="both" name="Spec 2 Included" type="boolean">
<Value>TRUE</Value>
<Result>TRUE</Result>
</Parameter>
<Parameter mode="both" name="Spec 2 Label" type="string">
<Value>19-Flat2-HS3</Value>
<Result>19-Flat2-HS3</Result>
</Parameter>
<Parameter mode="both" name="Spec 3 Included" type="boolean">
<Value>FALSE</Value>
<Result>FALSE</Result>
</Parameter>
<Parameter mode="both" name="Spec 3 Label" type="string">
<Value>4-1-Bead1-HS3</Value>
<Result>4-1-Bead1-HS3</Result>
</Parameter>
</ParameterList>
</ParameterData>
回答by Jason Orendorff
Well, you could start with
那么,你可以从
import xml.etree.ElementTree as ET
tree = ET.parse("blah.xml")
Find the elementsyou want to modify.
To replace the contents of an element, just do
要替换元素的内容,只需执行
element.text = "TRUE"
The import statement above works in Python 2.5 or later. If you have an older version of Python you'll need to install ElementTree as an extension, and then the import statement is different: import elementtree.ElementTree as ET
.
上面的 import 语句适用于 Python 2.5 或更高版本。如果您有旧版本的 Python,则需要安装 ElementTree 作为扩展,然后导入语句就不同了:import elementtree.ElementTree as ET
.
回答by wierob
Unfortunately, the XPath supported by ElementTree isn't complete. Since Python 2.6 includes an older version, finding elements by attribute (as stated here) does not work. So Python's own documentationshould be your first stop: xml.etree.ElementTree
不幸的是,ElementTree 支持的 XPath 并不完整。因为Python 2.6包括一个旧版本,发现通过属性的元素(如说这里)不起作用。所以Python 自己的文档应该是你的第一站:xml.etree.ElementTree
import xml.etree.ElementTree as ET
original = ET.parse("original.xml")
parameters = original.findall(".//Parameter")
changes = {}
# read changes
with open("changes.txt", "rb") as in_file:
for change in in_file:
change = change.rstrip() # remove line endings
name, value = change.split(":")
changes[name.strip()] = value.strip() # remove whitespaces
# find paramter element and apply changes
for parameter in parameters:
parameter_name = parameter.get("name")
if changes.has_key(parameter_name):
value = parameter.find("./Value")
value.text = changes[parameter_name]
result = parameter.find("./Result")
result.text = changes[parameter_name]
original.write("new.xml")
回答by Uche Ogbuji
Here is how you could do it using Amara
这是使用Amara 的方法
from amara import bindery
doc = bindery.parse(XML)
def cleanup_for_dict(key, value):
return key.strip(), value.strip()
params = dict(( cleanup_for_dict(*line.split(':', 1))
for line in TEXT.splitlines()))
for param in doc.ParameterData.ParameterList.Parameter:
if param.name in params:
param.Value = params[param.name]
param.Result = params[param.name]
doc.xml_write()