Python: How do I convert a pandas DataFrame to XML?

Question

Asked by user7289

Is there a simple way to take a pandas DataFrame (df) like this:

field_1 field_2 field_3 field_4
cat     15,263  2.52    00:03:00
dog     1,652   3.71    00:03:47
test     312    3.27    00:03:41
book     300    3.46    00:02:40

and convert it to XML along the lines of:

<item>
  <field name="field_1">cat</field>
  <field name="field_2">15263</field>
  <field name="field_3">2.52</field>

...

<item>
      <field name="field_1">dog</field>

and so on...

Thanks in advance for any help.


Answer 1

Accepted answer by Viktor Kerkez

You can write a function that builds the item node from a row of your DataFrame:

def func(row):
    # Build one <item> element with a <field> child per column.
    xml = ['<item>']
    for field in row.index:
        xml.append('  <field name="{0}">{1}</field>'.format(field, row[field]))
    xml.append('</item>')
    return '\n'.join(xml)

Then apply the function along axis=1, so that it is called once per row:

>>> print('\n'.join(df.apply(func, axis=1)))
<item>
  <field name="field_1">cat</field>
  <field name="field_2">15,263</field>
  <field name="field_3">2.52</field>
  <field name="field_4">00:03:00</field>
</item>
<item>
  <field name="field_1">dog</field>
  <field name="field_2">1,652</field>
  <field name="field_3">3.71</field>
  <field name="field_4">00:03:47</field>
</item>
...
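
The function above interpolates values straight into the markup, so a cell containing & or <, or a column name containing a quote, would produce invalid XML. Below is a minimal sketch of an escaped variant, assuming the sample table from the question with every column held as a string. The names row_to_item and cleaned are introduced here for illustration only. It also strips the thousands separators from field_2, since the question's target output shows 15263 rather than 15,263.

```python
import pandas as pd
from xml.sax.saxutils import escape, quoteattr

# Sample data from the question, kept as strings (an assumption about dtypes).
df = pd.DataFrame({
    "field_1": ["cat", "dog", "test", "book"],
    "field_2": ["15,263", "1,652", "312", "300"],
    "field_3": ["2.52", "3.71", "3.27", "3.46"],
    "field_4": ["00:03:00", "00:03:47", "00:03:41", "00:02:40"],
})

def row_to_item(row):
    """Build one <item> element, escaping column names and values."""
    xml = ["<item>"]
    for name, value in row.items():
        # quoteattr adds the surrounding quotes and escapes the name;
        # escape handles &, < and > in the text content.
        xml.append("  <field name={0}>{1}</field>".format(
            quoteattr(str(name)), escape(str(value))))
    xml.append("</item>")
    return "\n".join(xml)

# Drop thousands separators so field_2 reads 15263 rather than 15,263.
cleaned = df.assign(field_2=df["field_2"].str.replace(",", "", regex=False))
result = "\n".join(cleaned.apply(row_to_item, axis=1))
print(result)
```

If field_2 should stay formatted for display, skip the assign step and apply row_to_item to df directly.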

Answer 2

Answer by Andy Hayden

To expand on Viktor's excellent answer (tweaking it slightly so it works with duplicate column names), you could set this up as a to_xml DataFrame method:

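import pandas as pd

def to_xml(df, filename=None, mode='w'):
    # Return the XML as a string, or write it to filename if one is given.
    def row_to_xml(row):
        xml = ['<item>']
        # Look values up by position (iloc) rather than by label, so
        # duplicate column names still yield one value per field.
        for i, col_name in enumerate(row.index):
            xml.append('  <field name="{0}">{1}</field>'.format(col_name, row.iloc[i]))
        xml.append('</item>')
        return '\n'.join(xml)
    res = '\n'.join(df.apply(row_to_xml, axis=1))

    if filename is None:
        return res
    with open(filename, mode) as f:
        f.write(res)

# Attach the function as a DataFrame method. Note: pandas 1.3 and later
# already provide a built-in DataFrame.to_xml, which this assignment replaces.
pd.DataFrame.to_xml = to_xml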

Then you can print the XML:

In [21]: print(df.to_xml())
<item>
  <field name="field_1">cat</field>
  <field name="field_2">15,263</field>
  <field name="field_3">2.52</field>
  <field name="field_4">00:03:00</field>
</item>
<item>
...

Or save it to a file:

In [22]: df.to_xml('foo.xml')

This example will of course need tweaking to fit your own XML schema.
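
Worth knowing alongside the hand-rolled method: pandas 1.3 and later include a built-in DataFrame.to_xml. The sketch below assumes such a version is installed and passes parser="etree" so that the standard library is used instead of lxml. Its layout differs from the one requested in the question: each column becomes its own element rather than a field element with a name attribute.

```python
import pandas as pd

df = pd.DataFrame({
    "field_1": ["cat", "dog"],
    "field_2": ["15,263", "1,652"],
})

# root_name and row_name set the wrapper and per-row element names;
# index=False leaves the row index out of the output.
xml_text = df.to_xml(index=False, root_name="items", row_name="item",
                     parser="etree")
print(xml_text)
```

Passing a path as the first argument, as in df.to_xml("foo.xml", ...), writes to a file instead of returning a string.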

Answer 3

Answer by sparkonhdfs

You can use the xml.etree.ElementTree module to generate a readable format in very few lines of code.

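import xml.etree.ElementTree as etree

root = etree.Element('data')

for i, row in dframe.iterrows():
    # Attribute values must be strings, so convert each cell explicitly.
    attrib = {str(k): str(v) for k, v in row.to_dict().items()}
    item = etree.SubElement(root, 'item', attrib=attrib)

etree.dump(root)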

This creates an XML tree (under root) in which each row becomes an item element with one attribute per column. You can also build a more deeply nested tree by creating a subelement for each field.
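
The nested variant mentioned above can be sketched as follows, assuming a small two-column frame standing in for dframe. It produces the field-with-name-attribute layout shown in the question.

```python
import xml.etree.ElementTree as etree
import pandas as pd

# Stand-in data; the column names follow the question's sample table.
dframe = pd.DataFrame({
    "field_1": ["cat", "dog"],
    "field_3": [2.52, 3.71],
})

root = etree.Element("data")
for _, row in dframe.iterrows():
    item = etree.SubElement(root, "item")
    for col_name, value in row.items():
        # One <field> child per column; element text must be a string.
        field = etree.SubElement(item, "field", name=str(col_name))
        field.text = str(value)

# encoding="unicode" returns a str instead of bytes.
xml_text = etree.tostring(root, encoding="unicode")
print(xml_text)
```

ElementTree escapes special characters in text and attribute values when serializing, so no manual escaping is needed here.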

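You can also read an XML file back into Python with the same module. Note that etree.dump only prints the tree, so the file has to be written first, for example with etree.ElementTree(root).write('xml_file.xml'):

import xml.etree.ElementTree

tree = xml.etree.ElementTree.parse('xml_file.xml')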
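
A minimal round-trip sketch using only the standard library, assuming the attribute-per-column layout from this answer. The temporary directory and the records name are illustrative choices, not part of the original answer.

```python
import os
import tempfile
import xml.etree.ElementTree as etree

# Build a small tree in the attribute-per-column layout.
root = etree.Element("data")
etree.SubElement(root, "item", attrib={"field_1": "cat", "field_3": "2.52"})
etree.SubElement(root, "item", attrib={"field_1": "dog", "field_3": "3.71"})

# etree.dump only prints; ElementTree.write puts the tree on disk.
path = os.path.join(tempfile.mkdtemp(), "xml_file.xml")
etree.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)

# parse() returns an ElementTree; each item's attrib behaves like a dict.
tree = etree.parse(path)
records = [dict(item.attrib) for item in tree.getroot().iter("item")]
print(records)
```

Passing records to pd.DataFrame would rebuild a DataFrame, with every value coming back as a string.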
