java 哪种语言最容易、最快地处理 XML 内容?

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/301493/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-10-29 11:50:47  来源:igfitidea点击:

Which language is easiest and fastest to work with XML content?

java.netpythonxmlruby

提问by Varun Mahajan

We have developers with knowledge of these languages - Ruby , Python, .Net or Java. We are developing an application which will mainly handle XML documents. Most of the work is to convert predefined XML files into database tables, providing mapping between XML documents through database, creating reports from database etc. Which language will be the easiest and fastest to work with? (It is a web-app)

我们拥有了解这些语言的开发人员 - Ruby、Python、.Net 或 Java。我们正在开发一个主要处理 XML 文档的应用程序。大部分工作是将预定义的 XML 文件转换为数据库表,通过数据库提供 XML 文档之间的映射,从数据库创建报告等。使用哪种语言最容易和最快? (这是一个网络应用程序)

采纳答案by S.Lott

A dynamic language rules for this. Why? The mappings are easy to code and change. You don't have to recompile and rebuild.

动态语言规则为此。为什么?映射易于编码和更改。您不必重新编译和重建。

Indeed, with a little cleverness, you can have your "XML XPATH to a Tag -> DB table-field" mappings as disjoint blocks of Python code that your main application imports.

实际上,稍微聪明一点,您就可以将“XML XPATH 到标记 -> DB 表字段”映射作为主要应用程序导入的 Python 代码的不相交块。

The block of Python code isyour configuration file. It's not an .inifile or a .propertiesfile that describes a configuration. It isthe configuration.

Python 代码块您的配置文件。它不是描述配置的.ini文件或.properties文件。它配置。

We use Python, xml.etree and the SQLAlchemy (to separate the SQL out of your programs) for this because we're up and running with very little effort and a great deal of flexibility.

为此,我们使用 Python、xml.etree 和 SQLAlchemy(将 SQL 从您的程序中分离出来),因为我们以很少的努力和极大的灵活性启动并运行。



source.py

源文件

"""A particular XML parser.  Formats change, so sometimes this changes, too."""

import xml.etree.ElementTree as xml

class SSXML_Source( object ):
    ns0= "urn:schemas-microsoft-com:office:spreadsheet"
    ns1= "urn:schemas-microsoft-com:office:excel"
    def __init__( self, aFileName, *sheets ):
        """Initialize a XML source.
        XXX - Create better sheet filtering here, in the constructor.
        @param aFileName: the file name.
        """
        super( SSXML_Source, self ).__init__( aFileName )
        self.log= logging.getLogger( "source.PCIX_XLS" )
        self.dom= etree.parse( aFileName ).getroot()
    def sheets( self ):
        for wb in self.dom.getiterator("{%s}Workbook" % ( self.ns0, ) ):
            for ws in wb.getiterator( "{%s}Worksheet" % ( self.ns0, ) ):
                yield ws
    def rows( self ):
        for s in self.sheets():
            print s.attrib["{%s}Name" % ( self.ns0, ) ]
            for t in s.getiterator( "{%s}Table" % ( self.ns0, ) ):
                for r in t.getiterator( "{%s}Row" % ( self.ns0, ) ):
                    # The XML may not be really useful.
                    # In some cases, you may have to convert to something useful
                    yield r

model.py

模型.py

"""This is your target object.  
It's part of the problem domain; it rarely changes.
"""
class MyTargetObject( object ):
    def __init__( self ):
        self.someAttr= ""
        self.anotherAttr= ""
        self.this= 0
        self.that= 3.14159
    def aMethod( self ):
        """etc."""
        pass

builder_today.pyOne of many mapping configurations

builder_today.py众多映射配置之一

"""One of many builders.  This changes all the time to fit
specific needs and situations.  The goal is to keep this
short and to-the-point so that it has the mapping and nothing
but the mapping.
"""

import model

class MyTargetBuilder( object ):
    def makeFromXML( self, element ):
        result= model.MyTargetObject()
        result.someAttr= element.findtext( "Some" )
        result.anotherAttr= element.findtext( "Another" )
        result.this= int( element.findtext( "This" ) )
        result.that= float( element.findtext( "that" ) )
        return result

loader.py

加载器.py

"""An application that maps from XML to the domain object
using a configurable "builder".
"""
import model
import source
import builder_1
import builder_2
import builder_today

# Configure this:  pick a builder is appropriate for the data:
b= builder_today.MyTargetBuilder()

s= source.SSXML_Source( sys.argv[1] )
for r in s.rows():
    data= b.makeFromXML( r )
    # ... persist data with a DB save or file write


To make changes, you can correct a builder or create a new builder. You adjust the loader source to identify which builder will be used. You can, without too much trouble, make the selection of builder a command-line parameter. Dynamic imports in dynamic languages seem like overkill to me, but they are handy.

要进行更改,您可以更正构建器或创建新构建器。您调整加载程序源以确定将使用哪个构建器。您可以轻松地将 builder 的选择设为命令行参数。动态语言中的动态导入对我来说似乎有点矫枉过正,但它们很方便。

回答by AJ.

XSLT

XSLT

I suggest using XSLT templatesto transform the XML into INSERT statements (or whatever you need), as required.
You should be able to invoke XSLT from any of the languages you mention.

我建议根据需要使用XSLT 模板将 XML 转换为 INSERT 语句(或任何您需要的语句)。
您应该能够从您提到的任何语言调用 XSLT。

This will result in a lot less code than doing it the long way round.

这将导致比长期这样做少得多的代码。

回答by Chris Fulstow

In .NET, C# 3.0 and VB9 provide excellent support for working with XML using LINQ to XML:

在 .NET 中,C# 3.0 和 VB9 为使用 LINQ to XML 处理 XML 提供了出色的支持:

LINQ to XML Overview

LINQ to XML 概述

回答by mwilliams

I'll toss in a suggestion for Hpricot, a popular Ruby XML parser (although there are many similar options).

我将提出对Hpricot的建议,是一种流行的 Ruby XML 解析器(尽管有许多类似的选项)。

Example:

例子:

Given the following XML:

给定以下 XML:

<Export>
  <Product>
    <SKU>403276</SKU>
    <ItemName>Trivet</ItemName>
    <CollectionNo>0</CollectionNo>
    <Pages>0</Pages>
  </Product>
</Export>

You parse simply by:

您只需通过以下方式解析:

FIELDS = %w[SKU ItemName CollectionNo Pages]

doc = Hpricot.parse(File.read("my.xml")) 
(doc/:product).each do |xml_product|
  product = Product.new
  for field in FIELDS
    product[field] = (xml_product/field.intern).first.innerHTML
  end
  product.save
end

It sounds like your application would be very fit for a Railsapplication, You could quickly prototype what you need, you've got direct interaction with your database of choice and you can output the data however you need to.

听起来您的应用程序非常适合Rails应用程序,您可以快速制作您需要的原型,您可以直接与您选择的数据库交互,您可以根据需要输出数据。

Here's another great resource page for parsing XML with Hpricotthat might help as well as the documentation.

这是使用 Hpricot 解析 XML的另一个很棒的资源页面,它可能与文档一样有帮助。

回答by rich

For quick turnaround I've found Groovyvery useful.

为了快速周转,我发现Groovy非常有用。

回答by Marc Seeger

An interesting solution could be Ruby. Simply use XML->Object mappers and then use an object-relational-mapper (ORM) to put it inside a database. I had to do a short talk on XML Mapping with ruby, you could look at the slides and see what you like best: http://www.marc-seeger.de/2008/11/25/ruby-xml-mapping/

一个有趣的解决方案可能是 Ruby。只需使用 XML-> 对象映射器,然后使用对象关系映射器 (ORM) 将其放入数据库中。我不得不用 ruby​​ 做一个关于 XML 映射的简短演讲,你可以看看幻灯片,看看你最喜欢什么:http: //www.marc-seeger.de/2008/11/25/ruby-xml-mapping/

As for the ORM: Active Record or Datamapper should be the way to go

至于 ORM:Active Record 或 Datamapper 应该是要走的路

回答by Marc Seeger

ECMAScript handles XML pretty nicely using E4X ("ECMAScript for XML"). This can be seen in Adobe's latest version of ActionScript, version 3. I believe JavaScript 2 (to be released with Firefox 4, I think) will support E4X as well.

ECMAScript 使用 E4X(“ECMAScript for XML”)很好地处理 XML。这可以在 Adob​​e 最新版本的 ActionScript 版本 3 中看到。我相信 JavaScript 2(我认为将与 Firefox 4 一起发布)也将支持 E4X。

Not sure about the support of the standalone JavaScript interpreters (i.e. Rhino, et al) of this, which is what matters most to you I suppose... But if it looks good to you, you can always look up their support for it (and report back to us :-)).

不确定独立 JavaScript 解释器(即 Rhino 等)的支持,我想这对你来说最重要......并向我们​​报告:-))。

See http://en.wikipedia.org/wiki/E4X#Examplefor a simple example.

有关简单示例,请参阅http://en.wikipedia.org/wiki/E4X#Example

回答by StevenMcD

either C# or VB.Net using LiNQ to XML. LiNQ to XML is very very powerful and easy to implement

使用 LiNQ to XML 的 C# 或 VB.Net。LiNQ to XML 非常非常强大且易于实现

回答by suraj vijayakumar

If you are well versed in Java, you can try out VTDXML Parser for parsing large volumes of XML data.

如果您精通 Java,则可以尝试使用VTDXML Parser 来解析大量 XML 数据。