SAX vs XmlTextReader - SAX in C#

Notice: This page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/127869/

Date: 2020-08-03 14:47:27  Source: igfitidea

Asked by cgreeno

I am attempting to read a large XML document, and I want to do it in chunks rather than XmlDocument's way of reading the entire file into memory. I know I can use XmlTextReader to do this, but I was wondering if anyone has used SAX for .NET? I know Java developers swear by it, and I was wondering if it is worth giving it a try and, if so, what the benefits of using it are. I am looking for specifics.

Accepted answer by Craig Trader

If you're talking about SAX for .NET, the project doesn't appear to be maintained. The last release was more than 2 years ago. Maybe they got it perfect on the last release, but I wouldn't bet on it. The author, Karl Waclawek, seems to have disappeared off the net.

As for SAX under Java? You bet, it's great. Unfortunately, SAX was never developed as a standard, so all of the non-Java ports have been adapting a Java API for their own needs. While DOM is a pretty lousy API, it has the advantage of having been designed for multiple languages and environments, so it's easy to implement in Java, C#, JavaScript, C, et al.

Answer by GregK

I believe there are no benefits to using SAX, for at least two reasons:

  1. SAX is a "push" model while XmlReader is a pull parser that has a number of benefits.
  2. Being dependent on a 3rd-party library rather than using a standard .NET API.
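
To make the push/pull distinction concrete, here is a minimal sketch built only on the standard XmlReader. `CountElementsPull` shows the pull style, where your code drives the loop. `ParsePush` is a hypothetical SAX-like adapter (its name and callback parameters are assumptions for illustration, not part of .NET or of SAX for .NET) in which the parser drives the loop and calls you back.

```csharp
using System;
using System.IO;
using System.Xml;

public static class PushVsPullSketch
{
    // Pull style: the caller owns the loop and asks the reader for the next node.
    public static int CountElementsPull(TextReader input, string localName)
    {
        int count = 0;
        using (XmlReader reader = XmlReader.Create(input))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.LocalName == localName)
                {
                    count++;
                }
            }
        }
        return count;
    }

    // Push style (SAX-like): a hypothetical adapter that owns the loop
    // and reports start/end element events through callbacks.
    public static void ParsePush(TextReader input, Action<string> onStartElement, Action<string> onEndElement)
    {
        using (XmlReader reader = XmlReader.Create(input))
        {
            while (reader.Read())
            {
                switch (reader.NodeType)
                {
                    case XmlNodeType.Element:
                    {
                        string name = reader.LocalName;
                        // <item/> produces no EndElement node, so report the end here.
                        bool isEmpty = reader.IsEmptyElement;
                        onStartElement(name);
                        if (isEmpty)
                        {
                            onEndElement(name);
                        }
                        break;
                    }
                    case XmlNodeType.EndElement:
                        onEndElement(reader.LocalName);
                        break;
                }
            }
        }
    }
}
```

Either method can be fed a `StreamReader` over a large file; in both cases only the current node is held in memory, so the choice is mostly about who controls the loop.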

Answer by EnocNRoll - AnandaGopal Pardue

If you just want to get the job done quickly, the XmlTextReader exists for that purpose (in .NET).

If you want to learn a de facto standard (one available in many other programming languages) that is stable, that will force you to code very efficiently and elegantly, and that is also extremely flexible, then look into SAX. However, don't waste your time unless you're going to be creating highly esoteric XML parsers. Instead, look for next-generation parsers (like XmlTextReader) for your particular platform.

SAX Resources
SAX was originally written for Java, and you can find the original open source project, which has been stable for several years, here: http://sax.sourceforge.net/

There is a C# port of the same project here (with HTML docs as part of the source download); it is also stable: http://saxdotnet.sourceforge.net/

If you do not like the C# implementation, you could always resort to referencing COM DLLs via COMInterop using MSXML3 or later: http://msdn.microsoft.com/en-us/library/ms994343.aspx

Articles that come from the Java world but which probably illustrate the concepts you need to be successful with this approach (there may also be downloadable Java source code that could prove useful and may be easy enough to convert to C#):

It will be a cumbersome implementation. I have only used SAX back in my pre-.NET days, but it requires some pretty advanced coding techniques. At this point, it's just not worth the trouble.

Interesting Concept for a Hybrid Parser
This thread describes a hybrid parser that uses the .NET XmlTextReader to implement a parser that provides a combination of DOM and SAX benefits...
http://bytes.com/groups/net-xml/178403-xmltextreader-versus-dom
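
As a rough illustration of the hybrid idea (not necessarily the design discussed in the linked thread), one common .NET pattern is to stream through the document with an XmlReader and materialize only one record at a time as a small in-memory tree. The sketch below uses LINQ to XML's `XNode.ReadFrom` for the in-memory part rather than the classic DOM; the class name, method name, and the `order`/`total` element names in the usage note are assumptions for illustration.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class HybridParserSketch
{
    // Streams the document and yields each matching element as its own small tree,
    // so only one record is held in memory at a time.
    public static IEnumerable<XElement> StreamElements(TextReader input, string localName)
    {
        using (XmlReader reader = XmlReader.Create(input))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.LocalName == localName)
                {
                    // ReadFrom consumes the whole element and leaves the reader on the
                    // node after its end tag, so no extra Read() is needed here.
                    yield return (XElement)XNode.ReadFrom(reader);
                }
                else
                {
                    reader.Read();
                }
            }
        }
    }
}
```

A caller could then write `foreach (XElement order in HybridParserSketch.StreamElements(File.OpenText(path), "order"))` and query each `order` with the usual tree API, for example `(int)order.Element("total")`, without ever loading the full file.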

Answer by Brett Ryan

Personally, I much prefer the SAX model, as the XmlReader has some really annoying traps that can cause your code to skip elements. Most code would be structured around a while(rdr.Read()) loop, but if you call ReadString() or ReadInnerXml() within that loop, those calls advance the reader themselves, so the Read() on the next iteration can skip an element.
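
The skipping trap can be reproduced with a small sketch. The class and method names and the `item` element are assumptions for illustration. `ReadItemsBuggy` follows the while(rdr.Read()) pattern and loses elements when they sit directly next to each other; `ReadItemsFixed` is one possible way around it, calling Read() only when the previous step has not already advanced the reader.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class SkipTrapSketch
{
    // Buggy: ReadInnerXml() leaves the reader on the node after </item>.
    // If that node is the next <item>, the loop's Read() steps past it.
    public static List<string> ReadItemsBuggy(string xml)
    {
        var items = new List<string>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.LocalName == "item")
                {
                    items.Add(reader.ReadInnerXml());
                }
            }
        }
        return items;
    }

    // One possible fix: only call Read() when nothing else has moved the reader.
    public static List<string> ReadItemsFixed(string xml)
    {
        var items = new List<string>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.LocalName == "item")
                {
                    // Already advances past the end tag; do not Read() again.
                    items.Add(reader.ReadInnerXml());
                }
                else
                {
                    reader.Read();
                }
            }
        }
        return items;
    }
}
```

With the compact input `<root><item>a</item><item>b</item><item>c</item></root>`, the buggy version returns only "a" and "c", while the fixed version returns all three. Note that indented XML tends to hide the bug, because the reader lands on the whitespace between elements instead of on the next `item`.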

As SAX is event based, this will never happen, as you cannot perform any operation that would cause your parser to read ahead.

My personal feeling is that Microsoft invented the notion that the XmlReader is better, using the push/pull model as the explanation, but I don't really buy it. Microsoft seems to think that you don't need to create a state machine with XmlReader; that doesn't make sense to me, but anyway, it's just my opinion.
