When and Why is XML preferable to CSV?
Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me): Stack Overflow
Original question: http://stackoverflow.com/questions/1820129/
Asked by Nick
Sometimes it feels like XML has been used just because it was fashionable.
Answered by Robert Koritnik
Some strengths:
- You can validate XML data against an XSD schema.
- You can easily provide contracts (as XSD) to other parties that create or consume the XML data, without having to describe the format in prose.
- XML data can represent one-to-many relations across multiple levels.
- XML is arguably more readable than CSV.
- XML is natively supported by the .NET Framework.
To name a few off the top of my head.
Answered by dnagirl
CSV files are good when your data is strictly tabular and you know its structure. As soon as you start having relationships between different levels of your data, XML tends to work better because relationships can be made obvious (even without schemas) just by nesting.
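To illustrate the nesting point, here is a minimal Python sketch using only the standard library. The customer/order/item data and all element and column names are made up for the example; they are not from the original answer. It reads the same one-to-many data once from nested XML and once from a flattened CSV, where the parent values have to repeat on every row.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical data: one customer with two orders, each with its own items.
XML_DATA = """<customer name="Ann">
  <order id="1">
    <item sku="A-100" qty="2"/>
    <item sku="B-200" qty="1"/>
  </order>
  <order id="2">
    <item sku="C-300" qty="5"/>
  </order>
</customer>"""

# The same data flattened for CSV: parent values repeat on every row.
CSV_DATA = """customer,order_id,sku,qty
Ann,1,A-100,2
Ann,1,B-200,1
Ann,2,C-300,5
"""


def items_per_order_xml(text):
    """Read the hierarchy directly from the nesting."""
    root = ET.fromstring(text)
    return {
        order.get("id"): [item.get("sku") for item in order.findall("item")]
        for order in root.findall("order")
    }


def items_per_order_csv(text):
    """Rebuild the hierarchy by grouping rows on the repeated order_id column."""
    grouped = {}
    for row in csv.DictReader(io.StringIO(text)):
        grouped.setdefault(row["order_id"], []).append(row["sku"])
    return grouped


xml_orders = items_per_order_xml(XML_DATA)
csv_orders = items_per_order_csv(CSV_DATA)
```

Both functions end up with the same mapping, but in the CSV case the reader has to know that `order_id` is the grouping key; in the XML case the structure itself says so.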
Answered by rborchert
XML has become the default because of its many benefits, which lots of other people have already mentioned. So the question really becomes "When and why is CSV preferable to XML?".
I feel CSV is preferable to XML when:
- you are loading simple tabular data
- you are in control of both the generation and consumption of the data file
- the dataset is large
CSV is perfectly usable if the first two points are true, and it has a performance benefit that becomes more significant the larger the dataset is.
I did a quick test loading ~8000 records each with 6 text fields. Loading and parsing the XML took ~8 seconds. Loading the CSV took less than 1 second.
The overhead of XML is worth it in a lot of cases, but when the stars align, CSV makes more sense.
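A quick test like the one described can be reproduced roughly as follows. This is only a sketch: the original data, parser and platform are unknown, so the rows here are synthetic, the element names (`records`, `record`, `fieldN`) are assumptions, and the absolute timings will not match the ~8 seconds versus <1 second reported above. Only the relative difference on your own machine is meaningful.

```python
import csv
import io
import time
import xml.etree.ElementTree as ET

RECORDS = 8000  # same record count as the test described above
FIELDS = 6      # six text fields per record


def make_rows():
    # Synthetic text data; the data used in the original test is not available.
    return [[f"value-{r}-{c}" for c in range(FIELDS)] for r in range(RECORDS)]


def to_csv(rows):
    buf = io.StringIO()
    csv.writer(buf).writerows(rows)
    return buf.getvalue()


def to_xml(rows):
    root = ET.Element("records")
    for row in rows:
        rec = ET.SubElement(root, "record")
        for c, value in enumerate(row):
            ET.SubElement(rec, f"field{c}").text = value
    return ET.tostring(root, encoding="unicode")


def load_csv(text):
    return list(csv.reader(io.StringIO(text)))


def load_xml(text):
    root = ET.fromstring(text)
    return [[field.text for field in rec] for rec in root]


def timed(func, arg):
    """Return the function result and the elapsed wall-clock seconds."""
    start = time.perf_counter()
    result = func(arg)
    return result, time.perf_counter() - start


rows = make_rows()
csv_rows, csv_seconds = timed(load_csv, to_csv(rows))
xml_rows, xml_seconds = timed(load_xml, to_xml(rows))
print(f"CSV: {csv_seconds:.4f}s  XML: {xml_seconds:.4f}s")
```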
Answered by James Goodwin
CSV is useful when you just have a series of values that relate to some piece of information and you know you will always store values for each field.
XML has the benefit of having self-describing data (tags) and having hierarchy - which gives you a lot more flexibility in the way that you store the data.
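A small Python sketch of the difference, using made-up contact records (the field names are assumptions for illustration): in CSV every row carries a slot for every column, while in XML each value is labelled by its tag and a missing field can simply be left out.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical records: the second contact has no phone number.
CSV_DATA = (
    "name,email,phone\n"
    "Ann,ann@example.com,555-0100\n"
    "Bob,bob@example.com,\n"
)

XML_DATA = """<contacts>
  <contact><name>Ann</name><email>ann@example.com</email><phone>555-0100</phone></contact>
  <contact><name>Bob</name><email>bob@example.com</email></contact>
</contacts>"""

# CSV: each row still has a slot for every column, and the meaning of a slot
# comes only from the header row (or from prior agreement between the parties).
csv_contacts = list(csv.DictReader(io.StringIO(CSV_DATA)))

# XML: each value is labelled by its own tag, so an absent field is just omitted.
xml_contacts = [
    {child.tag: child.text for child in contact}
    for contact in ET.fromstring(XML_DATA)
]
```

Here `csv_contacts[1]["phone"]` is an empty string, whereas `xml_contacts[1]` has no `phone` key at all.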
Answered by dcp
With XML you can have a much more complex hierarchy and structure than with CSV. It offers a lot more flexibility.
Answered by Tom
I found an interesting performance test on the net. It is a good example of the drawbacks of XML when the features of XML are not needed.
"I tried Steven's experiment from a different angle. I filled an Excel XP spreadsheet with a single-digit number, saved it in both XML and in a comma-delimited text file (CSV). I then compressed both with WinZip and then opened both with Excel. Here's what I found:
The XML file was 840MB, the CSV 34MB -- a 2,500% difference. Compressed, the XML file was 2.5MB, the CSV 0.15MB (150KB) -- a 1,670% difference.
Equally dramatic is the time it took to uncompress and render the files as an Excel spreadsheet: It took about 20 minutes with the XML file; the CSV took 1 minute -- a 2,000% difference."
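The size part of that experiment can be approximated without Excel. The sketch below is an assumption-laden stand-in: it writes a much smaller grid of single-digit values, uses a plain `sheet`/`row`/`cell` layout rather than Excel's actual (more verbose) XML format, and compresses with `zlib` instead of WinZip, so the exact ratios will differ from the quoted ones.

```python
import csv
import io
import xml.etree.ElementTree as ET
import zlib

ROWS, COLS = 1000, 20  # far smaller than the spreadsheet in the quote


def as_csv():
    buf = io.StringIO()
    csv.writer(buf).writerows([["7"] * COLS for _ in range(ROWS)])
    return buf.getvalue().encode("utf-8")


def as_xml():
    # Assumed element names; Excel's own XML format is more verbose than this.
    root = ET.Element("sheet")
    for _ in range(ROWS):
        row = ET.SubElement(root, "row")
        for _ in range(COLS):
            ET.SubElement(row, "cell").text = "7"
    return ET.tostring(root, encoding="utf-8")


csv_bytes, xml_bytes = as_csv(), as_xml()
sizes = {
    "csv": len(csv_bytes),
    "xml": len(xml_bytes),
    "csv_compressed": len(zlib.compress(csv_bytes)),
    "xml_compressed": len(zlib.compress(xml_bytes)),
}
for name, size in sizes.items():
    print(f"{name}: {size} bytes")
```

The uncompressed XML is several times larger simply because every one-character value is wrapped in an opening and a closing tag.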
Answered by Funklord
XML is preferable to CSV when the data is unstructured (unknown schema) and will be read by a human.
Arguably, unless the data contains predominantly text, CSV is also meant for human consumption.
Also relevant is whether your data is 2- or 3-dimensional. CSV is most suitable for 2-dimensional text, and due to its verbosity, XML works well with 3-dimensional data.
The whole "standardness" of XML is hyperbole, and should not be taken literally. XML does have huge technical issues and many of the solutions aren't particularly elegant, or in many cases useful:
- It uses text to specify its own text encoding (chicken and egg?).
- None of the more common schema languages for XML work particularly well.
- The ancient and commonplace way of creating mark-up languages using <tags> is not particularly helpful as a standard.
- XML tries to retroactively shoehorn more powerful mark-up languages, such as the SGML-based ones, into itself, creating a mess of incompatible legacy.
- It still remains to be determined whether XML text escape sequences can work for anything but the simplest cases (i.e. friendly data).
To be clear, XML is probably the incorrect choice for 90% of the data interchange it is currently being used for, since those uses break some or all of the above assumptions.
Answered by Funklord
Of course it is fashionable and buzz-worthy sometimes. It all depends on your application. I prefer config files in XML because they are easy to parse, whereas I use CSV files for DataGridView or database dumps.
This Daily WTF article, "XML vs CSV The Choice is Obvious", will help you make your decision ;)
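As a rough illustration of the "config files in XML are easy to parse" point, here is a short Python sketch. The answer itself refers to .NET, so this is a swapped-in language, and the config layout, element names and `load_config` helper are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration; element and attribute names are illustrative only.
CONFIG = """<config>
  <database host="localhost" port="5432"/>
  <logging level="INFO">
    <file>app.log</file>
  </logging>
</config>"""


def load_config(text):
    """Pull a few nested settings out of the XML into a flat dict."""
    root = ET.fromstring(text)
    db = root.find("database")
    return {
        "db_host": db.get("host"),
        "db_port": int(db.get("port")),
        "log_level": root.find("logging").get("level"),
        "log_file": root.findtext("logging/file"),
    }


settings = load_config(CONFIG)
```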
Answered by Greg
In addition to the other answers, XML allows you to specify which character set the document is in.
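A minimal Python sketch of what this means in practice, with a made-up one-value document: the XML declaration names the encoding, so the parser can decode the bytes by itself, while the equivalent CSV bytes carry no such hint and the reader has to guess.

```python
import xml.etree.ElementTree as ET

text = "caf\u00e9"  # "café", written with an escape to keep the source ASCII

# XML: the declaration names the encoding, so the parser decodes correctly.
xml_bytes = (
    b'<?xml version="1.0" encoding="ISO-8859-1"?><name>'
    + text.encode("latin-1")
    + b"</name>"
)
parsed = ET.fromstring(xml_bytes).text

# CSV: the bytes carry no encoding information; the reader has to guess.
csv_bytes = ("name\n" + text + "\n").encode("latin-1")
wrong_guess = csv_bytes.decode("utf-8", errors="replace")  # yields U+FFFD
right_guess = csv_bytes.decode("latin-1")
```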
Answered by Pekka
I have found one of the greatest advantages of XML to be the parsing functionality and the strict validation that come out of the box with most XML libraries. The insistence on well-formedness and the easy-to-understand error messages (xyz not closed in line x, column y) are a real help compared to hunting for broken values, or unknown behaviour, caused by an error in the CSV file.
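The contrast can be sketched in Python with the standard library. The broken inputs below are invented for the example: the XML has an unclosed `<name>` element on its third line, and the CSV has a row with a missing field. The XML parser refuses the document and reports a line and column, while the CSV reader returns a short row without complaint.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical broken inputs: <name> is never closed on line 3 of the XML,
# and the last CSV row is missing its second field.
BROKEN_XML = (
    "<rows>\n"
    "  <row><name>Ann</name><age>31</age></row>\n"
    "  <row><name>Bob<age>27</age></row>\n"
    "</rows>"
)
BROKEN_CSV = "name,age\nAnn,31\nBob\n"


def xml_error(text):
    """Return ((line, column), message) if the XML is not well-formed."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        return err.position, str(err)
    return None


def csv_rows(text):
    """The csv module accepts ragged rows without raising an error."""
    return list(csv.reader(io.StringIO(text)))


error = xml_error(BROKEN_XML)
rows = csv_rows(BROKEN_CSV)
print(error)
print(rows)
```

With CSV, noticing the short row is left to the calling code, for example by checking `len(row)` against the header.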

