C# Open XML 2.0 NumberFormatId range
This page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me): Stack Overflow
Original question: http://stackoverflow.com/questions/11781210/
Asked by maguy
I'm using Open XML 2.0 with C# to parse large Excel files. The issue I'm running into is that the cell I'm parsing does not have a DataType, so I check the NumberFormatId to determine whether it is a decimal, a number, or a date. I'm looking for the exact NumberFormatId ranges for numbers/decimals versus dates. They seem to be all over the place: some numbers/decimals have formats of 189, 212, 214, 305, and dates have values of 185, 194, 278, etc. Does anyone know if the specification defines these ranges?
Edited - More Information
Below is an example of the number format of 194 from the styles.xml file inside the xl folder.
The Excel sheets are from different regions of the world, so I'm thinking the number formats are different, but do they overlap? Will numFmtId 194 be something other than a date under different culture settings?
Below is how I'm converting c.CellValue values like "40574" to dates, but the issue is: how do I know if "40574" is a date and not a number?
DateTime.FromOADate(Convert.ToDouble(c.CellValue.Text));
Currently I do this by checking whether there is no DataType and then checking the CellFormat, but there are issues when some of the NumberFormatId values are not in my check.
    private Object FormatCellValue(Cell c, SharedStringTable ssTable, CellFormats cellFormats)
    {
        if (c.CellValue != null)
        {
            // If there is no data type, this must be a string that has been formatted as a number
            if (c.DataType == null)
            {
                CellFormat cf;
                if (c.StyleIndex == null)
                {
                    cf = cellFormats.Descendants<CellFormat>().ElementAt<CellFormat>(0);
                }
                else
                {
                    cf = cellFormats.Descendants<CellFormat>().ElementAt<CellFormat>(Convert.ToInt32(c.StyleIndex.Value));
                }
                if ((cf.NumberFormatId >= 14 && cf.NumberFormatId <= 22) ||
                    (cf.NumberFormatId >= 165 && cf.NumberFormatId <= 180) ||
                    cf.NumberFormatId == 278 || cf.NumberFormatId == 185 || cf.NumberFormatId == 196 ||
                    cf.NumberFormatId == 217 || cf.NumberFormatId == 326) // Dates
                {
                    try
                    {
                        DateTime dt;
                        dt = DateTime.FromOADate(Convert.ToDouble(c.CellValue.Text));
    ...CODE CONTINUES
Edit
In my updated post I forgot to post the value I found in the styles.xml file:
<numFmt numFmtId="323" formatCode="mmm/yy;@"/>
So with this, my question is: how do I get the formatCode and parse it to determine if it is a date?
Below is the output from the Immediate window of the debugger for number format 323:
{DocumentFormat.OpenXml.Spreadsheet.CellFormat}
base {DocumentFormat.OpenXml.OpenXmlCompositeElement}: {DocumentFormat.OpenXml.Spreadsheet.CellFormat}
Alignment: {DocumentFormat.OpenXml.Spreadsheet.Alignment}
ApplyAlignment: "1"
ApplyBorder: "1"
ApplyFill: "1"
ApplyFont: "1"
ApplyNumberFormat: "1"
ApplyProtection: "1"
BorderId: "64"
ExtensionList: null
FillId: "0"
FontId: "83"
FormatId: "37992"
LocalName: "xf"
NumberFormatId: "323"
PivotButton: null
Protection: {DocumentFormat.OpenXml.Spreadsheet.Protection}
QuotePrefix: "1"
Accepted answer by Lukasz M
List of format ID values
Below is the list of format options (source):
ID    Format Code
0     General
1     0
2     0.00
3     #,##0
4     #,##0.00
9     0%
10    0.00%
11    0.00E+00
12    # ?/?
13    # ??/??
14    d/m/yyyy
15    d-mmm-yy
16    d-mmm
17    mmm-yy
18    h:mm tt
19    h:mm:ss tt
20    H:mm
21    H:mm:ss
22    m/d/yyyy H:mm
37    #,##0 ;(#,##0)
38    #,##0 ;[Red](#,##0)
39    #,##0.00;(#,##0.00)
40    #,##0.00;[Red](#,##0.00)
45    mm:ss
46    [h]:mm:ss
47    mmss.0
48    ##0.0E+0
49    @
However, this list specifies only some of the formats. According to this post: Reading dates from OpenXml Excel files, formats with an ID value less than 164 are built in. You can also find a longer list of formats there.
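As a rough illustration of the built-in range, the check below treats IDs under 164 as built-in and flags the date/time entries from the table above (14 to 22 and 45 to 47). The class and method names are hypothetical, and the ID ranges come only from the short list shown here; the longer list mentioned above may contain further date formats that this sketch does not cover.

```csharp
using System;

public static class BuiltInNumberFormats
{
    // Assumption taken from the answer: IDs below 164 are built in,
    // so file-specific (custom) formats start at 164.
    public const uint FirstCustomId = 164;

    public static bool IsBuiltIn(uint numFmtId)
    {
        return numFmtId < FirstCustomId;
    }

    // Date/time IDs from the table above only: 14-22 and 45-47.
    public static bool IsBuiltInDateTime(uint numFmtId)
    {
        return (numFmtId >= 14 && numFmtId <= 22)
            || (numFmtId >= 45 && numFmtId <= 47);
    }
}
```

Any ID for which IsBuiltIn returns false has to be looked up in the workbook's own styles, as described in the next section.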
Checking format ID values in the xlsx file
For formats with greater ID values, you can find their definitions inside the file itself. To see them, open the file with a zip archive browser and find the styles.xml file in the xl directory. Alternatively, open the xlsx file with Open XML SDK 2.0 Productivity Tools and navigate to that file's /xl/styles.xml/x:StyleSheet node.
In that section, you should be able to see formats defined in your document along with ID values assigned to them. The part with formats should look similar to this:
...
<x:numFmts count="1">
<x:numFmt numFmtId="166" formatCode="yy/mm/dd;@" />
</x:numFmts>
...
Looking at the formats saved here, it seems that ID values can be specific to an xlsx file, so the same ID value can probably be used to define different formats in two different xlsx files. However, built-in formats are predefined, so they should be the same in all files.
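Because a custom ID may mean different things in different workbooks, a lookup probably needs to consult the formats read from the current file first and only then fall back to the built-in table. The sketch below assumes two dictionaries supplied by the caller (one per-file, one built-in); the class name, method name, and sample format codes are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class FormatCodeResolver
{
    // Returns the format code for an ID, or null when neither table knows it.
    // customFormats: the numFmt entries read from this workbook's styles.xml.
    // builtInFormats: the predefined table (IDs below 164).
    public static string Resolve(
        uint numFmtId,
        IDictionary<uint, string> customFormats,
        IDictionary<uint, string> builtInFormats)
    {
        string code;
        if (customFormats.TryGetValue(numFmtId, out code))
        {
            return code;
        }
        if (builtInFormats.TryGetValue(numFmtId, out code))
        {
            return code;
        }
        return null;
    }
}
```

With one workbook mapping 166 to "yy/mm/dd;@" and another mapping 166 to a numeric pattern, the same ID resolves to a different code depending on which per-file dictionary is passed in.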
If you need any help with finding these formats in your file, or any additional information, let me know.
EDIT
You can also find some more information about number formats in this document: http://msdn.microsoft.com/en-us/library/documentformat.openxml.spreadsheet.numberingformat.aspx.
EDIT II
You can use this code to get a dictionary containing all the formats defined within the xlsx file:
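    using System;
    using System.Collections.Generic;
    using System.Linq;
    using DocumentFormat.OpenXml.Packaging;
    using DocumentFormat.OpenXml.Spreadsheet;

    private Dictionary<uint, String> BuildFormatMappingsFromXlsx(String fileName)
    {
        Dictionary<uint, String> formatMappings = new Dictionary<uint, String>();
        // Open read-only (isEditable = false): the file is only inspected, never modified.
        using (SpreadsheetDocument document = SpreadsheetDocument.Open(fileName, false))
        {
            var stylePart = document.WorkbookPart.WorkbookStylesPart;
            // The <numFmts> element holds the custom formats defined in this workbook.
            var numFormatsParentNodes = stylePart.Stylesheet.ChildElements.OfType<NumberingFormats>();
            foreach (var numFormatParentNode in numFormatsParentNodes)
            {
                var formatNodes = numFormatParentNode.ChildElements.OfType<NumberingFormat>();
                foreach (var formatNode in formatNodes)
                {
                    // Map numFmtId -> formatCode, e.g. 166 -> "yy/mm/dd;@".
                    formatMappings.Add(formatNode.NumberFormatId.Value, formatNode.FormatCode);
                }
            }
        }
        return formatMappings;
    }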
If you want to check whether any of those is a date, I suppose a simple way would be to verify whether the format code (the value in the dictionary created by the method I've posted) contains the mm and yy substrings.
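The substring check can be made somewhat less fragile by ignoring the parts of a format code that are not date tokens. The sketch below is one possible heuristic, not a full format-code parser: it skips quoted literals, bracketed sections such as [Red], and characters following \, _ or *, then looks for any of d, m, y, h, s. The class and method names are hypothetical, and elapsed-time codes that appear only inside brackets (for example [h] on its own) are not recognized.

```csharp
using System;

public static class DateFormatDetector
{
    public static bool LooksLikeDateFormat(string formatCode)
    {
        if (string.IsNullOrEmpty(formatCode))
        {
            return false;
        }

        bool inQuotes = false;
        bool inBrackets = false;
        for (int i = 0; i < formatCode.Length; i++)
        {
            char ch = formatCode[i];
            if (inQuotes)
            {
                // Quoted literal text such as "days" is not a date token.
                if (ch == '"') inQuotes = false;
                continue;
            }
            if (inBrackets)
            {
                // Bracketed sections ([Red], [$-409], ...) are skipped entirely.
                if (ch == ']') inBrackets = false;
                continue;
            }
            switch (ch)
            {
                case '"':
                    inQuotes = true;
                    break;
                case '[':
                    inBrackets = true;
                    break;
                case '\\':
                case '_':
                case '*':
                    i++; // the next character is a literal, padding, or fill character
                    break;
                default:
                    if ("dmyhs".IndexOf(char.ToLowerInvariant(ch)) >= 0)
                    {
                        return true;
                    }
                    break;
            }
        }
        return false;
    }
}
```

For the codes shown on this page, "mmm/yy;@" and "yy/mm/dd;@" are reported as dates, while "#,##0 ;[Red](#,##0)" and "0.00E+00" are not.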

