Java: reading a string value from Excel with HSSF, but it comes back as a double

Notice: this page is derived from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/1411157/

Time: 2020-08-12 11:47:55  Source: igfitidea

Reading string value from Excel with HSSF but it's double

Tags: java, excel, apache-poi, poi-hssf

Asked by egaga

I'm using HSSF-POI to read Excel data. The problem is that some cells hold values that look like numbers but are really strings. If I look at the cell format in Excel, it says the type is "Text". Still, the HSSF Cell thinks it's numeric. How can I get the value as a string?

If I try to use cell.getRichStringCellValue(), I get an exception; if I use cell.toString(), the result is not exactly the same value as in the Excel sheet.

Edit: until this gets resolved, I'll use

new BigDecimal(cell.getNumericCellValue()).toString()

Accepted answer by Vladimir Dyuzhev

You mean HSSF-POI says

cell.getCellType() == Cell.CELL_TYPE_NUMERIC

and not Cell.CELL_TYPE_STRING, as it should be?

I would think it's a bug in POI, but every cell contains a Variant, and Variant has a type. It's kind of hard to make a bug there, so instead I think Excel uses some extra data or heuristic to report the field as text. Usual MS way, alas.

P.S. You cannot use any getString() on a Variant containing a numeric value, as the binary representation of the Variant data depends on its type, and trying to get a string from what is actually a number would result in garbage. Hence the exception.
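To make the point about typed storage concrete, here is a minimal sketch in plain Java. It is not POI's or Excel's actual storage format; TaggedValue, Kind and reinterpretAsText are hypothetical names invented for this illustration. It only shows why a typed getter refuses a mismatched read, and what happens if the raw bytes of a double are decoded as text anyway.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only: not POI's or Excel's real storage format.
class TaggedValueDemo {
    enum Kind { NUMERIC, STRING }

    /** A type tag plus raw bytes whose layout depends on the tag. */
    static final class TaggedValue {
        private final Kind kind;
        private final byte[] payload;

        private TaggedValue(Kind kind, byte[] payload) {
            this.kind = kind;
            this.payload = payload;
        }

        static TaggedValue ofNumber(double d) {
            // A number is stored as the 8 bytes of an IEEE 754 double.
            return new TaggedValue(Kind.NUMERIC, ByteBuffer.allocate(8).putDouble(d).array());
        }

        static TaggedValue ofString(String s) {
            return new TaggedValue(Kind.STRING, s.getBytes(StandardCharsets.UTF_8));
        }

        Kind kind() {
            return kind;
        }

        double getNumber() {
            require(Kind.NUMERIC);
            return ByteBuffer.wrap(payload).getDouble();
        }

        String getString() {
            require(Kind.STRING);
            return new String(payload, StandardCharsets.UTF_8);
        }

        /** Ignores the tag and decodes the raw bytes as text. */
        String reinterpretAsText() {
            return new String(payload, StandardCharsets.ISO_8859_1);
        }

        private void require(Kind expected) {
            if (kind != expected) {
                throw new IllegalStateException("Cannot read a " + kind + " value as " + expected);
            }
        }
    }

    public static void main(String[] args) {
        TaggedValue v = TaggedValue.ofNumber(1234.0);
        System.out.println(v.getNumber());
        // The raw bytes decoded as text are not the digits "1234".
        System.out.println(v.reinterpretAsText().equals("1234"));
        try {
            v.getString();
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The safe path is always to check the tag first and then call the getter that matches it.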

Answer by user151019

Excel will convert anything you enter that looks like a number, date, or time from a string into that type. See the MS Knowledge Base article, which basically suggests entering the number with an extra character that makes it a string.

Answer by ZZ Coder

You are probably dealing with an Excel problem. When you create the spreadsheet, the default cell format is General. With this format, Excel guesses the type based on the input, and this type is saved with each cell.

When you later change the cell format to Text, you are just changing the default. Excel doesn't change every cell's type automatically. I haven't found a way to do this automatically.

To confirm this, you can go to Excel and retype one of the numbers and see if it's text in HSSF.

You can also check the real cell type with this worksheet function:

  =CELL("type", A1)

A1 is the cell holding the number. It shows "l" for text and "v" for numbers.

Answer by jt.

If the documents you are parsing are always in a specific layout, you can change the cell type to "string" on the fly and then retrieve the value. For example, if column 2 should always be string data, set its cell type to string and then read it with the string-type get methods.

cell.setCellType(Cell.CELL_TYPE_STRING);

In my testing, changing the cell type did not modify the contents of the cell, but did allow it to be retrieved with either of the following approaches:

cell.getStringCellValue();

cell.getRichStringCellValue().getString();

Without an example of a value that is not converting properly, it is difficult to know if this will behave any differently than the cell.toString() approach you described in the description.

Answer by Turismo

The problem with Excel is that the default format is General. With this format, Excel stores numbers entered in the cell as numeric. You have to change the format to Text before entering the values. Re-entering the values after changing the format will also work.
That will lead to little green triangles in the upper-left corner of the cells if the content looks like a number to Excel. If this is the case, the value is really stored as text.

With new BigDecimal(cell.getNumericCellValue()).toString() you will still have a lot of problems. For example, if you have identifying numbers (e.g. part numbers or classification numbers), you probably have cases with leading zeros, which will be a problem with the getNumericCellValue() approach.
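A small plain-Java sketch of these problems, without POI: the double literals below stand in for whatever getNumericCellValue() would return, and the class and method names are made up for illustration. Once a value such as 00123 has been stored as a number, only 123.0 is left, so no conversion can bring the zeros back. Separately, the BigDecimal(double) constructor exposes the exact binary value of the double, which is rarely what the sheet displayed.

```java
import java.math.BigDecimal;

// Illustrative only: the doubles stand in for getNumericCellValue() results.
class BigDecimalPitfalls {

    /** The workaround from the question: exact binary expansion of the double. */
    static String viaConstructor(double value) {
        return new BigDecimal(value).toString();
    }

    /** Alternative that goes through Double.toString, giving the shortest decimal form. */
    static String viaValueOf(double value) {
        return BigDecimal.valueOf(value).toPlainString();
    }

    public static void main(String[] args) {
        // A part number typed as 00123 into a numeric cell is stored as 123.0.
        double storedPartNumber = 123.0;
        System.out.println(viaConstructor(storedPartNumber)); // prints 123, leading zeros are gone

        // 0.1 has no exact binary representation, so the constructor shows the long expansion.
        System.out.println(viaConstructor(0.1));
        System.out.println(viaValueOf(0.1)); // prints 0.1
    }
}
```

BigDecimal.valueOf avoids the long expansion, but neither variant can recover formatting or leading zeros that were never stored.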

I try to thoroughly explain to the party creating the files I have to handle with POI how to create the Excel file correctly. If the files are uploaded by end users, I have even created a validation program that checks for the expected cell types when I know the columns in advance. As a by-product, you can also check various other properties of the supplied files (e.g. whether the right columns are provided, or whether mandatory values are present).
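A rough sketch of what such a validation step could look like, written in plain Java so it stands alone. This is not the program described above; the Kind enum and validateRow are hypothetical, and in real code the observed kinds would presumably be filled in from each cell's getCellType().

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: Kind and validateRow are invented for illustration.
class ColumnTypeValidator {
    enum Kind { STRING, NUMERIC, BLANK }

    /**
     * Compares the cell kinds observed in one row against the kinds expected
     * per column and returns a human-readable message for every mismatch.
     */
    static List<String> validateRow(int rowIndex, Kind[] expected, Kind[] actual) {
        List<String> problems = new ArrayList<>();
        if (actual.length < expected.length) {
            problems.add("row " + rowIndex + ": expected " + expected.length
                    + " columns, found " + actual.length);
        }
        int shared = Math.min(expected.length, actual.length);
        for (int col = 0; col < shared; col++) {
            if (actual[col] != expected[col]) {
                problems.add("row " + rowIndex + ", column " + col + ": expected "
                        + expected[col] + " but found " + actual[col]);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        Kind[] expected = { Kind.STRING, Kind.NUMERIC };
        // Column 0 should be text (e.g. a part number) but was stored as a number.
        Kind[] observed = { Kind.NUMERIC, Kind.NUMERIC };
        for (String problem : validateRow(0, expected, observed)) {
            System.out.println(problem);
        }
    }
}
```

Reporting every mismatch, instead of stopping at the first one, gives the person who produced the file a complete list to fix in one pass.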

Answer by John Machin

"The problem is I have values in a cell that look like a number" => look like a number when viewed in Excel?

"but really are strings" => what does that mean? How do you KNOW that they really are strings?

"If I look at the format cell" => what's "the format cell"???

'... in Excel, it says the type is "text"' => Please explain.

"Still the HSSF Cell thinks it's numeric." => do you mean that the_cell.getCellType() returns Cell.CELL_TYPE_NUMERIC?

"How can I get the value as a string?" => if it's NUMERIC, get the numeric value using the_cell.getNumericCellValue(), then format it as a string any way you want to.

"If I try to use cell.getRichStringCellValue, I get exception;" => so it's not a string.

"if cell.toString, it's not the exact same value as in Excel sheet." => so cell.toString() doesn't format it the way that Excel formats it.

Whatever heuristic Excel uses to determine the type is irrelevant to you. It's the RESULT of that decision, as stored in the file and revealed by getCellType(), that matters.
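As one possible reading of "format it as a string any way you want to", here is a small plain-Java sketch. The doubles stand in for values returned by getNumericCellValue(), and the helper names plain and withPattern are invented for this example. The patterns are ordinary java.text.DecimalFormat patterns, not Excel format codes.

```java
import java.math.BigDecimal;
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Illustrative helpers; the doubles stand in for getNumericCellValue() results.
class NumericToText {

    /** Plain decimal text without exponent or trailing zeros, e.g. 42.0 becomes "42". */
    static String plain(double value) {
        return BigDecimal.valueOf(value).stripTrailingZeros().toPlainString();
    }

    /** Formats with an explicit DecimalFormat pattern and a fixed locale. */
    static String withPattern(double value, String pattern) {
        DecimalFormat format =
                new DecimalFormat(pattern, DecimalFormatSymbols.getInstance(Locale.US));
        return format.format(value);
    }

    public static void main(String[] args) {
        System.out.println(plain(42.0));                 // prints 42
        System.out.println(plain(12.5));                 // prints 12.5
        // Re-create a fixed-width code if you know the intended width in advance.
        System.out.println(withPattern(123.0, "00000")); // prints 00123
        System.out.println(withPattern(12.4, "0.0#"));   // prints 12.4
    }
}
```

Zero padding like this only works when the intended width is known from outside the file; the stored number itself carries no record of it.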

Answer by Neeraj

The code below works fine for reading a cell, provided the cell contains a numeric value:

new BigDecimal(cell.getNumericCellValue());

e.g.

ase.setGss(new BigDecimal(hssfRow.getCell(3).getNumericCellValue()));

where the variable gss is of type BigDecimal.

Answer by Gagravarr

The class you're looking for in POI is DataFormatter.

When Excel writes the file, some cells are stored as literal Strings, while others are stored as numbers. For the latter, a floating-point value representing the cell is stored in the file, so when you ask POI for the value of the cell, that's what it actually has.

Sometimes though, especially when doing Text Extraction (but not always), you want to make the cell value look like it does in Excel. It isn't always possible to get that exactly in a String (non full space padding for example), but the DataFormatter class will get you close.

If you're after a String of the cell, looking much as it looked in Excel, just do:

 // Create a formatter, do this once
 DataFormatter formatter = new DataFormatter(Locale.US);

 .....

 for(Cell cell : row) {
     CellReference ref = new CellReference(cell);
     // eg "The value of B12 is 12.4%"
     System.out.println("The value of " + ref.formatAsString() + " is " + formatter.formatCellValue(cell));
 }

The formatter will return String cells as-is, and for Numeric cells it will apply the formatting rules of the cell's style to the cell's number.