Original question: http://stackoverflow.com/questions/2563608/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Check whether a string is parsable into Long without try-catch?
Asked by Serg
Long.parseLong("string") throws a NumberFormatException if the string cannot be parsed as a long. Is there a way to validate the string that is faster than using try-catch?
Thanks
Accepted answer by Roman
You could write a rather complex regular expression, but it isn't worth it. Using exceptions here is perfectly normal.
It is a naturally exceptional situation: you assume the string contains an integer, but it actually contains something else. An exception should be thrown and handled properly.
If you look inside the parseLong code, you'll see that it performs many different checks and operations. Doing all of that before parsing would decrease performance (this only matters if you are parsing millions of numbers). So if you really need to improve performance by avoiding exceptions, the only thing you can do is copy the parseLong implementation into your own function and, in every case where it would throw, return a sentinel instead (for example null from a method that returns a boxed Long, since long has no NaN value).
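If the exception-based approach is kept, the try/catch can at least be written once and hidden behind a helper. The following is a minimal sketch of that idea, not part of the original answer; the class name SafeLongParser and the method tryParse are hypothetical names chosen for illustration.

```java
import java.util.OptionalLong;

// Hypothetical helper: wraps Long.parseLong so callers never see the exception.
final class SafeLongParser {
    private SafeLongParser() {}

    /** Returns the parsed value, or an empty OptionalLong if s is not a valid long. */
    static OptionalLong tryParse(String s) {
        try {
            return OptionalLong.of(Long.parseLong(s));
        } catch (NumberFormatException e) {
            // Covers null, empty, non-numeric, and out-of-range input.
            return OptionalLong.empty();
        }
    }
}
```

Callers can then write SafeLongParser.tryParse(input).isPresent() without a try/catch at each call site. The cost of the exception on invalid input is unchanged; only the calling code gets simpler.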
Answer by lexicore
From commons-lang StringUtils:
    public static boolean isNumeric(String str) {
        if (str == null) {
            return false;
        }
        int sz = str.length();
        for (int i = 0; i < sz; i++) {
            if (Character.isDigit(str.charAt(i)) == false) {
                return false;
            }
        }
        return true;
    }
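A digits-only check like the one quoted above answers a slightly different question than "will Long.parseLong succeed?": it ignores signs, the empty string, and the range of long. The sketch below compares the two on a few inputs. The class name DigitsOnlyDemo and the helper parses are hypothetical; isNumeric is copied from the snippet above.

```java
// Hypothetical demo class comparing a digits-only check with Long.parseLong.
final class DigitsOnlyDemo {
    private DigitsOnlyDemo() {}

    // Same logic as the isNumeric snippet quoted in the answer.
    static boolean isNumeric(String str) {
        if (str == null) {
            return false;
        }
        for (int i = 0; i < str.length(); i++) {
            if (!Character.isDigit(str.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    // True if Long.parseLong accepts the string.
    static boolean parses(String s) {
        try {
            Long.parseLong(s);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // A plain number, a signed number, the empty string, and a 20-digit number.
        String[] samples = {"123", "-5", "", "99999999999999999999"};
        for (String s : samples) {
            System.out.println("\"" + s + "\" isNumeric=" + isNumeric(s) + " parses=" + parses(s));
        }
    }
}
```

The two checks agree on "123" but disagree on the other three samples, so a digits-only test needs extra handling for sign, emptiness, and length before it can stand in for parseLong.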
Answer by Moonshield
You could try using a regular expression to check the form of the string before trying to parse it?
Answer by ring bearer
You could do something like
    if (s.matches("\\d+")) {
        // s consists only of ASCII digits; sign and range are not checked
    }
This uses a regular expression to check whether the String s consists entirely of digits. But what do you stand to gain? Another if condition?
Answer by cd1
I think that's the only way to check whether a String is a valid long value. You could implement your own method to do that, keeping the largest long value in mind.
Answer by SyntaxT3rr0r
There are much faster ways to parse a long than Long.parseLong. If you want to see an example of a method that is not optimized, look at parseLong :)
Do you really need to take into account "digits" that are non-ASCII?
Do you really need to make several method calls, passing a radix around, even though you're probably parsing base 10?
:)
Using a regexp is not the way to go: it is harder to determine whether your number is too big for a long. How do you use a regexp to determine that 9223372036854775807 can be parsed to a long but 9223372036854775907 cannot?
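To make the range problem concrete: a regexp can check the shape of the input, but the bound has to be enforced separately. One possible approach, sketched below under assumptions that are not in the original answer, is to compare the digit string against the decimal text of the limit. The class name RegexLongCheck and the method looksLikeLong are hypothetical, and the check covers base-10 ASCII input only.

```java
import java.util.regex.Pattern;

// Hypothetical helper: regex shape check plus a string comparison for the range.
final class RegexLongCheck {
    private RegexLongCheck() {}

    // Optional sign followed by at least one ASCII digit.
    private static final Pattern SIGNED_DIGITS = Pattern.compile("[+-]?[0-9]+");
    private static final String MAX_MAGNITUDE = "9223372036854775807"; // Long.MAX_VALUE
    private static final String MIN_MAGNITUDE = "9223372036854775808"; // |Long.MIN_VALUE|

    static boolean looksLikeLong(String s) {
        if (s == null || !SIGNED_DIGITS.matcher(s).matches()) {
            return false;
        }
        char first = s.charAt(0);
        boolean negative = first == '-';
        int start = (first == '-' || first == '+') ? 1 : 0;
        // Skip leading zeros but keep at least one digit.
        while (start < s.length() - 1 && s.charAt(start) == '0') {
            start++;
        }
        String digits = s.substring(start);
        String limit = negative ? MIN_MAGNITUDE : MAX_MAGNITUDE;
        if (digits.length() != limit.length()) {
            return digits.length() < limit.length();
        }
        // Same length: lexicographic order equals numeric order for digit strings.
        return digits.compareTo(limit) <= 0;
    }
}
```

This works, but it is already more code than a try/catch, which supports the point that a regexp on its own is a poor fit for this check.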
That said, the answer to a really fast long-parsing method is a state machine, whether you want to test if the string is parseable or actually parse it. It's not a generic state machine accepting a complex regexp, just a hardcoded one.
I can write you both a method that parses a long and another that determines whether a long can be parsed, each of which totally outperforms Long.parseLong().
Now, what do you want? A state-testing method? A separate testing method may not be desirable if you want to avoid computing the long twice.
Simply wrap your call in a try/catch.
And if you really want something faster than the default Long.parseLong, write one that is tailored to your problem: base 10 only if your input is base 10, and no checks for digits outside ASCII (because you're probably not interested in Japanese's itchi-ni-yon-go etc.).
Answer by polygenelubricants
You can use java.util.Scanner
    Scanner sc = new Scanner(s);
    if (sc.hasNextLong()) {
        long num = sc.nextLong();
    }
This does range checking etc., too. Of course, hasNextLong() also returns true for "99 bottles of beer", so if you want to make sure the input has only a long, you have to do extra checks.
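One way to do the extra check is to confirm that no token remains after the long has been read. The sketch below assumes that surrounding whitespace is acceptable and that the default Scanner delimiter and locale are in use; the class name ScannerWholeInput and the method parseWholeInput are hypothetical.

```java
import java.util.OptionalLong;
import java.util.Scanner;

// Hypothetical helper: accepts the input only if it holds exactly one long token.
final class ScannerWholeInput {
    private ScannerWholeInput() {}

    static OptionalLong parseWholeInput(String s) {
        if (s == null) {
            return OptionalLong.empty();
        }
        try (Scanner sc = new Scanner(s)) {
            if (!sc.hasNextLong()) {
                return OptionalLong.empty();
            }
            long value = sc.nextLong();
            // Reject input such as "99 bottles of beer" that has trailing tokens.
            return sc.hasNext() ? OptionalLong.empty() : OptionalLong.of(value);
        }
    }
}
```

Note that Scanner is locale-aware and may accept forms that Long.parseLong rejects, so this is not a drop-in equivalent, and it is unlikely to be faster than a plain try/catch.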
Answer by Bob Holmes
This case is common in forms and programs where you have an input field and aren't sure whether the string is a valid number. If you understand how try/catch works, using try/catch with the built-in Java function is a better choice than trying to write the function yourself. Setting up a try/catch block in the .NET virtual machine costs zero instructions of overhead, and it is probably the same in Java. If any instructions are used at the try keyword, they are minimal; the bulk of the cost is in the catch part, which only runs in the rare case where the number is not valid.
So while it "seems" like you could write a faster function yourself, you would have to optimize it better than the Java compiler does in order to beat the try/catch mechanism you already use, and the benefit of a more optimized function will be minimal since number parsing is quite generic.
If you run timing tests with your compiler and the Java catch mechanism you already described, you will probably not notice more than a marginal slowdown, and by marginal I mean almost nothing.
Read the Java Language Specification to understand exceptions better, and you will see that using this technique here is perfectly acceptable, since it wraps a fairly large and complex function. The few extra CPU instructions for the try part are not a big deal.
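A timing test along these lines could look like the rough sketch below. It is only a crude harness (the class name ParseTiming and the method countParsable are hypothetical), and hand-rolled System.nanoTime measurements are easily distorted by JIT warm-up and garbage collection, so a benchmarking harness such as JMH is the better tool for numbers you intend to rely on.

```java
// Hypothetical, crude timing harness comparing valid and invalid input.
final class ParseTiming {
    private ParseTiming() {}

    // Counts how many inputs Long.parseLong accepts, using try/catch.
    static int countParsable(String[] inputs) {
        int ok = 0;
        for (String s : inputs) {
            try {
                Long.parseLong(s);
                ok++;
            } catch (NumberFormatException e) {
                // Invalid input: this is the path whose cost is being measured.
            }
        }
        return ok;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        String[] valid = new String[n];
        String[] invalid = new String[n];
        for (int i = 0; i < n; i++) {
            valid[i] = Integer.toString(i);
            invalid[i] = i + "x"; // trailing letter forces the exception path
        }
        // Several rounds; the early ones mostly serve as JIT warm-up.
        for (int round = 0; round < 5; round++) {
            long t0 = System.nanoTime();
            int a = countParsable(valid);
            long t1 = System.nanoTime();
            int b = countParsable(invalid);
            long t2 = System.nanoTime();
            System.out.printf("round %d: valid %d ms (%d ok), invalid %d ms (%d ok)%n",
                    round, (t1 - t0) / 1_000_000, a, (t2 - t1) / 1_000_000, b);
        }
    }
}
```

Comparing the two columns shows how much the exception path costs on your own JVM and data, which is the measurement the answer suggests taking before optimizing.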
Answer by Woody
This is a valid question because there are times when you need to infer what type of data is being represented in a string. For example, you may need to import a large CSV into a database and represent the data types accurately. In such cases, calling Long.parseLong and catching an exception can be too slow.
The following code only handles ASCII decimal:
    public class LongParser {
        // Since tryParseLong represents the value as negative during processing, we
        // counter-intuitively want to keep the sign if the result is negative and
        // negate it if it is positive.
        private static final int MULTIPLIER_FOR_NEGATIVE_RESULT = 1;
        private static final int MULTIPLIER_FOR_POSITIVE_RESULT = -1;
        private static final int FIRST_CHARACTER_POSITION = 0;
        private static final int SECOND_CHARACTER_POSITION = 1;
        private static final char NEGATIVE_SIGN_CHARACTER = '-';
        private static final char POSITIVE_SIGN_CHARACTER = '+';
        private static final int DIGIT_MAX_VALUE = 9;
        private static final int DIGIT_MIN_VALUE = 0;
        private static final char ZERO_CHARACTER = '0';
        private static final int RADIX = 10;

        /**
         * Parses a string representation of a long significantly faster than
         * <code>Long.parseLong</code>, and avoids the noteworthy overhead of
         * throwing an exception on failure. Based on the parseInt code from
         * http://nadeausoftware.com/articles/2009/08/java_tip_how_parse_integers_quickly
         *
         * @param stringToParse
         *            The string to try to parse as a <code>long</code>.
         *
         * @return the boxed <code>long</code> value if the string was a valid
         *         representation of a long; otherwise <code>null</code>.
         */
        public static Long tryParseLong(final String stringToParse) {
            if (stringToParse == null || stringToParse.isEmpty()) {
                return null;
            }
            final int inputStringLength = stringToParse.length();
            long value = 0;
            /*
             * The absolute value of Long.MIN_VALUE is greater than the absolute
             * value of Long.MAX_VALUE, so during processing we'll use a negative
             * value, then we'll multiply it by signMultiplier before returning it.
             * This allows us to avoid a conditional add/subtract inside the loop.
             */
            int signMultiplier = MULTIPLIER_FOR_POSITIVE_RESULT;
            // Get the first character.
            char firstCharacter = stringToParse.charAt(FIRST_CHARACTER_POSITION);
            if (firstCharacter == NEGATIVE_SIGN_CHARACTER) {
                // The first character is a negative sign.
                if (inputStringLength == 1) {
                    // There are no digits.
                    // The string is not a valid representation of a long value.
                    return null;
                }
                signMultiplier = MULTIPLIER_FOR_NEGATIVE_RESULT;
            } else if (firstCharacter == POSITIVE_SIGN_CHARACTER) {
                // The first character is a positive sign.
                if (inputStringLength == 1) {
                    // There are no digits.
                    // The string is not a valid representation of a long value.
                    return null;
                }
            } else {
                // Store the (negative) digit (although we aren't sure yet if it's
                // actually a digit).
                value = -(firstCharacter - ZERO_CHARACTER);
                if (value > DIGIT_MIN_VALUE || value < -DIGIT_MAX_VALUE) {
                    // The first character is not a digit (or a negative sign).
                    // The string is not a valid representation of a long value.
                    return null;
                }
            }
            // Establish the "maximum" value (actually minimum since we're working
            // with negatives).
            final long rangeLimit = (signMultiplier == MULTIPLIER_FOR_POSITIVE_RESULT)
                    ? -Long.MAX_VALUE
                    : Long.MIN_VALUE;
            // Capture the maximum value that we can multiply by the radix without
            // overflowing.
            final long maxLongNegatedPriorToMultiplyingByRadix = rangeLimit / RADIX;
            for (int currentCharacterPosition = SECOND_CHARACTER_POSITION;
                    currentCharacterPosition < inputStringLength;
                    currentCharacterPosition++) {
                // Get the current digit (although we aren't sure yet if it's
                // actually a digit).
                long digit = stringToParse.charAt(currentCharacterPosition)
                        - ZERO_CHARACTER;
                if (digit < DIGIT_MIN_VALUE || digit > DIGIT_MAX_VALUE) {
                    // The current character is not a digit.
                    // The string is not a valid representation of a long value.
                    return null;
                }
                if (value < maxLongNegatedPriorToMultiplyingByRadix) {
                    // The value will be out of range if we multiply by the radix.
                    // The string is not a valid representation of a long value.
                    return null;
                }
                // Multiply by the radix to slide all the previously parsed digits.
                value *= RADIX;
                if (value < (rangeLimit + digit)) {
                    // The value would be out of range if we "added" the current
                    // digit.
                    return null;
                }
                // "Add" the digit to the value.
                value -= digit;
            }
            // Return the value (adjusting the sign if needed).
            return value * signMultiplier;
        }
    }
Answer by Hannes
I hope this helps with positive values. I once used this method to validate database primary keys.
    private static final int MAX_LONG_STR_LEN = Long.toString(Long.MAX_VALUE).length();

    public static boolean validId(final CharSequence id)
    {
        //avoid null
        if (id == null)
        {
            return false;
        }
        int len = id.length();
        //avoid empty or oversize
        if (len < 1 || len > MAX_LONG_STR_LEN)
        {
            return false;
        }
        long result = 0;
        // ASCII '0' at position 48
        int digit = id.charAt(0) - 48;
        //first char cannot be '0' in my "id" case
        if (digit < 1 || digit > 9)
        {
            return false;
        }
        else
        {
            result += digit;
        }
        //start from 1, we already did the 0.
        for (int i = 1; i < len; i++)
        {
            // ASCII '0' at position 48
            digit = id.charAt(i) - 48;
            //only numbers
            if (digit < 0 || digit > 9)
            {
                return false;
            }
            result *= 10;
            result += digit;
            //if we hit 0x7fffffffffffffff
            // we are at 0x8000000000000000 + digit - 1
            // so negative
            if (result < 0)
            {
                //overflow
                return false;
            }
        }
        return true;
    }
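The overflow test in the loop above relies on two's-complement wraparound. Because the input is capped at 19 digits, the accumulated value stays below 2^63 until the last digit, and the largest possible 19-digit value (10^19 - 1) is still below 2^64, so any overflow on that last step lands in the negative range. The sketch below illustrates this for a single step; the class name OverflowStepDemo and the method wrapsNegative are hypothetical.

```java
// Hypothetical demo of the "result < 0" overflow check for one accumulation step.
final class OverflowStepDemo {
    private OverflowStepDemo() {}

    // Appends one decimal digit to a non-negative prefix of at most 18 digits
    // and reports whether the wrapped result is negative.
    static boolean wrapsNegative(long prefix, int digit) {
        long result = prefix * 10; // may wrap around silently
        result += digit;
        return result < 0;
    }

    public static void main(String[] args) {
        // 9223372036854775807 is Long.MAX_VALUE: still fits.
        System.out.println(wrapsNegative(922337203685477580L, 7));
        // 9223372036854775808 is one past Long.MAX_VALUE: wraps to a negative value.
        System.out.println(wrapsNegative(922337203685477580L, 8));
        // Largest 19-digit number: still below 2^64, so it also wraps to a negative value.
        System.out.println(wrapsNegative(999999999999999999L, 9));
    }
}
```

This is why the length cap matters: with 20 or more digits the wrapped value could come back positive, and the sign check alone would no longer detect the overflow.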