Java 8 Date and Time: parse ISO 8601 string without colon in offset

Notice: this page is a translated copy of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/46487403/

Date: 2020-08-12 02:22:48  Source: igfitidea

Tags: java, datetime, iso8601, timezone-offset, datetime-parsing

Asked by u_b

We try to parse the following ISO 8601 DateTime String with timezone offset:

final String input = "2022-03-17T23:00:00.000+0000";

OffsetDateTime.parse(input);
LocalDateTime.parse(input, DateTimeFormatter.ISO_OFFSET_DATE_TIME);

Both approaches fail (which makes sense, as OffsetDateTime.parse also uses DateTimeFormatter.ISO_OFFSET_DATE_TIME) because the timezone offset has no colon.

java.time.format.DateTimeParseException: Text '2022-03-17T23:00:00.000+0000' could not be parsed at index 23

But according to Wikipedia, there are 4 valid formats for a timezone offset:

<time>Z 
<time>±hh:mm 
<time>±hhmm 
<time>±hh

Other frameworks and languages can parse this string without any issues, e.g. the JavaScript Date() or Jackson's ISO8601Utils (they discuss this issue here).

Now we could write our own DateTimeFormatter with a complex RegEx, but in my opinion the java.time library should be able to parse this string by default, since it is valid ISO 8601.

For now we use Jackson's ISO8601DateFormat, but we would prefer to work with the official java.time library. What would be your approach to tackle this issue?

Accepted answer by Optional

If you want to parse all valid offset formats (Z, ±hh:mm, ±hhmm and ±hh), one alternative is to use a java.time.format.DateTimeFormatterBuilder with optional patterns (unfortunately, it seems that there's no single pattern letter that matches them all):

DateTimeFormatter formatter = new DateTimeFormatterBuilder()
    // date/time
    .append(DateTimeFormatter.ISO_LOCAL_DATE_TIME)
    // offset (hh:mm - "+00:00" when it's zero)
    .optionalStart().appendOffset("+HH:MM", "+00:00").optionalEnd()
    // offset (hhmm - "+0000" when it's zero)
    .optionalStart().appendOffset("+HHMM", "+0000").optionalEnd()
    // offset (hh - "Z" when it's zero)
    .optionalStart().appendOffset("+HH", "Z").optionalEnd()
    // create formatter
    .toFormatter();
System.out.println(OffsetDateTime.parse("2022-03-17T23:00:00.000+0000", formatter));
System.out.println(OffsetDateTime.parse("2022-03-17T23:00:00.000+00", formatter));
System.out.println(OffsetDateTime.parse("2022-03-17T23:00:00.000+00:00", formatter));
System.out.println(OffsetDateTime.parse("2022-03-17T23:00:00.000Z", formatter));

All four inputs above parse to 2022-03-17T23:00Z.

You can also define a single string pattern if you want, using [] to delimit the optional sections:

// formatter with all possible offset patterns
DateTimeFormatter formatter = DateTimeFormatter
    .ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSS[xxx][xx][X]");

This formatter also works for all four offset formats, just like the builder-based formatter above. Check the javadoc for more details about each pattern letter.

Notes:

  • A formatter with optional sections like the ones above is good for parsing, but not for formatting. When formatting, it prints all the optional sections, which means it prints the offset several times. So, to format the date, just use another formatter.
  • The second formatter accepts exactly 3 digits after the decimal point (because of .SSS). On the other hand, ISO_LOCAL_DATE_TIME is more flexible: the seconds and nanoseconds are optional, and it accepts from 0 to 9 digits after the decimal point. Choose the one that works best for your input data.
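
Both notes can be checked with a short sketch. The class name OptionalOffsetDemo and the helper parses are made up for illustration; the two parsers are the ones from this answer, and ISO_OFFSET_DATE_TIME is only one possible choice for the separate output formatter.

```java
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.DateTimeParseException;

public class OptionalOffsetDemo {
    // Pattern-based parser from the answer: requires exactly 3 fraction digits.
    static final DateTimeFormatter PATTERN_PARSER =
        DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSS[xxx][xx][X]");

    // Builder-based parser from the answer: seconds and fraction are optional.
    static final DateTimeFormatter BUILDER_PARSER = new DateTimeFormatterBuilder()
        .append(DateTimeFormatter.ISO_LOCAL_DATE_TIME)
        .optionalStart().appendOffset("+HH:MM", "+00:00").optionalEnd()
        .optionalStart().appendOffset("+HHMM", "+0000").optionalEnd()
        .optionalStart().appendOffset("+HH", "Z").optionalEnd()
        .toFormatter();

    // Hypothetical helper: true if the formatter can parse the text.
    static boolean parses(String text, DateTimeFormatter formatter) {
        try {
            OffsetDateTime.parse(text, formatter);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        OffsetDateTime odt = OffsetDateTime.of(2022, 3, 17, 23, 0, 0, 0, ZoneOffset.UTC);

        // Note 1: every optional section is printed, so the offset appears three times.
        System.out.println(PATTERN_PARSER.format(odt));  // 2022-03-17T23:00:00.000+00:00+0000Z
        // A dedicated output formatter prints the offset once.
        System.out.println(DateTimeFormatter.ISO_OFFSET_DATE_TIME.format(odt));  // 2022-03-17T23:00:00Z

        // Note 2: an input without fractional seconds.
        String noFraction = "2022-03-17T23:00:00+0000";
        System.out.println(parses(noFraction, PATTERN_PARSER));  // false
        System.out.println(parses(noFraction, BUILDER_PARSER));  // true
    }
}
```

If the input always carries exactly three fraction digits, either parser should do; the builder-based one is the safer pick when the precision varies.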

Answer by Jon Skeet

You don't need to write a complex regex - you can easily build a DateTimeFormatter that works with that format:

DateTimeFormatter formatter =
    DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH:mm:ss.SSSX", Locale.ROOT);

OffsetDateTime odt = OffsetDateTime.parse(input, formatter);

That will also accept "Z" instead of "+0000". It will not accept "+00:00" (with the colon) or similar. That's surprising given the documentation, but if your value always has the UTC offset without the colon, it should be okay.
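
As a quick check of which offsets the single-X pattern accepts, here is a minimal sketch. The class name SingleXDemo and the helper accepts are invented for this example; only the pattern string comes from the answer.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;

public class SingleXDemo {
    // Same pattern as in the answer: a single X for the offset.
    static final DateTimeFormatter FORMATTER =
        DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH:mm:ss.SSSX", Locale.ROOT);

    // Hypothetical helper: true if the text parses with FORMATTER.
    static boolean accepts(String text) {
        try {
            OffsetDateTime.parse(text, FORMATTER);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] inputs = {
            "2022-03-17T23:00:00.000+0000",   // offset without colon
            "2022-03-17T23:00:00.000Z",       // literal Z
            "2022-03-17T23:00:00.000+00:00"   // offset with colon
        };
        for (String input : inputs) {
            System.out.println(input + " -> " + accepts(input));
        }
    }
}
```

With the default strict parsing, the first two inputs should be accepted and the colon form rejected, which matches the behaviour described above.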

Answer by Lothar

I wouldn't call it a solution but a workaround. SimpleDateFormat's Z pattern letter supports the timezone syntax you showed, so you can do something like this:

final String input = "2022-03-17T23:00:00.000+0000";

try {
    OffsetDateTime.parse(input);
    LocalDateTime.parse(input, DateTimeFormatter.ISO_OFFSET_DATE_TIME);
}
catch (DateTimeParseException e) {
    SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SZ", Locale.GERMANY);
    sdf.parse(input);
}

You're still using official libraries shipped with the JVM. One of them isn't part of the java.time library, but still ;-)

Answer by Optional

Since the offset has no colon, you can use your own format string:

final String input = "2022-03-17T23:00:00.000+0000";

DateFormat df = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSZ");
Date parsed = df.parse(input);
System.out.println(parsed);
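
The SimpleDateFormat route yields a java.util.Date, which stores only an instant, so the offset written in the input is not kept. If the rest of the code works with java.time, the result can be bridged back with Date.toInstant(). The following is a minimal sketch of that idea; the class name LegacyBridgeDemo, the method parseLegacy, and the choice to normalise to UTC are assumptions made for illustration.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.util.Date;
import java.util.Locale;

public class LegacyBridgeDemo {
    // Hypothetical helper: parse with SimpleDateFormat, then convert to java.time.
    static OffsetDateTime parseLegacy(String input) {
        // SimpleDateFormat is not thread-safe, so a new instance is created per call.
        SimpleDateFormat sdf =
            new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSZ", Locale.ROOT);
        try {
            Date date = sdf.parse(input);
            // A Date is just an instant; here it is expressed at UTC.
            return date.toInstant().atOffset(ZoneOffset.UTC);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable date: " + input, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parseLegacy("2022-03-17T23:00:00.000+0000"));  // 2022-03-17T23:00Z
        System.out.println(parseLegacy("2022-03-17T23:00:00.000+0200"));  // 2022-03-17T21:00Z
    }
}
```

The second call shows the trade-off: the instant is right, but the original +02:00 offset is gone. When the offset itself matters, one of the DateTimeFormatter-based answers is the better fit.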