Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original: http://stackoverflow.com/questions/18591242/

Time: 2020-08-12 09:18:56  Source: igfitidea

Java - extract date from string using regex- failing

Tags: java, regex, date

Asked by DasDas

I'm trying to extract two dates from a string using a regex, but for some reason the regex doesn't match any dates. This is my code:

private  String[] getDate(String desc) {
    int count=0;
    String[] allMatches = new String[2];
    Matcher m = Pattern.compile("(0[1-9]|[12][0-9]|3[01])[- /.](0[1-9]|1[012])[- /.](19|20)\\d\\d(?:,)").matcher(desc);
    while (m.find()) {
        allMatches[count] = m.group();
    }
    return allMatches;
}

My string desc is "coming from the 11/25/2009 to the 11/30/2009", and I get back an array of nulls...

Accepted answer by Syon

You've got the month and the day of the month backwards, and (?:,) requires a comma at the end of each date. Try this instead:

(0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])[- /.](19|20)\d\d
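
As a minimal sketch of how the corrected pattern might be used from Java (the class name DateExtractor and the method extractDates are hypothetical, and collecting into a List instead of a fixed-size array is an assumption, not part of the answer), note that each backslash has to be doubled inside a Java string literal:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DateExtractor {
    // Month-first pattern from the answer above; "\\d" is the Java literal for the regex \d.
    private static final Pattern DATE = Pattern.compile(
            "(0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])[- /.](19|20)\\d\\d");

    // Hypothetical helper: returns every substring of text that matches DATE, in order.
    static List<String> extractDates(String text) {
        List<String> matches = new ArrayList<>();
        Matcher m = DATE.matcher(text);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // prints [11/25/2009, 11/30/2009]
        System.out.println(extractDates("coming from the 11/25/2009 to the 11/30/2009"));
    }
}
```

A List grows with the number of matches, so no counter is needed and a third date in the input cannot overflow an array.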

Answered by sp00m

Your regex matches day first and then month (DD/MM/YYYY), while your inputs start with month and then day (MM/DD/YYYY).

Moreover, your dates must be followed by a comma to be matched (the (?:,) part).

This one should suit your needs:

(0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])[- /.](19|20)\d\d

[Figure: regular expression visualization. Diagram by Debuggex.]
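
To see the effect of the trailing (?:,) in isolation, here is a small sketch (the class TrailingCommaDemo and the helper found are hypothetical names used only for illustration) that runs the month-first pattern with and without the comma group:

```java
import java.util.regex.Pattern;

public class TrailingCommaDemo {
    // Month-first date pattern; backslashes doubled for the Java string literal.
    static final String DATE =
            "(0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])[- /.](19|20)\\d\\d";

    // Hypothetical helper: true if regex matches anywhere in text.
    static boolean found(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        String input = "coming from the 11/25/2009 to the 11/30/2009";
        // With (?:,) a comma must directly follow the year, so nothing matches.
        System.out.println(found(DATE + "(?:,)", input));               // prints false
        // The same pattern matches once a comma follows the date.
        System.out.println(found(DATE + "(?:,)", "on 11/25/2009, ok")); // prints true
        // Without the comma group the original input matches.
        System.out.println(found(DATE, input));                         // prints true
    }
}
```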

Answered by Arnaud Denoyelle

Three problems:

1) You are trying to parse dates in the format MM/dd/YYYY, whereas your regex expects the format dd/MM/YYYY.

2) You forgot to increment count in the while loop.

3) The (?:,) part at the end of the regex is worse than useless: it requires a comma after each date, which your input does not have.

This code works on my computer (it keeps the day-first regex, so the test input below uses dd/MM/YYYY):

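// Requires java.util.regex.Matcher and java.util.regex.Pattern.
private static String[] getDate(String desc) {
  int count = 0;
  String[] allMatches = new String[2];
  // Day-first pattern (dd/MM/YYYY); "\\d" is the Java string literal for the regex \d.
  Matcher m = Pattern.compile("(0[1-9]|[12][0-9]|3[01])[- /.](0[1-9]|1[012])[- /.](19|20)\\d\\d").matcher(desc);
  // Stop once the array is full so that a third date cannot overflow it.
  while (count < allMatches.length && m.find()) {
    allMatches[count] = m.group();
    count++;
  }
  return allMatches;
}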

Test:

public static void main(String[] args) throws Exception{
  String[] dates = getDate("coming from the 25/11/2009 to the 30/11/2009");

  System.out.println(dates[0]);
  System.out.println(dates[1]);

}

Output:

25/11/2009
30/11/2009

Answered by vaibhav singh

This is a date pattern recognition algorithm that not only identifies date patterns but also returns the probable date in a Java date format. The algorithm is fast and lightweight: processing time is linear, and all dates are identified in a single pass. It resolves dates using a tree traversal mechanism; custom tree data structures are built for the supported date, time and month patterns.

The algorithm also accepts multiple space characters between date literals, e.g. DD DD DD is considered a valid date whether the literals are separated by one space or by several.

The following date patterns are considered valid and are identifiable using this algorithm:

- dd MM(MM) yy(yy)
- yy(yy) MM(MM) dd
- MM(MM) dd yy(yy)

where M is a month literal, which may be in alphabetic format, like Jan or January.

Allowed delimiters between dates are '/', '\', ' ', ',', '|', '-', ' '

It also recognizes a trailing time pattern in the following formats:

- hh(24):mm:ss.SSS am / pm
- hh(24):mm:ss am / pm

Resolution time is linear; no pattern matching or brute force is used. The algorithm is based on tree traversal and returns a list of dates, each with the following three components:

- the date string identified in the text
- the converted and formatted date string
- the SimpleDateFormat pattern

Using the date string and the format string, users are free to convert the string into objects based on their requirements.

The algorithm library is available at Maven Central.

<dependency>
    <groupId>net.rationalminds</groupId>
    <artifactId>DateParser</artifactId>
    <version>0.3.0</version>
</dependency>

The sample code to use this is below.

import java.util.List;
import net.rationalminds.LocalDateModel;
import net.rationalminds.Parser;

public class Test {
    public static void main(String[] args) throws Exception {
        Parser parser = new Parser();
        List<LocalDateModel> dates = parser.parse("Identified date :'2015-January-10 18:00:01.704', converted");
        System.out.println(dates);
    }
}

Output: [LocalDateModel{originalText=2015-january-10 18:00:01.704, dateTimeString=2015-1-10 18:00:01.704, conDateFormat=yyyy-MM-dd HH:mm:ss.SSS, start=18, end=46}]
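
As a rough sketch of the conversion step mentioned earlier, the dateTimeString and conDateFormat values shown in the output can be fed to java.text.SimpleDateFormat. The two strings below are copied by hand from that output; how to read them off a LocalDateModel object is not shown here because the accessor names are not given above. The class ConvertParsedDate and the method toDate are hypothetical.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

public class ConvertParsedDate {
    // Hypothetical helper: parses dateTimeString with the given SimpleDateFormat pattern.
    static Date toDate(String dateTimeString, String format) {
        try {
            return new SimpleDateFormat(format).parse(dateTimeString);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse: " + dateTimeString, e);
        }
    }

    public static void main(String[] args) {
        // Values copied from the output above (assumption: they are used unchanged).
        Date date = toDate("2015-1-10 18:00:01.704", "yyyy-MM-dd HH:mm:ss.SSS");
        // Format it back in the same default time zone to inspect the fields.
        // prints 2015-01-10T18:00:01.704
        System.out.println(new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSS").format(date));
    }
}
```

When parsing, SimpleDateFormat accepts the single-digit month "1" for the "MM" field because the fields are separated by delimiters.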

Detailed blog at http://coffeefromme.blogspot.com/2015/10/how-to-extract-date-object-from-given.html

The complete source is available on GitHub at https://github.com/vbhavsingh/DateParser

Answered by Basil Bourque

LocalDate.parse instead of regex

Regex can be overkill for such a problem.

You could just split the string on the SPACE character, and attempt to parse each element as a LocalDate. If the parse fails, move on to the next element.

import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.List;

String input = "coming from the 11/25/2009 to the 11/30/2009" ;
String[] elements = input.split( " " ) ;
DateTimeFormatter f = DateTimeFormatter.ofPattern( "MM/dd/uuuu" ) ;
List<LocalDate> dates = new ArrayList<>() ;
for( String element : elements ) {
    try {
        LocalDate ld = LocalDate.parse( element , f ) ;
        dates.add( ld ) ;
    } catch ( DateTimeParseException e ) {
        // Ignore the exception. Move on to next element.
    }
}
System.out.println( "dates: " + dates ) ;

See this code run live at IdeOne.com.

dates: [2009-11-25, 2009-11-30]
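
The two approaches in this thread can also be combined. The sketch below is an assumption, not part of any answer above: a loose regex locates candidates even when punctuation follows them, and LocalDate.parse with ResolverStyle.STRICT rejects impossible dates such as 02/30/2009 that a digit-range regex alone would accept. The class StrictDateExtractor and the method extract are hypothetical names.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StrictDateExtractor {
    // Loose candidate pattern: two digits, two digits, four digits, separated by slashes.
    private static final Pattern CANDIDATE = Pattern.compile("\\b\\d{2}/\\d{2}/\\d{4}\\b");

    // STRICT resolution refuses dates that do not exist in the calendar.
    private static final DateTimeFormatter STRICT =
            DateTimeFormatter.ofPattern("MM/dd/uuuu").withResolverStyle(ResolverStyle.STRICT);

    // Hypothetical helper: returns every candidate that is a real month-first date.
    static List<LocalDate> extract(String text) {
        List<LocalDate> dates = new ArrayList<>();
        Matcher m = CANDIDATE.matcher(text);
        while (m.find()) {
            try {
                dates.add(LocalDate.parse(m.group(), STRICT));
            } catch (DateTimeParseException e) {
                // Not a valid calendar date; skip this candidate.
            }
        }
        return dates;
    }

    public static void main(String[] args) {
        // prints [2009-11-25, 2009-11-30]
        System.out.println(extract("from 11/25/2009, to 02/30/2009 and 11/30/2009."));
    }
}
```

Splitting on spaces would leave "11/25/2009," with its comma attached and fail to parse it, which is why this sketch uses a regex only to find the candidates.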
