Java: match an array against a string in Java

Note: this page is a translation of a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/1672416/

Date: 2020-10-29 17:32:27. Source: igfitidea

match array against string in java

Tags: java, arrays, string, match

Asked by karunga

I'm reading a file using a BufferedReader, so let's say I have

line = br.readLine();

I want to check whether this line contains one of many possible strings (which I have in an array). I would like to be able to write something like:

while (!line.matches(stringArray)) { // not sure how to write this conditional
  // do something here
  line = br.readLine();
}

I'm fairly new to programming and Java. Am I going about this the right way?

Accepted answer by Aaron Digulla

Copy all values into a Set<String> and then use contains():

Set<String> set = new HashSet<String> (Arrays.asList (stringArray));
while (!set.contains(line)) { ... }
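
As a minimal runnable sketch of the exact-match approach: the class name SetLookupDemo, the helper firstExactMatch, and the sample data are assumptions made for illustration, not part of the answer. The sketch also stops at end of input, because readLine() returns null there and set.contains(null) would otherwise stay false forever.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class SetLookupDemo {
    // Skips lines until one equals an entry of the set.
    // Returns that line, or null if the input ends first.
    static String firstExactMatch(BufferedReader br, Set<String> set) {
        try {
            String line = br.readLine();
            while (line != null && !set.contains(line)) {
                line = br.readLine();
            }
            return line;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String[] stringArray = { "END", "STOP" }; // sample values, assumed
        Set<String> set = new HashSet<>(Arrays.asList(stringArray));
        // A StringReader stands in for the file being read.
        BufferedReader br = new BufferedReader(new StringReader("alpha\nbeta\nSTOP\ngamma"));
        System.out.println(firstExactMatch(br, set)); // prints "STOP"
    }
}
```

Note that contains() on the set only succeeds when the whole line equals one of the entries.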

[EDIT] If you want to find out whether a part of the line contains a string from the set, you have to loop over the set. Replace set.contains(line) with a call to:

public boolean matches(Set<String> set, String line) {
    for (String check: set) {
        if (line.contains(check)) return true;
    }
    return false;
}

Adjust the check accordingly when you use regexp or a more complex method for matching.
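
One way to read "adjust the check" is sketched here, assuming a case-insensitive comparison is wanted. The class name SubstringMatchDemo, the method matchesIgnoreCase, and the sample strings are hypothetical; matches() is the helper from the answer, made static so it can be called from main.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class SubstringMatchDemo {
    // The helper from the answer: true if the line contains any entry of the set.
    static boolean matches(Set<String> set, String line) {
        for (String check : set) {
            if (line.contains(check)) return true;
        }
        return false;
    }

    // An adjusted check (assumption): same loop, but ignoring case.
    static boolean matchesIgnoreCase(Set<String> set, String line) {
        String lowered = line.toLowerCase(Locale.ROOT);
        for (String check : set) {
            if (lowered.contains(check.toLowerCase(Locale.ROOT))) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> set = new HashSet<>(Arrays.asList("error", "warn")); // sample values, assumed
        System.out.println(matches(set, "an error occurred"));           // true
        System.out.println(matches(set, "An ERROR occurred"));           // false
        System.out.println(matchesIgnoreCase(set, "An ERROR occurred")); // true
    }
}
```

Only the condition inside the loop changes; the surrounding loop over the set stays the same.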

[EDIT2] A third option is to concatenate the elements of the array into one large regexp, separated by |:

Pattern p = Pattern.compile("str1|str2|str3");

while (!p.matcher(line).find()) { // or matches for a whole-string match
    ...
}

This can be cheaper if you have many elements in the array, since the pattern is compiled once and the regexp code may optimize the matching process.
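
The pattern does not have to be typed out by hand. The sketch below builds it from the array, assuming the entries are literal strings rather than regexps and that the array is not empty; the class name AlternationDemo, the helper anyOf, and the sample values are made up for illustration. Pattern.quote() keeps characters such as "." from being read as regexp syntax.

```java
import java.util.regex.Pattern;

class AlternationDemo {
    // Joins the entries with "|", quoting each one so it is matched literally.
    // Assumes a non-empty array: an empty pattern would match every line.
    static Pattern anyOf(String[] words) {
        StringBuilder sb = new StringBuilder();
        for (String w : words) {
            if (sb.length() > 0) sb.append('|');
            sb.append(Pattern.quote(w));
        }
        return Pattern.compile(sb.toString());
    }

    public static void main(String[] args) {
        String[] stringArray = { "str1", "a.b", "str3" }; // sample values, assumed
        Pattern p = anyOf(stringArray);                   // compiled once, reused per line
        System.out.println(p.matcher("xx str3 yy").find()); // true: substring match
        System.out.println(p.matcher("xx aXb yy").find());  // false: "." is literal here
        System.out.println(p.matcher("a.b").matches());     // true: whole-string match
    }
}
```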

Answer by cletus

It depends on what stringArray is. If it's a Collection, then fine. If it's a true array, you should make it a Collection. The Collection interface has a method called contains() that will determine whether a given Object is in the Collection.

A simple way to turn an array into a Collection:

String[] tokens = { ... };
List<String> list = Arrays.asList(tokens);

The problem with a List is that lookup is expensive (technically linear, or O(n)). A better bet is to use a Set such as HashSet, which is unordered but has near-constant (O(1)) lookup. You can construct one like this:

From a Collection:

Set<String> set = new HashSet<String>(stringList);

From an array:

Set<String> set = new HashSet<String>(Arrays.asList(stringArray));

and then set.contains(line) will be a cheap operation.

Edit: OK, I think your question wasn't clear. You want to see if the line contains any of the words in the array. What you want then is something like this:

BufferedReader in = null;
Set<String> words = ... // construct this as per above
try {
  in = ...
  String line; // a variable cannot be declared inside the loop condition
  while ((line = in.readLine()) != null) {
    for (String word : words) {
      if (line.contains(word)) {
        // do whatever
      }
    }
  }
} catch (Exception e) {
  e.printStackTrace();
} finally {
  if (in != null) { try { in.close(); } catch (Exception e) { } }
}

This is quite a crude check, which is used surprisingly often and tends to give annoying false positives on words like "scrap". For a more sophisticated solution you probably have to use regular expressions and look for word boundaries:

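// In a Java string literal "\b" is a backspace; the regex word boundary is written "\\b".
// Pattern.quote() makes the word match literally even if it contains regex characters.
Pattern p = Pattern.compile("\\b" + Pattern.quote(word) + "\\b");
Matcher m = p.matcher(line);
if (m.find()) {
  // word found
}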

You will probably want to do this more efficiently (like not compiling the pattern with every line) but that's the basic tool to use.
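
A rough sketch of the more efficient variant: all words go into a single pattern that is compiled once, before the read loop. The class name WordBoundaryDemo, the helper wholeWords, and the sample words are assumptions for illustration. It also assumes every word starts and ends with a letter, digit, or underscore, since \b only marks a boundary next to such characters.

```java
import java.util.regex.Pattern;

class WordBoundaryDemo {
    // Builds \b(?:w1|w2|...)\b once so it can be reused for every line.
    static Pattern wholeWords(String[] words) {
        StringBuilder sb = new StringBuilder("\\b(?:");
        for (int i = 0; i < words.length; i++) {
            if (i > 0) sb.append('|');
            sb.append(Pattern.quote(words[i])); // match each word literally
        }
        return Pattern.compile(sb.append(")\\b").toString());
    }

    public static void main(String[] args) {
        Pattern p = wholeWords(new String[] { "rap", "darn" }); // sample words, assumed
        String[] lines = { "scrap metal", "a rap song", "well, darn it" };
        for (String line : lines) {
            // "scrap metal" gives false, the other two give true
            System.out.println(line + " -> " + p.matcher(line).find());
        }
    }
}
```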

Answer by Adrian Park

Using the String.matches(regex) function, what about creating a regular expression that matches any one of the strings in the string array? Something like:

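// matches() tests the whole line, so wrap the alternatives in ".*( ... ).*"
String regex = ".*(";
for (int i = 0; i < array.length - 1; ++i)
  regex += array[i] + "|";
regex += array[array.length - 1] + ").*"; // last element is at index length - 1
while (line.matches(regex))
{
  //. . .
}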