Original question: http://stackoverflow.com/questions/2355293/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
How to use a regex to search backwards effectively?
Asked by Asaf
I'm searching forward in an array of strings with a regex, like this:
for (int j = line; j < lines.length; j++) {
    if (lines[j] == null || lines[j].isEmpty()) {
        continue;
    }
    matcher = pattern.matcher(lines[j]);
    if (matcher.find(offset)) {
        offset = matcher.end();
        line = j;
        System.out.println("found \""+matcher.group()+"\" at line "+line+" ["+matcher.start()+","+offset+"]");
        return true;
    }
    offset = 0;
}
return false;
Note that in my implementation above I save the line and offset for continuous searches.
Anyway, now I want to search backwards from that [line,offset].
My question: is there a way to search backwards with a regex efficiently? If not, what could be an alternative?
Clarification: By backwards I mean finding the previous match.
For example, say that I'm searching for "dana" in
"dana nama? dana kama! lama dana kama?"
and got to the 2nd match. If I do matcher.find() again, I'll search forward and get the 3rd match. But I want to search backwards and get to the 1st match.
The code above should then output something like:
found "dana" at line 0 [0,3] // fwd
found "dana" at line 0 [11,14] // fwd
found "dana" at line 0 [0,3] // bwd
Accepted answer by Jan Goyvaerts
Java's regular expression engine cannot search backwards. In fact, the only regex engine that I know that can do that is the one in .NET.
Instead of searching backwards, iterate over all the matches in a loop (searching forward). If the match is prior to the position you want, remember it. If the match is after the position you want, exit from the loop. In pseudo code (my Java is a little rusty):
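String storedMatch = "";               // stays empty if nothing matches before offset
while (matcher.find()) {
    if (matcher.end() < offset) {
        storedMatch = matcher.group(); // latest match that ends before offset
    } else {
        break;                         // this match ends at or after offset
    }
}
return storedMatch;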
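As a rough illustration of the loop above, here is a minimal self-contained sketch. The class name PreviousMatchDemo and the helper findPrevious are hypothetical names invented for this example; only Pattern, Matcher and MatchResult come from java.util.regex. It assumes the whole input is scanned from the start on every call, which is simple but costs time proportional to the text before the offset.

```java
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PreviousMatchDemo {
    // Hypothetical helper: returns the last match that ends before 'offset',
    // or null if there is no such match.
    static MatchResult findPrevious(Pattern pattern, CharSequence text, int offset) {
        Matcher matcher = pattern.matcher(text);
        MatchResult previous = null;
        while (matcher.find()) {
            if (matcher.end() >= offset) {
                break; // this match ends at or after the position of interest
            }
            previous = matcher.toMatchResult(); // snapshot, unaffected by later find() calls
        }
        return previous;
    }

    public static void main(String[] args) {
        String text = "dana nama? dana kama! lama dana kama?";
        Pattern pattern = Pattern.compile("dana");
        // 15 is the (exclusive) end offset of the second "dana", which starts at index 11.
        MatchResult m = findPrevious(pattern, text, 15);
        System.out.println(m == null
                ? "no previous match"
                : "found \"" + m.group() + "\" at [" + m.start() + "," + m.end() + "]");
        // prints: found "dana" at [0,4]
    }
}
```

Passing the end offset of the current match (as saved in the question's offset variable) excludes the current match itself, so the result is the match before it.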
Answer by luca
The following class searches backward (and forward, of course).
I used this class in an application where users can search for strings in a long text (like the search feature in a web browser), so it's tested and works well for practical use cases.
It uses an approach similar to what Jan Goyvaerts describes. It selects a block of text before the start position and searches it forwards, returning the last match if there is one. If there is no match, it selects a new block of text before that block and searches it in the same way.
Use it like this:
Search s = new Search("Big long text here to be searched [...]");
s.setPattern("some regexp");
// search backwards or forward as many times as you like,
// the class keeps track where the last match was
MatchResult where = s.searchBackward();
where = s.searchBackward(); // next match
where = s.searchBackward(); // next match
//or search forward
where = s.searchForward();
where = s.searchForward();
And the class:
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/*
 * Search regular expressions or simple text forward and backward in a CharSequence.
 *
 * To simulate the backward search (which Java's Matcher doesn't have), the input data
 * is divided into chunks and each chunk is searched from last to first until a
 * match is found (matches within a chunk are returned from last to first too).
 *
 * The search can fail if the pattern/match you look for is longer than the chunk
 * size, but you can set the chunk size to a sensible size depending on the specific
 * application.
 *
 * Also, because a match could span two adjacent chunks, the chunks are
 * partially overlapping. Again, this overlapping size should be set to a sensible
 * size.
 *
 * A typical application where the user searches for some words in a document will
 * work perfectly fine with the default values. The matches are expected to be between
 * 10-15 chars, so any chunk size and overlapping size bigger than this expected
 * length will be fine.
 */
public class Search {
    private int BACKWARD_BLOCK_SIZE = 200;
    private int BACKWARD_OVERLAPPING = 20;
    private Matcher myFwdMatcher;
    private Matcher myBkwMatcher;
    private String mySearchPattern;
    private int myCurrOffset;
    private boolean myRegexp;
    private CharSequence mySearchData;

    public Search(CharSequence searchData) {
        mySearchData = searchData;
        mySearchPattern = "";
        myCurrOffset = 0;
        myRegexp = true;
        clear();
    }

    public void clear() {
        myFwdMatcher = null;
        myBkwMatcher = null;
    }

    public String getPattern() {
        return mySearchPattern;
    }

    public void setPattern(String toSearch) {
        if ( !mySearchPattern.equals(toSearch) ) {
            mySearchPattern = toSearch;
            clear();
        }
    }

    public CharSequence getText() {
        return mySearchData;
    }

    public void setText(CharSequence searchData) {
        mySearchData = searchData;
        clear();
    }

    public void setSearchOffset(int startOffset) {
        if (myCurrOffset != startOffset) {
            myCurrOffset = startOffset;
            clear();
        }
    }

    public boolean isRegexp() {
        return myRegexp;
    }

    public void setRegexp(boolean regexp) {
        if (myRegexp != regexp) {
            myRegexp = regexp;
            clear();
        }
    }

    public MatchResult searchForward() {
        if (mySearchData != null) {
            boolean found;
            if (myFwdMatcher == null)
            {
                // if it's a new search, start from the current offset
                String searchPattern = myRegexp ? mySearchPattern : Pattern.quote(mySearchPattern);
                myFwdMatcher = Pattern.compile(searchPattern, Pattern.CASE_INSENSITIVE).matcher(mySearchData);
                try {
                    found = myFwdMatcher.find(myCurrOffset);
                } catch (IndexOutOfBoundsException e) {
                    found = false;
                }
            }
            else
            {
                // continue searching
                found = myFwdMatcher.hitEnd() ? false : myFwdMatcher.find();
            }
            if (found) {
                MatchResult result = myFwdMatcher.toMatchResult();
                return onMatchResult(result);
            }
        }
        return onMatchResult(null);
    }

    public MatchResult searchBackward() {
        if (mySearchData != null) {
            myFwdMatcher = null;
            if (myBkwMatcher == null)
            {
                // if it's a new search, create a new matcher
                String searchPattern = myRegexp ? mySearchPattern : Pattern.quote(mySearchPattern);
                myBkwMatcher = Pattern.compile(searchPattern, Pattern.CASE_INSENSITIVE).matcher(mySearchData);
            }
            MatchResult result = null;
            boolean startOfInput = false;
            int start = myCurrOffset;
            int end = start;
            while (result == null && !startOfInput)
            {
                start -= BACKWARD_BLOCK_SIZE;
                if (start < 0) {
                    start = 0;
                    startOfInput = true;
                }
                try {
                    myBkwMatcher.region(start, end);
                } catch (IndexOutOfBoundsException e) {
                    break;
                }
                // keep the last match found in this chunk
                while ( myBkwMatcher.find() ) {
                    result = myBkwMatcher.toMatchResult();
                }
                // depending on the size of the pattern this overlap might not be enough,
                // but how can you know the size of a regexp match beforehand?
                end = start + BACKWARD_OVERLAPPING;
            }
            return onMatchResult(result);
        }
        return onMatchResult(null);
    }

    private MatchResult onMatchResult(MatchResult result) {
        if (result != null) {
            myCurrOffset = result.start();
        }
        return result;
    }
}
If you'd like to test the class, here is a usage example:
import java.awt.*;
import java.awt.event.*;
import javax.swing.*;
import javax.swing.event.*;
import java.util.regex.MatchResult;
import javax.swing.text.DefaultHighlighter;
import javax.swing.text.BadLocationException;

public class SearchTest extends JPanel implements ActionListener {
    protected JScrollPane scrollPane;
    protected JTextArea textArea;
    protected boolean docChanged = true;
    protected Search searcher;

    public SearchTest() {
        super(new BorderLayout());
        searcher = new Search("");
        JButton backButton = new JButton("Search backward");
        JButton fwdButton = new JButton("Search forward");
        JPanel buttonPanel = new JPanel(new BorderLayout());
        buttonPanel.add(fwdButton, BorderLayout.EAST);
        buttonPanel.add(backButton, BorderLayout.WEST);
        textArea = new JTextArea("Big long text here to be searched...", 20, 40);
        textArea.setEditable(true);
        scrollPane = new JScrollPane(textArea);
        final JTextField textField = new JTextField(40);
        //Add Components to this panel.
        add(buttonPanel, BorderLayout.NORTH);
        add(scrollPane, BorderLayout.CENTER);
        add(textField, BorderLayout.SOUTH);
        //Add actions
        backButton.setActionCommand("back");
        fwdButton.setActionCommand("fwd");
        backButton.addActionListener(this);
        fwdButton.addActionListener(this);
        textField.addActionListener( new ActionListener() {
            public void actionPerformed(ActionEvent e) {
                final String pattern = textField.getText();
                searcher.setPattern(pattern);
            }
        } );
        textArea.getDocument().addDocumentListener( new DocumentListener() {
            public void insertUpdate(DocumentEvent e) { docChanged = true; }
            public void removeUpdate(DocumentEvent e) { docChanged = true; }
            public void changedUpdate(DocumentEvent e) { docChanged = true; }
        });
    }

    public void actionPerformed(ActionEvent e) {
        if ( docChanged ) {
            final String newDocument = textArea.getText();
            searcher.setText(newDocument);
            docChanged = false;
        }
        MatchResult where = null;
        if ("back".equals(e.getActionCommand())) {
            where = searcher.searchBackward();
        } else if ("fwd".equals(e.getActionCommand())) {
            where = searcher.searchForward();
        }
        textArea.getHighlighter().removeAllHighlights();
        if (where != null) {
            final int start = where.start();
            final int end = where.end();
            // highlight result and scroll
            try {
                textArea.getHighlighter().addHighlight(start, end, new DefaultHighlighter.DefaultHighlightPainter(Color.yellow));
            } catch (BadLocationException excp) {}
            textArea.scrollRectToVisible(new Rectangle(0, 0, scrollPane.getViewport().getWidth(), scrollPane.getViewport().getHeight()));
            SwingUtilities.invokeLater(new Runnable() {
                @Override
                public void run() { textArea.setCaretPosition(start); }
            });
        } else {
            // no match, so let's wrap around
            if ("back".equals(e.getActionCommand())) {
                searcher.setSearchOffset( searcher.getText().length() -1 );
            } else if ("fwd".equals(e.getActionCommand())) {
                searcher.setSearchOffset(0);
            }
        }
    }

    private static void createAndShowGUI() {
        //Create and set up the window.
        JFrame frame = new JFrame("SearchTest");
        frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
        //Add contents to the window.
        frame.add(new SearchTest());
        //Display the window.
        frame.pack();
        frame.setVisible(true);
    }

    public static void main(String[] args) {
        //Schedule a job for the event dispatch thread:
        //creating and showing this application's GUI.
        javax.swing.SwingUtilities.invokeLater(new Runnable() {
            public void run() {
                createAndShowGUI();
            }
        });
    }
}
Answer by kosotd
I use the following simple class to search backwards in Java:
import java.util.Stack;
import java.util.regex.MatchResult;
import java.util.regex.Matcher;

public class ReverseMatcher {
    private final Matcher _matcher;
    // matches collected while scanning forward; the top of the stack is the latest one
    private final Stack<MatchResult> _results = new Stack<>();

    public ReverseMatcher(Matcher matcher){
        _matcher = matcher;
    }

    public boolean find(){
        return find(_matcher.regionEnd());
    }

    public boolean find(int start){
        // later calls: drop the match reported last time and expose the one before it
        if (_results.size() > 0){
            _results.pop();
            return _results.size() > 0;
        }
        // first call: collect every match that ends at or before 'start'
        boolean res = false;
        while (_matcher.find()){
            if (_matcher.end() > start)
                break;
            res = true;
            _results.push(_matcher.toMatchResult());
        }
        return res;
    }

    public String group(int group){
        return _results.peek().group(group);
    }

    public String group(){
        return _results.peek().group();
    }

    public int start(){
        return _results.peek().start();
    }

    public int end(){
        return _results.peek().end();
    }
}
Usage:
String srcString = "1 2 3 4 5 6 7 8 9";
String pattern = "\\b[0-9]+\\b";
Pattern p = Pattern.compile(pattern);
Matcher m = p.matcher(srcString);
ReverseMatcher rm = new ReverseMatcher(m);
while (rm.find())
    System.out.print(rm.group() + " ");
output: 9 8 7 6 5 4 3 2 1
or
while (rm.find(9))
    System.out.print(rm.group() + " ");
output: 5 4 3 2 1
Answer by buildingKofi
If the previous match is one that you have already found while going forward, then what about building a list of matched positions while searching forward and just using it to jump back instead of searching backward?
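A minimal sketch of that idea, assuming the text does not change between searches. The class name MatchHistory and its methods next and previous are hypothetical names made up for this illustration; the only library calls used are Matcher.find() and Matcher.toMatchResult().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchHistory {
    private final Matcher matcher;
    private final List<MatchResult> seen = new ArrayList<>(); // matches found so far, in order
    private int current = -1;          // index in 'seen' of the current match, -1 before the first
    private boolean exhausted = false; // true once the regex has reported no further match

    MatchHistory(Pattern pattern, CharSequence text) {
        this.matcher = pattern.matcher(text);
    }

    // Moves to the next match; the regex only runs when that match is not recorded yet.
    MatchResult next() {
        if (current + 1 < seen.size()) {
            return seen.get(++current);
        }
        if (!exhausted && matcher.find()) {
            seen.add(matcher.toMatchResult());
            return seen.get(++current);
        }
        exhausted = true;
        return null; // no further match; the current position is unchanged
    }

    // Steps back to the previous match using the recorded list only.
    MatchResult previous() {
        if (current <= 0) {
            return null; // already at (or before) the first match
        }
        return seen.get(--current);
    }
}
```

With the question's example text and the pattern "dana", calling next(), next(), previous() yields matches starting at 0, 11 and 0. The trade-off is memory for every match seen so far, and the history has to be rebuilt whenever the text or the pattern changes.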
Answer by SF.
Is the search string strictly a regex (full, rich syntax)? Because if not: for(int j = line; j >= 0; j--), reverse the line, reverse the search string, and search forward ;)
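For a plain (non-regex) search string, the reversal trick could look roughly like the sketch below. ReversedLiteralSearch and findBackward are hypothetical names used only for illustration, and the code assumes 0 <= offset <= line.length() and text without surrogate pairs (StringBuilder.reverse() keeps surrogate pairs in their original order).

```java
class ReversedLiteralSearch {
    // Returns the start index, in the original line, of the last occurrence of
    // 'needle' that ends at or before 'offset', or -1 if there is none.
    static int findBackward(String line, String needle, int offset) {
        // Only the text before 'offset' matters when searching backwards.
        String reversedLine = new StringBuilder(line.substring(0, offset)).reverse().toString();
        String reversedNeedle = new StringBuilder(needle).reverse().toString();

        // The first hit in the reversed text is the last hit in the original text.
        int reversedIndex = reversedLine.indexOf(reversedNeedle);
        if (reversedIndex < 0) {
            return -1;
        }
        // Map the position back to the original orientation.
        return offset - reversedIndex - needle.length();
    }

    public static void main(String[] args) {
        String line = "dana nama? dana kama! lama dana kama?";
        System.out.println(findBackward(line, "dana", 15)); // 11
        System.out.println(findBackward(line, "dana", 11)); // 0
    }
}
```

For plain strings, line.lastIndexOf(needle, offset - needle.length()) returns the same index without building reversed copies, so the reversal is mainly useful as a way to see why the approach does not carry over to arbitrary regexes.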


