string 查找最长子串而不重复字符

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/9734474/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-09 01:25:20  来源:igfitidea点击:

Find longest substring without repeating characters

stringalgorithm

提问by Rajendra Uppal

Given a string Sof length Nfind longest substring without repeating characters.

给定 a string Sof length Nfind 最长子串,没有重复字符。

Example:

例子:

Input:"stackoverflow"

输入:“堆栈溢出”

Output:"stackoverfl"

输出:“stackoverfl”

If there are two such candidates, return first from left. I need linear time and constant space algorithm.

如果有两个这样的候选人,从左边第一个返回。我需要线性时间和恒定空间算法。

回答by Karoly Horvath

  1. You are going to need a start and an end locator(/pointer) for the string and an array where you store information for each character: did it occour at least once?

  2. Start at the beginning of the string, both locators point to the start of the string.

  3. Move the end locator to the right till you find a repetition (or reach the end of the string). For each processed character, store it in the array. When stopped store the position if this is the largest substring. Also remember the repeated character.

  4. Now do the same thing with the start locator, when processing each character, remove its flags from the array. Move the locator till you find the earlier occurrence of the repeated character.

  5. Go back to step 3 if you haven't reached the end of string.

  1. 您将需要一个字符串的开始和结束定位器(/指针)和一个数组,您可以在其中存储每个字符的信息:它是否至少出现过一次?

  2. 从字符串的开头开始,两个定位器都指向字符串的开头。

  3. 向右移动结束定位符,直到找到重复(或到达字符串的末尾)。对于每个处理过的字符,将其存储在数组中。如果这是最大的子字符串,则在停止时存储位置。还要记住重复的字符。

  4. 现在对开始定位器做同样的事情,在处理每个字符时,从数组中删除它的标志。移动定位器,直到找到较早出现的重复字符。

  5. 如果您还没有到达字符串的末尾,请返回步骤 3。

Overall: O(N)

总体:O(N)

回答by Rajendra Uppal

import java.util.HashSet;

public class SubString {
    public static String subString(String input){

        HashSet<Character> set = new HashSet<Character>();

        String longestOverAll = "";
        String longestTillNow = "";

        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);

            if (set.contains(c)) {
                longestTillNow = "";
                set.clear();
            }
            longestTillNow += c;
            set.add(c);
            if (longestTillNow.length() > longestOverAll.length()) {
                longestOverAll = longestTillNow;
            }
        }

        return longestOverAll;
    }

    public static void main(String[] args) {
        String input = "substringfindout";
        System.out.println(subString(input));
    }
}

回答by aelguindy

You keep an array indicating the position at which a certain character occurred last. For convenience all characters occurred at position -1. You iterate on the string keeping a window, if a character is repeated in that window, you chop off the prefix that ends with the first occurrence of this character. Throughout, you maintain the longest length. Here's a python implementation:

您保留一个数组,指示某个字符最后出现的位置。为方便起见,所有字符都出现在位置 -1。您在保持窗口的字符串上迭代,如果在该窗口中重复一个字符,则切断以该字符第一次出现结尾的前缀。在整个过程中,您保持最长的长度。这是一个python实现:

def longest_unique_substr(S):
  # This should be replaced by an array (size = alphabet size).
  last_occurrence = {} 
  longest_len_so_far = 0
  longest_pos_so_far = 0
  curr_starting_pos = 0
  curr_length = 0

  for k, c in enumerate(S):
    l = last_occurrence.get(c, -1)
    # If no repetition within window, no problems.
    if l < curr_starting_pos: 
        curr_length += 1
    else:
        # Check if it is the longest so far
        if curr_length > longest_len_so_far: 
            longest_pos_so_far = curr_starting_pos
            longest_len_so_far = curr_length
        # Cut the prefix that has repetition
        curr_length -= l - curr_starting_pos
        curr_starting_pos = l + 1
    # In any case, update last_occurrence
    last_occurrence[c] = k

  # Maybe the longest substring is a suffix
  if curr_length > longest_len_so_far:
    longest_pos_so_far = curr_starting_pos
    longest_len_so_far = curr_length

  return S[longest_pos_so_far:longest_pos_so_far + longest_len_so_far]

回答by kasavbere

EDITED:

编辑:

following is an implementation of the concesus. It occured to me after my original publication. so as not to delete original, it is presented following:

以下是 concesus 的实现。这是我最初发表后想到的。为不删除原文,现提出如下:

public static String longestUniqueString(String S) {
    int start = 0, end = 0, length = 0;
    boolean bits[] = new boolean[256];
    int x = 0, y = 0;
    for (; x < S.length() && y < S.length() && length < S.length() - x; x++) {
        bits[S.charAt(x)] = true;
        for (y++; y < S.length() && !bits[S.charAt(y)]; y++) {
            bits[S.charAt(y)] = true;
        }
        if (length < y - x) {
            start = x;
            end = y;
            length = y - x;
        }
        while(y<S.length() && x<y && S.charAt(x) != S.charAt(y))
            bits[S.charAt(x++)]=false;
    }
    return S.substring(start, end);
}//

ORIGINAL POST:

原帖:

Here is my two cents. Test strings included. boolean bits[] = new boolean[256] may be larger to encompass some larger charset.

这是我的两分钱。包括测试字符串。boolean bits[] = new boolean[256] 可能更大以包含一些更大的字符集。

public static String longestUniqueString(String S) {
    int start=0, end=0, length=0;
    boolean bits[] = new boolean[256];
    int x=0, y=0;
    for(;x<S.length() && y<S.length() && length < S.length()-x;x++) {
        Arrays.fill(bits, false);
        bits[S.charAt(x)]=true;
        for(y=x+1;y<S.length() && !bits[S.charAt(y)];y++) {
            bits[S.charAt(y)]=true;
        }           
        if(length<y-x) {
            start=x;
            end=y;
            length=y-x;
        }
    }
    return S.substring(start,end);
}//

public static void main(String... args) {
    String input[][] = { { "" }, { "a" }, { "ab" }, { "aab" }, { "abb" },
            { "aabc" }, { "abbc" }, { "aabbccdefgbc" },
            { "abcdeafghicabcdefghijklmnop" },
            { "abcdeafghicabcdefghijklmnopqrabcdx" },
            { "zxxaabcdeafghicabcdefghijklmnopqrabcdx" },
            {"aaabcdefgaaa"}};
    for (String[] a : input) {
        System.out.format("%s  *** GIVES ***  {%s}%n", Arrays.toString(a),
                longestUniqueString(a[0]));
    }
}

回答by Saurin

Here is one more solution with only 2 string variables:

这是另一种只有 2 个字符串变量的解决方案:

public static String getLongestNonRepeatingString(String inputStr){
    if(inputStr == null){
        return null;
    }

    String maxStr = "";
    String tempStr = "";
    for(int i=0; i < inputStr.length(); i++){
        // 1. if tempStr contains new character, then change tempStr  
        if(tempStr.contains("" + inputStr.charAt(i))){
            tempStr = tempStr.substring(tempStr.lastIndexOf(inputStr.charAt(i)) + 1);
        }
        // 2. add new character
        tempStr = tempStr + inputStr.charAt(i);
        // 3. replace maxStr with tempStr if tempStr is longer
        if(maxStr.length() < tempStr.length()){
            maxStr = tempStr;
        }
    }

    return maxStr;
}

回答by Grijesh Chauhan

I was asked the same question in an interview.

我在一次采访中被问到同样的问题。

I have written Python3code, to find the first occurrenceof the substring with all distinct chars. In my implementations, I start with index = 0 and iterate over the input string. While iterating used a Python dict seemsto store indexes of chars in input-string those has been visited in the iteration.

我已经编写了 Python 3代码,以查找具有所有不同字符的子字符串的第一次出现。在我的实现中,我从 index = 0 开始并遍历输入字符串。虽然迭代使用 Python dictseems将字符索引存储在输入字符串中,这些索引已在迭代中访问过。

In iteration, if char c, does not find in current substring– raise KeyError exception

在迭代中,如果 charc当前子字符串中找不到 - 引发 KeyError 异常

if cfound to be a duplicate char in the current substring (as cpreviously appeared during iteration – named that index last_seen) start a new substring

如果c在当前子字符串中发现重复字符(如c先前在迭代期间出现的 – 命名为 index last_seen)开始一个新的子字符串

def lds(string: str) -> str:
    """ returns first longest distinct substring in input `string` """
    seens = {}
    start, end, curt_start = 0, 0, 0
    for curt_end, c in enumerate(string):
        try:
            last_seen = seens[c]
            if last_seen < curt_start:
                raise KeyError(f"{c!r} not found in {string[curt_start: curt_end]!r}")
            if end - start <  curt_end - curt_start:
                start, end = curt_start, curt_end
            curt_start = last_seen + 1
        except KeyError:
            pass
        seens[c] = curt_end
    else: 
        # case when the longest substring is suffix of the string, here curt_end
        # do not point to a repeating char hance included in the substring
        if string and end - start <  curt_end - curt_start + 1:
            start, end = curt_start, curt_end + 1
    return string[start: end]

回答by shubhranshu

Longest substring without repeating character in python

python中没有重复字符的最长子字符串

public int lengthOfLongestSubstring(String s) {
    if(s.equals(""))
        return 0;
    String[] arr = s.split("");
    HashMap<String,Integer> map = new HashMap<>();
    Queue<String> q = new LinkedList<>();

    int l_till = 1;
    int l_all = 1;
    map.put(arr[0],0);
    q.add(arr[0]);
    for(int i = 1; i < s.length(); i++){
        if (map.containsKey(arr[i])) {
            if(l_till > l_all){
                l_all = l_till;
            }
            while(!q.isEmpty() && !q.peek().equals(arr[i])){
                map.remove(q.remove());
            }
            if(!q.isEmpty())
               map.remove(q.remove());
            q.add(arr[i]);
            map.put(arr[i],i);
            //System.out.println(q);
            //System.out.println(map);
            l_till = q.size();
        }
        else {
            l_till = l_till + 1;
            map.put(arr[i],i);
            q.add(arr[i]);
        }
    }
    if(l_till > l_all){
                l_all = l_till;
            }
    return l_all;
}

回答by Sudhakar Kalmari

enter image description heresimple python snippet l=length p=position maxl=maxlength maxp=maxposition

在此处输入图片说明简单的python片段 l=length p=position maxl=maxlength maxp=maxposition

回答by Raja Rao

Algorithm in JavaScript (w/ lots of comments)..

JavaScript 中的算法(带有大量注释)。

/**
 Given a string S find longest substring without repeating characters.
 Example:

 Input: "stackoverflow"
 Output: "stackoverfl"

 Input: "stackoverflowabcdefghijklmn"
 Output: "owabcdefghijklmn"
 */
function findLongestNonRepeatingSubStr(input) {
    var chars = input.split('');
    var currChar;
    var str = "";
    var longestStr = "";
    var hash = {};
    for (var i = 0; i < chars.length; i++) {
        currChar = chars[i];
        if (!hash[chars[i]]) { // if hash doesn't have the char,
            str += currChar; //add it to str
        hash[chars[i]] = {index:i};//store the index of the char
    } else {// if a duplicate char found..
        //store the current longest non-repeating chars. until now
        //In case of equal-length, <= right-most str, < will result in left most str
        if(longestStr.length <= str.length) {
            longestStr = str;
        }
        //Get the previous duplicate char's index
        var prevDupeIndex = hash[currChar].index;

        //Find all the chars AFTER previous duplicate char and current one
        var strFromPrevDupe = input.substring(prevDupeIndex + 1, i);
        //*NEW* longest string will be chars AFTER prevDupe till current char
        str = strFromPrevDupe + currChar;
        //console.log(str);
        //Also, Reset hash to letters AFTER duplicate letter till current char
        hash = {};
        for (var j = prevDupeIndex + 1; j <= i; j++) {
            hash[input.charAt(j)] = {index:j};
        }
    }
  }
  return longestStr.length > str.length ? longestStr : str;
}

//console.log("stackoverflow => " + findLongestNonRepeatingSubStr("stackoverflow"));      
//returns stackoverfl

//console.log("stackoverflowabcdefghijklmn => " + 
findLongestNonRepeatingSubStr("stackoverflowabcdefghijklmn")); //returns owabcdefghijklmn

//console.log("1230123450101 => " + findLongestNonRepeatingSubStr("1230123450101")); //    
returns 234501

回答by zoha khan

We can consider all substrings one by one and check for each substring whether it contains all unique characters or not. There will be n*(n+1)/2 substrings. Whether a substirng contains all unique characters or not can be checked in linear time by scanning it from left to right and keeping a map of visited characters. Time complexity of this solution would be O(n^3).`

我们可以一一考虑所有子串,并检查每个子串是否包含所有唯一字符。将有 n*(n+1)/2 个子串。可以通过从左到右扫描并保留访问过的字符的映射,在线性时间内检查子搅拌是否包含所有唯一字符。此解决方案的时间复杂度为 O(n^3)。`

import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;


public class LengthOfLongestSubstringWithOutRepeatingChar {
 public static void main(String[] args)
 {
 String s="stackoverflow"; 
 //allSubString(s);
 System.out.println("result of find"+find(s));
 }
 public static String find(String s)
 {
  List<String> allSubsring=allSubString(s);
  Set<String> main =new LinkedHashSet<String>();
  for(String temp:allSubsring)
  {
   boolean a = false;
   for(int i=0;i<temp.length();i++)
   { 
    for(int k=temp.length()-1;k>i;k--)
    {
     if(temp.charAt(k)==temp.charAt(i))
      a=true;
    }
   }
   if(!a)
   {
    main.add(temp);
   }
  }
  /*for(String x:main)
  {
  System.out.println(x); 
  }*/
  String res=null;
  int min=0,max=s.length();
  for(String temp:main)
  {
  if(temp.length()>min&&temp.length()<max)
  {
   min=temp.length();
   res=temp;
  }
  }
  System.out.println(min+"ha ha ha"+res+"he he he");
  return res;
  
 }
 //substrings left to right ban rahi hai

private static List<String> allSubString(String str) {
 List<String> all=new ArrayList<String>();
 int c=0;
 for (int i = 0; i < str.length(); i++) {
  for (int j = 0; j <= i; j++) {
   if (!all.contains(str.substring(j, i + 1)))
   {
    c++;
    all.add(str.substring(j, i + 1));
   }
  }
 }
 for(String temp:all)
 {
  System.out.println("substring :-"+temp);
 }
 System.out.println("count"+c);
 return all;
}
}