Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow. Original URL: http://stackoverflow.com/questions/13792118/

Time: 2020-09-09 01:44:55  Source: igfitidea

KMP prefix table

Tags: string, algorithm, data-structures, pattern-matching

Asked by Cratylus

I am reading about KMP for string matching.
It requires preprocessing the pattern by building a prefix table.
For example, for the string ababaca the prefix table is: P = [0, 0, 1, 2, 3, 0, 1]
But I am not clear on what the numbers show. I read that the table helps find matches of the pattern when it shifts, but I cannot connect this with the numbers in the table.

Answered by imslavko

Each number corresponds to one prefix of the pattern ("a", "ab", "aba", ...). For each prefix, it is the length of the longest suffix of that prefix that is also a prefix of it. The whole string does not count as a suffix or prefix here; it is called a self-suffix and self-prefix (at least in Russian, not sure about the English terms).

So we have the string "ababaca". Let's look at it. KMP computes the prefix function for every non-empty prefix. Let s be the string and p[i] the value of the prefix function for the prefix that ends at index i. The prefix and suffix may overlap.

+---+----------+-------+------------------------+
| i | s[0..i]  | p[i]  | Matching Prefix/Suffix |
+---+----------+-------+------------------------+
| 0 | a        |     0 |                        |
| 1 | ab       |     0 |                        |
| 2 | aba      |     1 | a                      |
| 3 | abab     |     2 | ab                     |
| 4 | ababa    |     3 | aba                    |
| 5 | ababac   |     0 |                        |
| 6 | ababaca  |     1 | a                      |
+---+----------+-------+------------------------+
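
The definition behind the table can be checked directly with a brute-force sketch in Python. The function name `prefix_function_naive` is made up for illustration and is not part of the original answer. It tries every candidate length, longest first, and excludes the whole string, so it is slow but mirrors the definition word for word.

```python
def prefix_function_naive(s):
    """For each prefix s[0..i], return the length of the longest prefix of it
    that is also a suffix of it, excluding the whole prefix itself.
    Brute force, roughly O(n^3); meant only to illustrate the definition."""
    p = []
    for i in range(len(s)):
        sub = s[: i + 1]
        best = 0
        # Candidate lengths are strictly shorter than sub, longest first.
        for k in range(len(sub) - 1, 0, -1):
            if sub[:k] == sub[-k:]:  # prefix and suffix may overlap
                best = k
                break
        p.append(best)
    return p


print(prefix_function_naive("ababaca"))  # [0, 0, 1, 2, 3, 0, 1]
```

For "ababa" the winning candidate is "aba", which overlaps itself in the middle character; that is why the value is 3 and not 1.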

Simple C++ code that computes the prefix function of a string s:

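#include <string>
#include <vector>
using namespace std;

// Returns p, where p[i] is the length of the longest suffix of s[0..i]
// that is also a prefix of it (the whole of s[0..i] does not count).
vector<int> prefixFunction(const string& s) {
    vector<int> p(s.size());
    int j = 0;  // length of the currently matched prefix
    for (int i = 1; i < (int)s.size(); i++) {
        // On a mismatch, fall back to the next shorter candidate prefix.
        while (j > 0 && s[j] != s[i])
            j = p[j-1];

        if (s[j] == s[i])
            j++;
        p[i] = j;
    }
    return p;
}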
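
To connect the table with the matching step the question asks about, here is a minimal Python sketch of a KMP search. The names `prefix_function` and `kmp_search` are illustrative and do not come from the answer. The idea is that after `j` pattern characters have matched and the next one fails, the pattern does not restart from zero: it falls back to `p[j-1]` matched characters, because that prefix of the pattern is already known to equal the text just read.

```python
def prefix_function(s):
    """The same algorithm as the C++ prefixFunction, transcribed to Python."""
    p = [0] * len(s)
    j = 0
    for i in range(1, len(s)):
        while j > 0 and s[j] != s[i]:
            j = p[j - 1]
        if s[j] == s[i]:
            j += 1
        p[i] = j
    return p


def kmp_search(text, pattern):
    """Return the start index of every occurrence of pattern in text."""
    if not pattern:
        return []
    p = prefix_function(pattern)
    matches = []
    j = 0  # number of pattern characters currently matched
    for i, ch in enumerate(text):
        # Mismatch: keep only the longest prefix that is also a suffix
        # of the part matched so far, and retry the same text character.
        while j > 0 and pattern[j] != ch:
            j = p[j - 1]
        if pattern[j] == ch:
            j += 1
        if j == len(pattern):
            matches.append(i - j + 1)
            j = p[j - 1]  # continue, allowing overlapping matches
    return matches


print(kmp_search("abababacaababaca", "ababaca"))  # [2, 9]
```

In this run the first mismatch happens at text index 5 after five matched characters; the table entry p[4] = 3 says that "aba" is still matched, so the search continues from there without re-reading the text.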

Answered by Yogesh Sanchihar

This code may not be the shortest, but its flow is easy to follow. Simple Java code for computing the prefix array:

    String pattern = "ababaca";
    int i = 1, j = 0;
    int[] prefixArray = new int[pattern.length()];
    while (i < pattern.length()) {
        // On a mismatch, fall back to the next shorter matching prefix.
        while (j > 0 && pattern.charAt(i) != pattern.charAt(j)) {
            j = prefixArray[j - 1];
        }
        if (pattern.charAt(i) == pattern.charAt(j)) {
            prefixArray[i] = j + 1;
            i++;
            j++;
        } else {
            prefixArray[i] = j;  // j is 0 here: no prefix matches
            i++;
        }
    }

    for (int k = 0; k < prefixArray.length; ++k) {
        System.out.print(prefixArray[k] + " ");
    }

It produces the required output:

0 0 1 2 3 0 1

Answered by Mahesh Wakade

Python Implementation

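p = 'ababaca'

j = 0  # length of the currently matched prefix
i = 1  # position in p being processed
prefix = [0]

while len(prefix) < len(p):
    if p[j] == p[i]:
        prefix.append(j + 1)
        i += 1
        j += 1
    elif j == 0:
        # No shorter prefix to fall back to.
        prefix.append(0)
        i += 1
    else:
        # Fall back to the next shorter matching prefix and retry.
        j = prefix[j - 1]

print(prefix)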

Answered by here4learn

I tried my hand at this in JavaScript. Open to suggestions.

const prefixArray = function (p) {
    let aux = Array(p.length).fill(0);

    // For index 0 the value is always 0, so we start from 1
    let i = 1;
    let m = 0; // length of the currently matched prefix

    // run the loop over the pattern
    while (i < p.length) {
        // 3 cases here
        // First: the characters match, so the matched prefix grows by one
        if (p.charAt(i) === p.charAt(m)) {
            m++;
            aux[i] = m;
            i++;
        }
        // Second: no match, but m !== 0 means some match happened previously,
        // so move m back to the previous candidate
        else if (m !== 0) {
            m = aux[m - 1];
            // do not increment i: retry the same character against the shorter prefix
        } else {
            // Third: no match and nothing to fall back to
            aux[i] = 0;
            i++;
        }
    }

    return aux;
};

Answered by Pankaj Gupta

    String string = "abababca";
    int[] array = new int[string.length()];

    int i = 1;
    int j = 0;

    while (i < string.length()) {
        // if the characters match, record the new length and advance both i and j
        if (string.charAt(j) == string.charAt(i)) {
            array[i] = j + 1;
            i++;
            j++;
        } else if (j != 0) {
            // if not, move j back to array[j-1] and retry the same i
            j = array[j - 1];
        } else {
            // nothing to fall back to: array[i] stays 0
            i++;
        }
    }

    for (int k : array) {
        System.out.print(k + " ");
    }

Answered by MOHAMED SABTHAR

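#include <string>
using namespace std;

int main() {
    string text = "ababbabbababbababbabb";
    static int val[30];  // zero-initialized; must hold at least text.length() entries

    // Brute force, O(n^2): for every start i, extend the match between
    // text[i..] and the prefix text[0..], recording the match length.
    // Every start must be tried; skipping ahead by the match length
    // misses values on inputs such as "aabaaab".
    for (size_t i = 1; i < text.length(); i++) {
        int value = 0;
        for (size_t j = 0; i + j < text.length() && text[j] == text[i + j]; j++) {
            ++value;
            // A smaller i gives a longer match, so keep the first value written.
            if (val[i + j] == 0) val[i + j] = value;
        }
    }
}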

The required output is stored in val[].