string 通过有效词将一个词转换为另一个词的算法

Question

提问by gameover

I came across this variation of edit-distanceproblem:

我遇到了这种编辑距离问题的变体：

Design an algorithm which transforms a source word to a target word. for example: from head to tail, in each step, you just can replace one character, and the word must be valid. You'll be given a dictionary.

设计一种将源词转换为目标词的算法。例如：从头到尾，每一步只能替换一个字符，且该词必须是有效的。你会得到一本字典。

It clearly is a variation of the edit distanceproblem, but in edit distance I do not care about if the word is valid or not. So how do I add this requirement to edit distance.

这显然是编辑距离问题的变体，但在编辑距离中，我不在乎这个词是否有效。那么我如何添加这个要求来编辑距离。

Answer 1

回答by codaddict

This can be modelled as a graph problem. You can think of the words as nodes of the graph and two nodes are connected if and only if they are of same length and differ in one char.

这可以建模为图问题。您可以将单词视为图形的节点，并且当且仅当它们具有相同的长度并且在一个字符中不同时，两个节点是连接的。

You can preprocess the dictionary and create this graph, should look like:

您可以预处理字典并创建此图，应如下所示：

   stack  Hyman
    |      |
    |      |
   smack  back -- pack -- pick

You can then have a mapping from the word to the node representing the word, for this you can use a hash table, height balanced BST ...

然后，您可以从单词到表示单词的节点进行映射，为此您可以使用哈希表，高度平衡 BST ...

Once you have the above mapping in place, all you have to do is see if there exists a path between the two graph nodes, which can easily be done using BFS or DFS.

一旦你有了上面的映射，你所要做的就是查看两个图节点之间是否存在路径，这可以使用 BFS 或 DFS 轻松完成。

So you can summarize the algorithm as:

因此，您可以将算法总结为：

preprocess the dictionary and create the graph.
Given the two inputs words w1 and w2
if length(w1) != length(w2)
 Not possible to convert
else
 n1 = get_node(w1)
 n2 = get_node(w2)

 if(path_exists(n1,n2))
   Possible and nodes in the path represent intermediary words
 else
   Not possible

Answer 2

回答by Nick Johnson

codaddict's graph approach is valid, though it takes O(n^2) time to build each graph, where n is the number of words of a given length. If that's a problem, you can build a bk-treemuch more efficiently, which makes it possible to find all words with a given edit distance (in this case, 1) of a target word.

codacci 的图方法是有效的，尽管构建每个图需要 O(n^2) 时间，其中 n 是给定长度的单词数。如果这是一个问题，您可以更有效地构建bk 树，这使得找到目标词的给定编辑距离（在本例中为 1）的所有词成为可能。

Answer 3

回答by prasadvk

Create a graph with each node representing word in the dictionary. Add an edge between two word nodes, if their corresponding words are at edit distance of 1. Then minimum number of transformations required would length of shortest path between source node and destination node.

创建一个图，每个节点代表字典中的单词。在两个词节点之间添加一条边，如果它们对应的词的编辑距离为 1。那么所需的最小转换次数将是源节点和目标节点之间的最短路径长度。

Answer 4

回答by Pradeep Vairamani

You could simply use recursive back-tracking but this is far from the most optimal solution.

您可以简单地使用递归回溯，但这远不是最佳解决方案。

# Given two words of equal length that are in a dictionary, write a method to transform one word into another word by changing only
# one letter at a time.  The new word you get in each step must be in the
# dictionary.

# def transform(english_words, start, end):

# transform(english_words, 'damp', 'like')
# ['damp', 'lamp', 'limp', 'lime', 'like']
# ['damp', 'camp', 'came', 'lame', 'lime', 'like']


def is_diff_one(str1, str2):
    if len(str1) != len(str2):
        return False

    count = 0
    for i in range(0, len(str1)):
        if str1[i] != str2[i]:
            count = count + 1

    if count == 1:
        return True

    return False


potential_ans = []


def transform(english_words, start, end, count):
    global potential_ans
    if count == 0:
        count = count + 1
        potential_ans = [start]

    if start == end:
        print potential_ans
        return potential_ans

    for w in english_words:
        if is_diff_one(w, start) and w not in potential_ans:
            potential_ans.append(w)
            transform(english_words, w, end, count)
            potential_ans[:-1]

    return None


english_words = set(['damp', 'camp', 'came', 'lame', 'lime', 'like'])
transform(english_words, 'damp', 'lame', 0)

Answer 5

回答by Yuliy

I don't think this is edit distance.

我不认为这是编辑距离。

I think this can be done using a graph. Just construct a graph from your dictionary, and attempt to navigate using your favorite graph traversal algorithm to the destination.

我认为这可以使用图表来完成。只需从您的字典中构建一个图形，然后尝试使用您最喜欢的图形遍历算法导航到目的地。

Answer 6

回答by Boris

I don't think we need graph or some other complicated data structure. My idea is to load the dictionary as a HashSetand use contains()method to find out if the word exists in the dictionary or not.

我认为我们不需要图形或其他一些复杂的数据结构。我的想法是将字典作为 a 加载HashSet并使用contains()方法来找出字典中是否存在该单词。

Please, check this pseudocodeto see my idea:

请检查此伪代码以了解我的想法：

Two words are given: START and STOP. 
//List is our "way" from words START to STOP, so, we add the original word to it first.
    list.add(START);
//Finish to change the word when START equals STOP.
    while(!START.equals(STOP))
//Change each letter at START to the letter to STOP one by one and check if such word exists.
    for (int i = 0, i<STOP.length, i++){
        char temp = START[i];
        START[i] = STOP[i];
//If the word exists add a new word to the list of results. 
//And change another letter in the new word with the next pass of the loop.
        if dictionary.contains(START)
           list.add(START)
//If the word doesn't exist, leave it like it was and try to change another letter with the next pass of the loop.
        else START[i] = temp;}
    return list;

As I understand my code should work like that:

据我了解，我的代码应该像这样工作：

Input: DAMP, LIKE

输入: DAMP, LIKE

Output: DAMP, LAMP, LIMP, LIME, LIKE

输出：DAMP、LAMP、LIMP、LIME、LIKE

Input: BACK, PICK

输入：返回，选择

Output: BACK, PACK, PICK

输出：返回、打包、拾取

Answer 7

回答by Muhammad Soliman

This is C# code to solve the problem using BFS:

这是使用 BFS 解决问题的 C# 代码：

//use a hash set for a fast check if a word is already in the dictionary
    static HashSet<string> Dictionary = new HashSet<string>();
    //dictionary used to find the parent in every node in the graph and to avoid traversing an already traversed node
    static Dictionary<string, string> parents = new Dictionary<string, string>();

    public static List<string> FindPath(List<string> input, string start, string end)
    {
        char[] allcharacters = {'a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z'};

        foreach (string s in input)
            Dictionary.Add(s);
        List<string> currentFrontier = new List<string>();
        List<string> nextFrontier = new List<string>();
        currentFrontier.Add(start);
        while (currentFrontier.Count > 0)
        {
            foreach (string s in currentFrontier)
            {
                for (int i = 0; i < s.Length; i++)
                {
                    foreach (char c in allcharacters)
                    {
                        StringBuilder newWordBuilder = new StringBuilder(s);
                        newWordBuilder[i] = c;
                        string newWord = newWordBuilder.ToString();
                        if (Dictionary.Contains(newWord))
                        {
                            //avoid traversing a previously traversed node
                            if (!parents.Keys.Contains(newWord))
                            {
                                parents.Add(newWord.ToString(), s);
                                nextFrontier.Add(newWord);
                            }

                        }
                        if (newWord.ToString() == end)
                        {
                            return ExtractPath(start, end);

                        }
                    }
                }
            }
            currentFrontier.Clear();
            currentFrontier.Concat(nextFrontier);
            nextFrontier.Clear();
        }
        throw new ArgumentException("The given dictionary cannot be used to get a path from start to end");
    }

    private static List<string> ExtractPath(string start,string end)
    {
        List<string> path = new List<string>();
        string current = end;
        path.Add(end);
        while (current != start)
        {
            current = parents[current];
            path.Add(current);
        }
         path.Reverse();
         return path;
    }

Answer 8

回答by Piyush Datani

class Solution {
    //static int ans=Integer.MAX_VALUE;
    public int ladderLength(String beginWord, String endWord, List<String> wordList) {
        HashMap<String,Integer> h=new HashMap<String,Integer>();
        HashMap<String,Integer> h1=new HashMap<String,Integer>();
        for(int i=0;i<wordList.size();i++)
        {
            h1.put(wordList.get(i),1);
        }
        int count=0;
        Queue<String> q=new LinkedList<String>();
        q.add(beginWord);
        q.add("-1");
        h.put(beginWord,1);
        int ans=ladderLengthUtil(beginWord,endWord,wordList,h,count,q,h1);
        return ans;
    }
    public int ladderLengthUtil(String beginWord, String endWord, List<String> wordList,HashMap<String,Integer> h,int count,Queue<String> q,HashMap<String,Integer> h1)
    {  
        int ans=1;
        while(!q.isEmpty()) 
        {
            String s=q.peek();
            q.poll();
            if(s.equals(endWord))
            {
                return ans;   
            }
            else if(s.equals("-1"))
            {
                if(q.isEmpty())
                {                    
                    break;
                }
                ans++;                
                q.add("-1");
            }
            else
            {
                for(int i=0;i<s.length();i++)
                {
                        for(int j=0;j<26;j++)
                        {
                            char a=(char)('a'+j);
                            String s1=s.substring(0,i)+a+s.substring(i+1);
                            //System.out.println("s1 is "+s1);
                            if(h1.containsKey(s1)&&!h.containsKey(s1))
                            {
                                h.put(s1,1);
                                q.add(s1);
                            }
                        }
                }
            }
        }
        return 0;    
    }
}

Answer 9

回答by ChampCoda

This is clearly a permutation problem. Using a graph is overkill. The problem statment is missing one important constraint; that you can change each position only once. This then makes it implicit that the solution is within 4 steps. Now all that needs to be decided is the sequence of the replace operations:

这显然是一个排列问题。使用图表是矫枉过正的。问题陈述缺少一个重要的约束；您只能更改每个位置一次。这就暗示了解决方案在 4 个步骤之内。现在需要决定的是替换操作的顺序：

Operation1 = change "H" to "T"
Operation2 = change "E" to "A"
Operation3 = change "A" to "I"
Operation4 = change "D to "L"

操作1 = 将“H”更改为“T”
操作2 = 将“E”更改为“A”
操作3 = 将“A”更改为“I”
操作4 = 将“D”更改为“L”

The solution, the sequence of operations, is some permutation of the string "1234", where each digit represents the position of the character being replaced. e.g. "3124" indicates that first we apply operation3, then operation1, then operation2, then operation 4. At each step, if resulting word is not in dictionary, skip to next permutation. Reasonably trivial. Code anyone?

解决方案，即操作序列，是字符串“1234”的某种排列，其中每个数字代表被替换字符的位置。例如“3124”表示我们首先应用操作3，然后应用操作1，然后操作2，然后操作4。在每一步，如果结果词不在字典中，则跳到下一个排列。相当微不足道。编码任何人？

string 通过有效词将一个词转换为另一个词的算法

提问by gameover

回答by codaddict

回答by Nick Johnson

回答by prasadvk

回答by Pradeep Vairamani

回答by Yuliy

回答by Boris

回答by Muhammad Soliman

回答by Piyush Datani

回答by ChampCoda

相关推荐

最近更新

标签

string 通过有效词将一个词转换为另一个词的算法

提问by gameover

回答by codaddict

回答by Nick Johnson

回答by prasadvk

回答by Pradeep Vairamani

回答by Yuliy

回答by Boris

回答by Muhammad Soliman

回答by Piyush Datani

回答by ChampCoda

相关推荐

如何在 Oracle 中增加 TEMP TABLE Space 值？

string Delphi，将字符串复制到字节数组

oracle ORA-00917: 缺少逗号错误

string 如何在 Delphi 中检查字符串是否是有效的日期时间格式字符串

相关推荐

最近更新

标签