Programming Interview Questions 8: Transform Word

Given a source word, target word and an English dictionary, transform the source word to target by changing/adding/removing 1 character at a time, while all intermediate words being valid English words. Return the transformation chain which has the smallest number of intermediate words.

This is again one of my favorite interview questions. It demonstrates how we can formulate a particular problem differently and solve the new problem easily and elegantly. Here, we can formulate the question as a graph problem and the solution will turn out to be a simple breadth first search.

We should perform preprocessing on the given dictionary. Let all the words in the dictionary be the nodes of a graph, and there is an undirected edge between two nodes if one word could be converted to another by a single transformation: changing a character, removing a character, or deleting a character. For example there will be an edge between “bat” and “cat” since we can change a character, also between “cat” and “at” by removing a character, and between “bat” and “bart” by adding a character. We can preprocess the whole dictionary and create this huge graph. Assuming that our program will be asked to transform different words to one another many times,  the cost of precomputation will be negligible. Here is the code to create the graph from the dictionary, the graph is a hashtable where the key is a word and the value is the list of valid transformations of that word:

def constructGraph(dictionary):
    graph=collections.defaultdict(list)
    letters=string.lowercase
    for word in dictionary:
        for i in range(len(word)):
            #remove 1 character
            remove=word[:i]+word[i+1:]
            if remove in dictionary:
                graph[word].append(remove)
            #change 1 character
            for char in letters:
                change=word[:i]+char+word[i+1:]
                if change in dictionary and change!=word:
                    graph[word].append(change)
        #add 1 character
        for i in range(len(word)+1):
            for char in letters:
                add=word[:i]+char+word[i:]
                if add in dictionary:
                    graph[word].append(add)
 
    return graph

Now we have the graph where each edge corresponds to a valid transformation between words. So, given two words we can initiate a breadth first search from the start node and once we reach the goal node we will have the shortest chain of transformations between two words (note that either start or goal or both words may not exist in the dictionary at all). Breadth first search gives the shortest path between a start node and a goal node in an unweighted graph, given that the start node is the root of the search. We didn’t perform depth first search because it won’t necessarily give us the shortest path and we may waste a lot of time trying to explore a dead end because the graph contains many nodes. Here is the code, it takes the graph generated above as an input together with start and goal words:

def transformWord(graph, start, goal):
    paths=collections.deque([ [start] ])
    extended=set()
    #Breadth First Search
    while len(paths)!=0:
        currentPath=paths.popleft()
        currentWord=currentPath[-1]
        if currentWord==goal:
            return currentPath
        elif currentWord in extended:
            #already extended this word
            continue
 
        extended.add(currentWord)
        transforms=graph[currentWord]
        for word in transforms:
            if word not in currentPath:
                #avoid loops
                paths.append(currentPath[:]+[word])
 
    #no transformation
    return []

The complexity of the algorithm depends on how far apart two words are in the graph. In the worst case, we may have to traverse all the graph to find a transformation, or such a transformation may not exist at all, meaning the graph has more than 1 connected components. Or either the source or the goal or both may not exist in the dictionary.

Here is an example, let’s say we have the following words in our dictionary: cat, bat, bet, bed, at, ad, ed. The transformation graph is the following:

Let’s say we want to find the shortest transformation between the words cat and bed. Here is the demonstration of how it works:

There are two paths. cat->bat->bet->bed is shorter than cat->at->ad->ed->bed. If we delete the node ‘bet’ than the first path becomes invalid and the shortest path is now the second one. If we also delete ‘ad’ then there is no path left.

This question demonstrates the power of graphs and formulating a problem as a graph can reduce a difficult task to a classic traversal algorithm.

VN:F [1.9.22_1171]
Rating: 9.6/10 (43 votes cast)
Programming Interview Questions 8: Transform Word, 9.6 out of 10 based on 43 ratings
This entry was posted in Programming Interview. Bookmark the permalink.
  • Ilke

    Hey,
    You may want to load the dictionary into a Trie here, rather than a graph as you described. http://en.wikipedia.org/wiki/Trie

    • Arden

      Trie won’t work because we can change or remove characters. For example, in a trie there won’t be a direct link from cat to at, but we need that link here. And there won’t be a link between cat and bat as well. Also in a trie intermediate nodes are not guaranteed to be actual words. Since trie is a tree it can’t contain backward links, so it would work only if we were allowed to add characters, without removing or changing them.

      • Ilke

        I didn’t suggest loading the dictionary into a trie and then using it as it were graph :)
        Yes, if you use a Trie, every time you pop an element from the queue, you have to find all of its one-step-away neighbors on the fly, however this lookup is very cheap in a trie. Furthermore, you don’t have to do full lookups for every possible neighboring word for additions / changes (for removal you have to, but that’s only length – 1 lookups), the trie will guide you instead with non-null pointers.
        English language has more than 1 million words and it’s expanding by ~10k words each year. Plus, source and target word inputs are bounded by the dictionary. Yes, this solution’s lookups are faster on paper, but because of these two facts, I would go with a trie based solution rather than creating that huge hashtable, especially because it would be more efficient in practice.

  • ANONYMOUS

    this is just like the classic editing distance problem. build dictionary graph is unnecessary.

    • Arden

      Classic edit distance solution would just give you a path, not necessarily the shortest path. Also precomputing the graph would be beneficial if we were asked many transformations. Otherwise we will exhaustively try all edits, most of which are not valid english words.

      • ANONYMOUS

        please take a look at my implementation, basically I will check each possible intermediate word in dictionary.

        here is my code :
        https://gist.github.com/1598920

    • Sorrowfull Blinger

      This problem is slightly different from classic Edit Distance .The intermediate transitions have to be valid words ,whereas its not the same in the classical version of the Edit Distance problem.

  • Sachin

    You check whether we reached target word after popping out the word from queue. But if you do the check before pushing it to the stack, then we can avoid unnecessary pushes into the queue.

    Suppose if we have a level in the graph which has 100s of elements and the first element is the word that we are looking for, then we do not have to push the 100 elements.
    [sourcecode language=”python”]

    for word in transforms:
    if word == goal:
    return currentPath[:]+[word]
    if word not in currentPath:
    #avoid loops
    paths.append(currentPath[:]+[word])


    [/sourcecode]

    • Arden

      Yes you’re right, it’s a nice optimization. Thanks a lot for the comment.

  • Anonymous

    I think this can be done in a better manner using the Levenshtein distance approach, (http://en.wikipedia.org/wiki/Levenshtein_distance) and then backtracking the two dimensional array along the optimal path to compute the intermediate words.

    • sonesh

      thank you for your alternative, but i think, we also have make sure that all intermediate word should be their in dictionary which i think can’t be maintained while calculating Levenshtein distance ..!!!

  • Fatih

    Here is the C# version

    class Graph

    {

    Hashtable graph = new Hashtable();

    public Graph(List englishDictionary)

    {

    foreach (string word in englishDictionary)

    {

    Object nodeObject = graph[word];

    GraphNode node = null;

    if (nodeObject == null)

    {

    node = new GraphNode(word);

    graph.Add(word, node);

    }

    else

    {

    node = (GraphNode)nodeObject;

    }

    //check the added truncated ones

    for (int i = 0; i < word.Length-1; i++)

    {

    string truncated = word.Substring(0, i) + word.Substring(i + 1);

    if (englishDictionary.Contains(truncated))

    {

    Object nodeObject2 = graph[truncated];

    GraphNode node2 = null;

    if (nodeObject2 == null)

    {

    node2 = new GraphNode(truncated);

    graph.Add(truncated, node2);

    }

    else

    {

    node2 = (GraphNode)nodeObject2;

    }

    node.connect(node2);

    Console.WriteLine("[Truncated]Connected those two: " + word + ", " + truncated);

    }

    }

    //check the addedOnes

    string allEnglishCharacters = "abcdefghijklmnopqrstuvwxyz";

    List allEnglishCharacterList = allEnglishCharacters.ToList();

    foreach (char c in allEnglishCharacterList)

    {

    for (int i = 0; i <= word.Length; i++)

    {

    String addedString = word.Substring(0, i) + c + word.Substring(i);

    if (englishDictionary.Contains(addedString))

    {

    Object nodeObject2 = graph[addedString];

    GraphNode node2 = null;

    if (nodeObject2 == null)

    {

    node2 = new GraphNode(addedString);

    graph.Add(addedString, node2);

    }

    else

    {

    node2 = (GraphNode)nodeObject2;

    }

    node.connect(node2);

    Console.WriteLine("[Added] Connected those two: " + word + ", " + addedString);

    }

    }

    }

    //Check the changes

    foreach (char c in allEnglishCharacterList)

    {

    for (int i = 0; i < word.Length; i++)

    {

    String changedString = word.Substring(0, i) + c + word.Substring(i+1);

    if (changedString.Equals(word))

    {

    continue;

    }

    if (englishDictionary.Contains(changedString))

    {

    Object nodeObject2 = graph[changedString];

    GraphNode node2 = null;

    if (nodeObject2 == null)

    {

    node2 = new GraphNode(changedString);

    graph.Add(changedString, node2);

    }

    else

    {

    node2 = (GraphNode)nodeObject2;

    }

    node.connect(node2);

    Console.WriteLine("[Changed] Connected those two: " + word + ", " + changedString);

    }

    }

    }

    }

    }

    public void showTheShortestPath(string source, string target)

    {

    GraphNode startNode = (GraphNode)graph[source];

    if (startNode == null)

    {

    Console.WriteLine("No words found in the dictionary for input: " + source);

    return;

    }

    foreach (GraphNode node in startNode.ConnectedNodes)

    {

    node.ParentInTraverse = startNode;

    }

    String result = traverseConnectedNodes(startNode, new List() { startNode }, target);

    if (result == null)

    {

    Console.WriteLine(“Cannot reach to target”);

    }

    Console.WriteLine(result);

    }

    private String traverseConnectedNodes(GraphNode startNode, List visitedNodes, string target){

    if (startNode.ConnectedNodes.Count == 0)

    {

    return null;

    }

    List children = new List();

    foreach (GraphNode childrenNode in startNode.ConnectedNodes)

    {

    if (!visitedNodes.Contains(childrenNode))

    {

    visitedNodes.Add(childrenNode);

    if (childrenNode.Value.Equals(target))

    {

    Console.WriteLine(“Found the target”);

    return startNode.Value+”->”+ childrenNode.Value;

    }

    Console.WriteLine(“Adding childrenNode: ” + childrenNode.Value);

    string result = traverseConnectedNodes(childrenNode, visitedNodes, target);

    if (result != null)

    {

    return startNode.Value + “->” + result;

    }

    }

    }

    return null;

    }

    }

    • Fatih

      I think I should find a better way to share code here

      • Ashish

        HI Faith,

        I tried to compile your code but it gives me error at Game Node. I don’t understand what does Game Node mean? Can you pls help this out.

      • Ladan

        Hi Fatih,
        Nice job. May I have the description and definition for the types GraphNode, and graphnode. Thanks.

  • AIR20

    How did you generate the graph with blue circles in the post? It looks very nice. I’d like to make such a graph too.

  • Kamal Chaya

    What is this line for?
    “string.lowercase”

    • Wanta Cheng

      >>> import string
      >>> letters = string.lowercase
      >>> print letters
      >>> ‘abcdefghijklmnopqrstuvwxyz’

  • holdnet

    Nice Solution!

  • Šimon

    Since the graph is undirected (or directed with both edges always present in case of connection), we can save some time in the graph building phase (or to be precise the relation of transforming is symetric).
    If we remove a letter from word (bart -> bat), then we know that by adding the same letter we get the same word (bar -> bart), therefore we can make two insertion into our graph just in phase where we add or delete letter (since adding a letter is more expensive than deleting one I suggest doing just deletion).
    For transforming a letter we can save 1/2 of time by checking for letters eigther before the original letter or after (in alphabet). When we find match, we just make to insertions like:
    graph[word].append(change)
    graph[change].append(word)

  • John D

    Ok, very nice question. I like it more when I see the solution. However, can you guys solve this question, if it appears in an interview, if you do not know this question? Can you think of graph based solution, if you have 30-35 minutes total time and less than 10 minutes for deciding how to solve? I first think of edit distance, probably I would fail this question, if the interviewer does not provide a hint for solving by graph.