Edit Distance
Determine the Edit Distance in a Word Ladder: Given two words (beginWord and endWord), and a dictionary's word list, find the minimum number of operations needed to change beginWord into endWord.
You can change only one letter at a time, and each intermediate word must exist in the word list. If there is no possible transformation, return None (Python), -1 (Java & C++), null (Javascript).
Examples
beginWord = 'hit' endWord = 'cog' wordList = ["hit", "hot", "dot", "dog", "cog"] output: 4 # hit -> hot -> dot -> dog -> cog beginWord = 'word' endWord = 'word' wordList = ['word', 'ward'] output: 0 beginWord = 'hit' endWord = 'cog' wordList = ["hit", "hot", "dot", "dog"] output: None # because no 'cog' in list
Our solution tackles the problem by first constructing a graph that represents all possible transformations between words in the list, where an edge exists between two words if they differ by exactly one character (i.e., they are "one edit distance" apart). This is done by iterating over the list of words and comparing each pair to determine if they meet the one edit distance condition. If they do, an edge is added between them in the graph.
After constructing the graph, we use a breadth-first search (BFS) approach to find the shortest path from the start_word to the end_word. We initiate a queue that starts with the start_word and its associated edit distance (initially 0). As we traverse the graph, for each word, we check if it's the end_word. If so, we return the current edit distance incremented by one. If not, we add the word's neighbors to the queue for further exploration, ensuring that we don't revisit previously seen words.
The BFS guarantees that the first time we reach the end_word, it is via the shortest path, and we can stop the search early.
from collections import deque
def update_word(words, start_word, end_word):
# If start and end words are the same, no edits are needed
if start_word == end_word:
return 0
#construct the graph
one_edit_distance_graph = {}
for i in range(len(words)):
base_word = words[i]
for j in range(len(words)):
if i != j:
new_word = words[j]
if is_one_edit_distance(base_word, new_word):
if base_word in one_edit_distance_graph:
one_edit_distance_graph[base_word].append(new_word)
else:
one_edit_distance_graph[base_word] = [new_word]
#traverse graph
stack = deque([(start_word, 0)])
visited_word_set = set()
while stack:
curr_word, edit_distance = stack.popleft()
for word in one_edit_distance_graph.get(curr_word, []):
if word not in visited_word_set:
if word == end_word:
return edit_distance + 1
stack.append((word, edit_distance + 1))
visited_word_set.add(word)
return
def is_one_edit_distance(word1, word2):
if len(word1) != len(word2):
return False
diff_char_count = 0
for i in range(len(word1)):
if word1[i] != word2[i]:
diff_char_count += 1
if diff_char_count > 1:
return False
return diff_char_count == 1Time Complexity: The time complexity of the solution is O(n^2 * m), where n is the number of words and m is the length of each word. This complexity arises from the double loop that compares each word to every other word (O(n^2)), and for each comparison, we check if the words differ by exactly one character (O(m)).
Space Complexity: The space complexity is O(n^2), where n is the number of words. This space is required to store the graph, where each word can potentially have edges to all other words. Additionally, the BFS queue and the visited set require O(n) space in the worst case.
In this video, Thomas, Google SWE, answers the Edit Distance coding interview question.
Definitely nice to think of this without memorization, but there is a well known algorithm for this problem, which is the Levenshtein Distance.
Lev(a,b) = len(a) if len(b) == 0 = len(b) if len(a) == 0 = lev(a[1:], b[1:] if a[0] == b[0] = 1 + min (lev(a, b[1:]), lev(a[1:], b), lev(a[1:], b[1:]))
https://en.wikipedia.org/wiki/Levenshtein_distance
I'm sure some optimizations could be made with heuristic.