Programming Interview Questions 16: Anagram Strings

Given two strings, check if they’re anagrams or not. Two strings are anagrams if they are written using the same exact letters, ignoring space, punctuation and capitalization. Each letter should have the same count in both strings. For example, ‘Eleven plus two’ and ‘Twelve plus one’ are meaningful anagrams of each other.

First we should extract only the letters from both strings and convert to lowercase, excluding punctuation and whitespaces. Then we can compare these to check whether two strings are anagrams of each other. From now on when I refer to a string, I assume this transformation is performed and it only contains lowercase letters in original order.

If two strings contain every letter same number of times, then they are anagrams. One way to perform this check is to sort both strings and check whether they’re the same or not. The complexity is O(NlogN) where N is the number of characters in the string. Here’s the code:

def isAnagram1(str1, str2):
    return sorted(getLetters(str1))==sorted(getLetters(str2))
 
def getLetters(text):
    return [char.lower() for char in text if char in string.letters]

Sorting approach is elegant but not optimal. We would prefer a linear time solution. Since the problem involves counting, hashtable would be a suitable data structure. We can store the counts of each character in string1 in a hashtable. Then we scan string2 from left to right decreasing the count of each letter. Once the count becomes negative (string2 contains more of that character) or if the letter doesn’t exist in the hashtable (string1 doesn’t contain that character), then the strings are not anagrams. Finally we check whether all the counts in the hashtable are 0, otherwise string1 contains extra characters. Or we can check the lengths of the strings in the beginning and avoid this count check. This also allows early termination of the program if the strings are of different lengths, because they can’t be anagrams. The code is the following:

def isAnagram2(str1, str2):
    str1, str2 = getLetters(str1), getLetters(str2)
    if len(str1)!=len(str2):
        return False
    counts=collections.defaultdict(int)
    for letter in str1:
        counts[letter]+=1
    for letter in str2:
        counts[letter]-=1
        if counts[letter]<0:
            return False
    return True

I use python’s defaultdict as hashtable. If a letter doesn’t exist in the dictionary it produces the value of 0. The complexity of this solution is O(N), which is optimal. The use of hashtables in storing counts once again proves its advantage.

Rating: 0.0/10 (0 votes cast)

8 Responses to Programming Interview Questions 16: Anagram Strings

Sameer says:

November 21, 2011 at 7:15 pm

Hey Arden,

Why the last “sum(counts.values())==0″ check is required? If it reaches there, wouldn’t it always be 0?

- Arden says:
  
  November 21, 2011 at 9:34 pm
  
  Because string1 may contain extra characters. For example if string1=’abc’ and string2=’ab’, the loops would complete just fine but we should check for that extra character ‘c’ in the end. To be sure that each letter has exact same count in both strings.
  
  - says:
    
    November 21, 2011 at 9:36 pm
    
    But you already compare the lengths of the strings first.
    
    I mean not a big deal to compare again, I was just making sure that I am not missing any case..
    
    - Arden says:
      
      November 21, 2011 at 9:41 pm
      
      Ok now I see, you’re totally right. Thanks a lot for the notice, I’ll update the code.
      
Volkan says:

January 5, 2012 at 4:38 am

how about using XOR, it reduces memory to O(1).

- Arden says:
  
  January 6, 2012 at 8:07 am
  
  That’s a great idea, thanks for the comment. Will update the post..
  
  - Raj says:
    
    January 22, 2012 at 6:53 pm
    
    xor does’nt work.consider the case of string1=”aaaa” and string2=”bbbb”.for this xor of both string is zero.
    
    - Arden says:
      
      January 22, 2012 at 7:21 pm
      
      You’re right, now I see it. Thanks a lot for the correction!

Programming Interview Questions 16: Anagram Strings

8 Responses to Programming Interview Questions 16: Anagram Strings

Leave a Reply Cancel reply

Subscribe via Email

Recent Posts