Group Anagram
Question
Given an array of strings strs
, group all anagrams together into sublists. You may return the output in any order.
An anagram is a string that contains the exact same characters as another string, but the order of the characters can be different.
Example 1:
Input: strs = ["act","pots","tops","cat","stop","hat"]
Output: [["hat"],["act", "cat"],["stop", "pots", "tops"]]
Example 2:
Input: strs = ["x"]
Output: [["x"]]
Example 3:
Input: strs = [""]
Output: [[""]]
Solution
While it's easy to do a dict[sorted(word)].append(word)
but it will be $O(m * nlogn)$. Where m
is the number of strs
and n
is the longest string.
However in this case we can do $O(m \times n)$
Intuition
The main thing here is that what's the most efficient way to hash beside sorting since sorting is unnecessary
Since we only have 26
letters, we can create an id system where it takes the frequency to build id
Say we have an array of frequency of 26 letters corresponding to the frequency from a -> z
[0,0,0,0,.....,0]
a,b,c,d,......,z
The word abd
would have the id [1,1,0,1,0.....0]
[1,1,0,1,0,0,....0]
a,b,c,d,e,f,....z
This would be the same thing with the word adb
since they are anagram.
As a result, we can use the list here as the id. However, in python you cannot use the list as the key, but you can use Tuple as the key. We just need to convert from tuple to list
Implementation
from collections import defaultdict, Counter
class Solution:
def groupAnagrams(self, strs: List[str]) -> List[List[str]]:
result = defaultdict(list)
for w in strs:
result[self.hash(w)].append(w)
return result.values()
def hash(self, word):
frequency = [0] * 26
for c in word:
frequency[ord(c) - ord('a')] += 1
return tuple(frequency)
Time and space complexity
- Time complexity: $O(m \times n)$
- Space complexity: $(m)$
Where $m$ is the longest word and $n$ is the number of words.