Minimum Window Substring

Question statement

Given two stringsĀ sĀ andĀ tĀ of lengthsĀ mĀ andĀ nĀ respectively, returnĀ theĀ minimum windowĀ 

substringĀ ofĀ sĀ such that every character inĀ tĀ (including duplicates) is included in the window. If there is no such substring, returnĀ the empty stringĀ "".

The testcases will be generated such that the answer isĀ unique.

Example:

Input: s = "ADOBECODEBANC", t = "ABC"
Output: "BANC"
Explanation: The minimum window substring "BANC" includes 'A', 'B', and 'C' from string t.
Input: s = "a", t = "a"
Output: "a"
Explanation: The entire string s is the minimum window.
Input: s = "a", t = "aa"
Output: ""
Explanation: Both 'a's from t must be included in the window.
Since the largest window of s only has one 'a', return empty string.

Solution

In order to do this we have to have 2 slow and fast pointer:

  1. First we need to increase fast to find a substring that contains all the leter from string t first
  2. And then we're going to truncate from the slow pointer until we find the shortest string that is still has a valid string t
  3. When it's invalid, we increase our fast pointer again until it's valid

Implementation


class App:
    def minWindow(self, s: str, t: str) -> str:
        if len(t) > len(s):
            return ""

        currentWindowFreq, substringFreq = Counter(), Counter(t)
        slow, fast = 0, 0

        result = ""
        while fast < len(s):
            if s[fast] in t:
                currentWindowFreq[s[fast]] += 1
            
            while slow <= fast and self.isValidSubstring(currentWindowFreq, substringFreq):
                if (result == "") or (fast - slow + 1 <= len(result)):
                    result = s[slow: fast + 1]
                
                # Starting dropping from slow
                if s[slow] in t:
                    currentWindowFreq[s[slow]] -= 1
                slow += 1
            
            fast += 1

        return result

    def isValidSubstring(self, currentWindowFreq: Counter, substringFreq: Counter) -> bool:
        for k in substringFreq:
            if (currentWindowFreq[k] < substringFreq[k]):
                return False
        return True

Time complexity: $O(2n + 26)$

  • For each letter we need to visit twice (1 for fast and 1 for slow) pointers which takes $O(2n)$
  • To check if a string is valid, we then need to cross check with our substringFreq which will be $O(26)$ becauseof alphabetically

Space complexity: $O(26 \times 2)$ to store 2 dictionaries of frequency:

  • One for the current substring window
  • One for the given t frequency

Optimization

In here, we can optimize our isValidSubstring() function. Instead of going through the whole list every single time, we can do a "lazy-calculating".

So if we keep track of a variable metRequirement to check if we've all the fields have validated.

So if
$$ metRequirement = len(substringFreq) $$
means we have a valid substring.

Notice that the currentWindowFreq only get updated when we add or remove from fast or slow pointers. As a result, we have:

class AppOptimized:
    def minWindow(self, s: str, t: str) -> str:
        if len(t) > len(s):
            return ""

        currentWindowFreq, substringFreq = Counter(), Counter(t)
        slow, fast = 0, 0
        metRequirement = 0

        result = ""
        while fast < len(s):
            if s[fast] in t:
                currentWindowFreq[s[fast]] += 1
            
                if currentWindowFreq[s[fast]] == substringFreq[s[fast]]:
                    metRequirement += 1
            
            while slow <= fast and metRequirement == len(substringFreq):
                if (result == "") or (fast - slow + 1 <= len(result)):
                    result = s[slow: fast + 1]
                
                # Starting dropping from slow
                if s[slow] in t:
                    currentWindowFreq[s[slow]] -= 1
                
                    if currentWindowFreq[s[slow]] < substringFreq[s[slow]]:
                        # Worst come to worst it can only be have a difference of `1` because the `currentWindowFreq[s[fast]]` 
                        # will add up above so it won't become negative, and also when metRequirement = len(substringFreq) - 1 the loop will break anyways
                        metRequirement -= 1
                slow += 1
            
            fast += 1

        return result