Find Median From Data Stream
Question
https://leetcode.com/problems/find-median-from-data-stream/description/
The median is the middle value in an ordered integer list. If the size of the list is even, there is no middle value, and the median is the mean of the two middle values.
- For example, for
arr = [2,3,4]
, the median is3
. - For example, for
arr = [2,3]
, the median is(2 + 3) / 2 = 2.5
.
Implement the MedianFinder class:
MedianFinder()
initializes theMedianFinder
object.void addNum(int num)
adds the integernum
from the data stream to the data structure.double findMedian()
returns the median of all elements so far. Answers within10-5
of the actual answer will be accepted.
Example 1:
Input
["MedianFinder", "addNum", "addNum", "findMedian", "addNum", "findMedian"]
[[], [1], [2], [], [3], []]
Output
[null, null, null, 1.5, null, 2.0]
Explanation
MedianFinder medianFinder = new MedianFinder();
medianFinder.addNum(1); // arr = [1]
medianFinder.addNum(2); // arr = [1, 2]
medianFinder.findMedian(); // return 1.5 (i.e., (1 + 2) / 2)
medianFinder.addNum(3); // arr[1, 2, 3]
medianFinder.findMedian(); // return 2.0
Solution
Intuition
Sorting the array
For this question, it's given a stream (array) of elements and asking us to find the median (the middle number of a sorted array) if the array is even then the median is the average.
So for example if we have in the stream these following numbers
[4,-3,2,0]
Then when it's sorted it becomes
[-3,0,2,4]
Therefore, the median is $\frac{0 + 2}{2} = 1$
In another case when we have 5 numbers
[4,-3,2,0,1]
Then when it's sorted it becomes
[-3,0,1,2,4]
then the median is $1$
Using Heap
Using the heap, we can seperate into 2 sorted array on two side.
When we add into this heap:
- If both heap is empty or
rightHeap
is empty, we add into the left heap (just imagine you fill in the array, you gotta fill from the left) - If the element is larger than the smallest value of the
rightHeap
, we add intorightHeap
To balance we can do
# if len(largeHeap) > len(smallHeap) + 1
heappush(largeHeap, heappop(smallHeap))
# if len(smallHeap) > len(largeHeap) + 1
heappush(smallHeap, heappop(largeHeap))
Then to find the solution, we have 3 scenario
- If
len(largeHeap) > len(smallHeap)
returnlargeHeap[0]
(smallest element oflargeHeap
) - if
len(smallHeap) > len(largeHeap)
returnsmallHeap[0]
(largest element ofsmallHeap
) - if
len(smallHeap) == len(largeHeap)
return $\frac{smallHeap[0] + largeHeap[0]}{2}$
Implementation
import unittest
from typing import *
from heapq import heappush, heappop
class MedianFinder:
def __init__(self):
self.leftHeap = []
self.rightHeap = []
def addNum(self, num: int) -> None:
if self.rightHeap and self.rightHeap[0] < num:
heappush(self.rightHeap, num)
else:
# Add to left by default
heappush(self.leftHeap, num * -1)
self.rebalance()
def findMedian(self) -> float:
if len(self.leftHeap) == len(self.rightHeap):
return (-self.leftHeap[0] + self.rightHeap[0]) / 2
if len(self.rightHeap) > len(self.leftHeap):
return self.rightHeap[0]
return -self.leftHeap[0]
def rebalance(self) -> None:
if len(self.leftHeap) > len(self.rightHeap) + 1:
leftValue = heappop(self.leftHeap) * -1
heappush(self.rightHeap, leftValue)
if len(self.rightHeap) > len(self.leftHeap) + 1:
rightValue = heappop(self.rightHeap)
heappush(self.leftHeap, rightValue * -1)
Time complexity: $O(logn)$
- We do a $logn$ rebalance
Space complexity: $O(n)$ — given $n$ is the number of input
- We have 2 heaps sum up to $n$ size