Median of Two Sorted Arrays [Solution]
Here’s a solution to the “Median of Two Sorted Arrays” problem:
from typing import List
def find_median_sorted_arrays(nums1: List[int], nums2: List[int]) -> float:
# Ensure nums1 is the smaller array to simplify binary search
if len(nums1) > len(nums2):
nums1, nums2 = nums2, nums1
x, y = len(nums1), len(nums2)
low, high = 0, x
while low <= high:
# Choose a partition in nums1
partition_x = (low + high) // 2
# Corresponding partition in nums2
partition_y = (x + y + 1) // 2 - partition_x
# Determine elements on the left and right of the partitions
max_left_x = float('-inf') if partition_x == 0 else nums1[partition_x - 1]
min_right_x = float('inf') if partition_x == x else nums1[partition_x]
max_left_y = float('-inf') if partition_y == 0 else nums2[partition_y - 1]
min_right_y = float('inf') if partition_y == y else nums2[partition_y]
# Check if the partitions are in the correct place
if max_left_x <= min_right_y and max_left_y <= min_right_x:
if (x + y) % 2 == 0:
# If the total number of elements is even, return the average of the middle elements
return (max(max_left_x, max_left_y) + min(min_right_x, min_right_y)) / 2.0
else:
# If the total number of elements is odd, return the larger of the two middle elements
return float(max(max_left_x, max_left_y))
elif max_left_x > min_right_y:
# If the current partition in nums1 is too far to the right, adjust the partition to the left
high = partition_x - 1
else:
# If the current partition in nums1 is too far to the left, adjust the partition to the right
low = partition_x + 1
# If the input arrays are not sorted, raise an error
raise ValueError("Input arrays are not sorted.")
# Example usage:
result = find_median_sorted_arrays([1, 3], [2])
print(result)
assert result == 2.0, "The result should be 2.0"
result = find_median_sorted_arrays([1, 2], [3, 4])
print(result)
assert result == 2.5, "The result should be 2.5"
result = find_median_sorted_arrays([1, 2], [10, 20])
print(result)
assert result == 6.0, "The result should be 2.5"
result = find_median_sorted_arrays([0, 0], [0, 0])
print(result)
assert result == 0.0, "The result should be 0.0"
result = find_median_sorted_arrays([], [1])
print(result)
assert result == 1.0, "The result should be 1.0"
The main idea behind the solution is to find the median of two sorted arrays
(nums1
and nums2
) efficiently using the concept of binary search.
The goal is to partition both arrays in such a way that elements on the left
side of the partitions are smaller than or equal to elements on the right side.
Here’s a step-by-step explanation of the algorithm:
- Ensure
nums1
is the smaller array to simplify the binary search. If not, swap them. - Initialize variables
x
andy
to represent the lengths ofnums1
andnums2
respectively. - Set up a binary search in the smaller array (
nums1
). Use variableslow
andhigh
to represent the search space. - Inside the binary search loop:
- Compute the partition indices (
partition_x
andpartition_y
) based on the current midpoint of the search space innums1
. - Calculate the maximum element on the left (
max_left_x
) and minimum element on the right (min_right_x
) of the partition innums1
. - Repeat the same process for
nums2
to findmax_left_y
andmin_right_y
. - Check if the current partitions satisfy the condition for a valid median. If yes, return the median based on the parity of the total number of elements.
- Adjust the search space based on the relationship between
max_left_x
andmin_right_y
. Ifmax_left_x
is greater, move the partition innums1
to the left; otherwise, move it to the right.
- Compute the partition indices (
- If the binary search completes without finding a valid partition, raise an error indicating that the input arrays are not sorted.
The key insight is that for a valid partition, the maximum element on the left
side (max_left_x
and max_left_y
) should be smaller than or equal to the
minimum element on the right side (min_right_x
and min_right_y
). The
algorithm efficiently adjusts the partitions using binary search to converge to
the correct median. The time complexity of the algorithm is O(log(min(x, y))
),
where x
and y
are the lengths of the input arrays.
Time Complexity:
The time complexity of the solution is O(log(min(m, n))
), where m
and n
are the lengths of the input arrays nums1
and nums2
.
Let’s break down the reasoning behind this time complexity:
-
Binary Search: The core of the algorithm is a binary search that is performed on the smaller of the two arrays (
nums1
). Binary search has a time complexity of O(log(n)
), wheren
is the size of the array being searched. In this case, we are binary searching in the smaller array, so the complexity is O(log(min(m, n))
), wherem
is the size ofnums1
andn
is the size ofnums2
. -
Iterative Steps: The algorithm iterates through the binary search steps a constant number of times. In each iteration, it adjusts the partition based on the comparison of elements in both arrays. The total number of iterations is logarithmic in the size of the smaller array.
-
Overall Complexity: Combining these factors, the overall time complexity is dominated by the binary search step, leading to O(
log(min(m, n))
).
In summary, the algorithm efficiently finds the median by dividing the arrays into two parts and adjusting the partitions using binary search. This logarithmic time complexity ensures efficient performance even for large input sizes.
Space Complexity:
The space complexity is O(1
), as the algorithm uses a constant amount of
extra space regardless of the input size.