[Programming Pearls] Column 2 – Find Missing Number

Problem:
Similarly to the problem A described in column 2: given a range [lower_bound, upper_bound], and a sorted list l; elements in l are within given range; some numbers within given range are missing in the list; find out one.

Solution #1:
Similarly to binary search, the key is to divide the list and refine the range in cycle. On each loop cycle:
1. Check end condition #1: the first element > lower_bound, or last element < upper_bound. 2. Divide the 'list' into two sub lists (find pivot index in the middle, and split the list to [0, pivot] and [pivot + 1, last]. 3. Check end condition #2: compare l[pivot] and l[pivot + 1] and check if difference is 1. 4. Compare the length of each sublist, to the value difference (sublist[len(sublist)-1] - sublist[0]). If length is less, then update the loop variables: list <= sublist, range [sublist[0], sublit[last]]. Code for solution #1:

def prog_2_A_find_missing(l, lower_bound, upper_bound):
    """
    >>> print(prog_2_A_find_missing([1, 2, 3, 5, 7, 8], 1, 8))
    4
    >>> print(prog_2_A_find_missing([1, 2, 3, 4, 6, 7], 1, 7))
    5
    """
    lower = 0
    upper = len(l) - 1
    missing = -1
    # on loop condition:
    # '=' means the list to be scanned can only contain one element
    # loop assertions:
    # - l[lower : upper] (including l[upper])indicates sorted list that contains missing numbers.
    # - range(lower_bound, upper_bound) indicates the range that missing numbers are within
    # - on each cycle, problem can be defined as:
    #   sorted list l[lower : upper] contains numbers within range(lower_bound, upper_bound),
    #   try to find the missing number (a number that is within range but not present in the list)
    while lower <= upper:
        # end conditions: found that given boundaries exceed list boundaries
        if l[lower] > lower_bound:
            missing = lower_bound
            break
        if l[upper] < upper_bound:
            missing = upper_bound
            break
        # deal with case where pivot_right > upper
        # this could happen when there is only one element in the list
        pivot_left = lower + ((upper - lower) >> 1)
        pivot_right = pivot_left + 1
        if pivot_right > upper:
            break
        # another end condition:
        # the missing one is between pivot_left and pivot_right
        if l[pivot_right] - l[pivot_left] > 1:
            missing = l[pivot_right] - 1
            break
        gap_left = l[pivot_left] - l[lower] + 1
        gap_right = l[upper] - l[pivot_right] + 1
        len_left = pivot_left - lower + 1
        len_right = upper - pivot_right + 1
        # compare length to find out which section contains missing element
        if gap_left > len_left:
            # find missing in left section
            upper = pivot_left
            upper_bound = l[upper]
            # does not really matter whether we have following statement
            # lower_bound = l[lower]
        elif gap_right > len_right:
            # find missing in right section
            lower = pivot_right
            lower_bound = l[lower]
            # does not really matter whether we have following statement
            # upper_bound = l[upper]
        # if they all equal, no missing element
        else:
            break
    return missing

Solution #2:
The solution is similar, except it does not require the list is sorted.
On each cycle, the list will be divided to two groups, one containing even numbers and the other containing odd numbers.

def prog_2_A_find_missing_bitwise(l, lower_bound, upper_bound):
    """
    >>> print(prog_2_A_find_missing_bitwise([1, 2, 3, 5, 7, 8], 1, 8))
    6
    >>> print(prog_2_A_find_missing_bitwise([1, 2, 3, 4, 6, 7], 1, 7))
    5
    """
    def split(numbers):
        set_odd = [num >> 1 for num in numbers if num & 1 == 1]
        set_even = [num >> 1 for num in numbers if num & 1 == 0]
        return (set_odd, set_even)
    # TODO: odd count of numbers
    # corner case: when numbers set has only one element, it will still work
    # cause either set_odd or set_even will be empty and that one will be the missing one
    numbers = l
    numbers_range = range(lower_bound, upper_bound + 1)
    # count = (upper_bound - lower_bound + 1) >> 1
    bit_index = 0
    num_missing = 0
    # loop assertions:
    # - the list numbers contains missing numbers
    # - bit_index and num_missing records the path to calculate missing number
    while len(numbers) > 0:
        set_odd, set_even = split(numbers)
        range_odd, range_even = split(numbers_range)
        if len(set_odd) < len(range_odd):
            numbers = set_odd
            numbers_range = range_odd
            num_missing += (1 << bit_index)
        elif len(set_even) < len(range_even):
            numbers = set_even
            numbers_range = range_even
            num_missing += (0 << bit_index) # can be ignored, though...
        #count >>= 1
        bit_index += 1
    return num_missing

Leave a Reply

Your email address will not be published.

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>