Solving the Codility Molybdenum challenge

Codility is an online platform for coding challenges, that I like to take part in from time to time. Their current challenge is called Molybdenum and here’s how I solved it. The complete description of the challenge is available online at codility.com, the relevant part is copied below.

The leader of the array is a value that occurs in more than half of the elements of the array, and the segment of the array is a sequence of consecutive elements of the array.

You can modify A by choosing exactly one segment of length K and increasing by 1 every element within that segment.

The goal is to find all of the numbers that may become a leader after performing exactly one array modification as described above

Definitions

The textual description is a bit wordy, so let’s break it down in Python. Pythons name for an array is list, so we have a list A that contains some numbers. Counting the elements of the array is always helpful, so I set N = len(A).

We can now select one subset of A of length K. If we call the subset B, this could look like B = A[i:i+K]. The minimum value for i is obviously i = 0, the maximum value is i = N - K, which selects the last K entries of A.

Having selected a subset B, we have to increment every element of it by 1 and place them back into A. There may be nicer ways to implement this, but this straightforward implementation works just fine:

def increment_from_index(A, K, i):
    B = A[i:i+K]
    B = [x+1 for x in B]
    A[i:i+K] = B

The leader is any element of A that occurs more than N/2 times. Python has a built-in Counter class that makes it easy to check this condition:

def get_leader(A):
    """Returns the leader of A or None"""
    from collections import Counter
    counter = Counter(A)
    N = len(A)
    element, count = counter.most_common(1)[0]
    leader = None
    if count > N/2:
        leader = element
    return leader

Trivial Solution

The brute force approach simply chooses every possible segment, tests for a leader each time, and returns the list of leaders. There’s only a couple of things to consider:

  1. The methods defined above mutate A in place so we need to start with a fresh copy for every new segment. This copy is named C here.
  2. The list of possible leaders should be sorted in ascending order and each possible leader may only appear once, so some cleanup is necessary.

The code for the brute force solution is given below. Note that range(x, y) returns integers from x to y, excluding y. Since our maximum value for i is N-K, the call adds +1.

def solution(K, M, A):
    from copy import copy
    N = len(A)
    C = copy(A)
    leaders = set()
    for i in range(0, N-K+1):
        A = copy(C)
        increment_from_index(A, K, i)
        leader = get_leader(A)
        if leader:
            leaders.add(leader)
    return sorted(list(leaders))

This algorithm already earns a Silver Award, even though it fails all of the performance tests. The reason is obvious: Assuming a fixed K, there are N-K = O(N) possible segments in A and for each segment we need to count the occurrence of every element to determine the leader, which is again in O(N), for an overall time complexity of O(N2).

O(N) Algorithm

We can find a faster algorithm by thinking about the changes we introduce in A if we choose the segment starting at i+1 instead of i. Regardless of the size of A, the results of X = increment_from_index(A, K, i) and Y = increment_from_index(A, K, i+1) will only vary in two spots, since A[i] is only incremented in X and A[i+K] is only incremented in Y. We can therefore define a function to get Y from X directly:

def move_segment(A, K, i):
    """Moves the segment starting at i one element to the right."""
    A[i] -= 1
    A[i+K] += 1

Switching to this function using the modified implementation of solution seen below does not make a big difference yet, because we still count the occurrences of all elements each time.

def solution(K, M, A):
    N = len(A)
    increment_from_index(A, K, 0)
    leaders = set()
    initial_leader = get_leader(A)
    if initial_leader:
        leaders.add(initial_leader)
    for i in range(0, N-K):
        move_segment(A, K, i)
        leader = get_leader(A)
        if leader:
            leaders.add(leader)
    return sorted(list(leaders))

However, we can forego the re-counting completely, because we already know that the occurrences of most of the elements will not change between consecutive segments. If we move from i to i+1, the occurrences of the element A[i] will decrease by one, since it will be replaced by A[i] - 1, which will in turn increase by one. Analogously, A[i+K] will appear once less often, and A[i+K] + 1 once more. We can now use this to update the counter directly, too. And since only A[i+K] + 1 and A[i] - 1 occur more often after the change, only those can become a leader due to the change, so we only need to check the counts for those two values after each step. The complete solution is given below and yields the sought-after Golden Award with a 100% rating in each category.

def get_leader(counter):
    """Returns the leader in the counter or None"""
    N = sum(counter.values())
    element, count = counter.most_common(1)[0]
    leader = None
    if count > N/2:
        leader = element
    return leader

def increment_from_index(A, K, i):
    B = A[i:i+K]
    B = [x+1 for x in B]
    A[i:i+K] = B

def move_segment(A, K, i):
    """Moves the segment starting at i one element to the right."""
    A[i] -= 1
    A[i+K] += 1
    
def solution(K, M, A):
    from collections import Counter
    N = len(A)
    increment_from_index(A, K, 0)
    counter = Counter(A)
    initial_leader = get_leader(counter)
    leaders = set()
    if initial_leader:
        leaders.add(initial_leader)
    for i in range(0, N-K):
        indices = [i, i+K]
        for index in indices:
            element = A[index]
            counter[element] -= 1
        move_segment(A, K, i)
        for index in indices:
            element = A[index]
            counter[element] += 1
            count = counter[element]
            if count > N/2:
                leaders.add(element)
    return sorted(list(leaders))