Contains Duplicate

Part I

This is a 3 sets of array problems from Airbnb. Part I would be very easy and straight forward:

Given an array of integers, find if the array contains any duplicates. Your function should return true if any value appears at least twice in the array, and it should return false if every element is distinct.


My Solution

The easiest way would be sort the array first and check if the current element is different with the previous element. The time complexity would be O(n log n).

def contain_duplicate(nums):
  """
  :type nums: List[int]
  :rtype: bool
  """
  if not nums or len(nums) == 1:
    return False
  nums = sorted(nums)
  for i in range(1, len(nums)):
    if nums[i] == nums[i-1]:
      return True
  return False

This can be optimized to O(n) using hash table.

def contain_duplicate(nums):
  table = {}
  for n in nums:
    if n in table:
      return True
    table[n] = 1
  return False

Part II

Given an array of integers and an integer k, find out whether there are two distinct indices i and j in the array such that nums[i] = nums[j] and the difference between i and j is at most k.


My Solution

The approach is the same, store the nums as key and index as value, if the nums exist in the table compare both of indexes.

def contain_duplicate(nums, k):
  table = {}
  for i, n in enumerate(nums):
    if n in table and i - table[n] <= k:
      return True
    table[n] = i
  return False

Part III

Given an array of integers, find out whether there are two distinct indices i and j in the array such that the difference between nums[i] and nums[j] is at most t and the difference between i and j is at most k.


My Solution

This problem is very tricky, compare to the first two problems. In order to get O(n), we can use the idea from bucket sort. We keep a bucket with size k that will dynamically change. If we use BST to hold the nums, since BST would be O(log n) to access the element, the whole solution would cost O(n log n). But if we use bucket (or sliding window) by applying hash table, since lookup and deleting in hashtable only cost O(1), the whole solution would cost O(n).

For example, suppose we have t=2 and we found 3 in the array. The possible nums would be 1,2,3,4,5, by dividing with t, 3 would be in the bucket #1 (3 divided by 2) and the possible numbers can be found in bucket #0 or #2 or #1 itself. We then maintain the size of bucket to be k and delete the bucket that is far away from the index. It is guaranteed that at any time there will be at most min(k, len(nums)). The solution is as below:

def contain_duplicate(nums, k, t):
  buckets = {}
  offset = 1
  for i, n in enumerate(nums):
    # beware of t=0
    b_num = n/t if t else n  
    for b in range(b_num - offset, b_num + offset + 1):
      if b in buckets and abs(buckets[b] - n) <= t:
        return True

    buckets[b_num] = n

    # remove index that is too far
    if len(buckets) > k:
      b_del = nums[i-k]/t if t else nums[i-k]
      del buckets[b_del]
  return False

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s