# Programming Interview Questions 4: Find Missing Element

This question can be solved efficiently with a very clever trick. There is an array of non-negative integers. A second array is formed by shuffling the elements of the first array and deleting a random element. Given these two arrays, find which element is missing in the second array. Here is an example input, the first array is shuffled and the number 5 is removed to construct the second array.

The naive way to solve it is for every element in the second array, check whether it appears in the first array. Note that there may be duplicate elements in the arrays so we should pay special attention to it. The complexity of this approach is O(N^2). A more efficient solution is to sort the first array, so while checking whether an element in the first array appears in the second, we can do binary search. But we should still be careful about duplicate elements. The complexity is O(NlogN). If we don’t want to deal with the special case of duplicate numbers, we can sort both arrays and iterate over them simultaneously. Once two iterators have different values we can stop. The value of the first iterator is the missing element. This solution is also O(NlogN). Here is the algorithm for this approach:

```def findMissingNumber1(array1, array2): array1.sort() array2.sort() for num1, num2 in zip(array1, array2): if num1!=num2: return num1 return array1[-1]```

Let’s see whether we can do better. In most interviews, we would be expected to come up with a linear time solution. We can use a hashtable and store the number of times each element appears in the second array. Then for each element in the first array we decrement its counter. Once hit an element with zero count that’s the missing element. Here is the algorithm:

```def findMissingNumber2(array1, array2): d=collections.defaultdict(int) for num in array2: d[num]+=1 for num in array1: if d[num]==0: return num else: d[num]-=1```

The time complexity is optimal O(N) but the space complexity is also O(N), because of the hashtable. Ideally we would like to have constant space complexity. One possible solution is computing the sum of all the numbers in array1 and array2, and subtracting array2’s sum from array1’s sum. The difference is the missing number in array2. However, this approach is somewhat problematic. What if the arrays are too long, or the numbers are very large. Then overflow will occur while summing up the numbers.

By performing a very clever trick, we can achieve linear time and constant space complexity without any problems. Here it is: initialize a variable to 0, then XOR every element in the first and second arrays with that variable. In the end, the value of the variable is the result, missing element in array2. Classy, isn’t it? Here is the code:

```def findMissingNumber3(array1, array2): result=0 for num in array1+array2: result^=num return result```

Let’s analyze why this approach works. What happens when we XOR two numbers? We should think bitwise, instead of decimal. XORing a 4-bit number with 1011 would flip the first, third, and fourth bits of the number. XORing the result again with 1011 would flip those bits back to their original value. So, if we XOR a number two times with some number nothing will change. We can also XOR with multiple numbers and the order would not matter. For example, say we XOR the number n1 with n2, then XOR the result with n3, then XOR their result with n2, and then with n3. The final result would be the original number n1. Because every XOR operation flips some bits and when we XOR with the same number again, we flip those bits back. So the order of XOR operations is not important. If we XOR a number with some number an odd number of times, there will be no effect.

Above we XOR all the numbers in array1 and array2. All numbers in array2 also appear in array1, but there is an extra number in array1. So the effect of each XOR from array2 is being reset by the corresponding same number in array1 (remember that the order of XOR is not important). But we can’t reset the XOR of the extra number in array1, because it doesn’t appear in array2. So the result is as if we XOR 0 with that extra number, which is the number itself. Since XOR of a number with 0 is the number. Therefore, in the end we get the missing number in array2. The space complexity of this solution is constant O(1) since we only use one extra variable. Time complexity is O(N) because we perform a single pass from the arrays.

This question demonstrates the power of bitwise operations, and how to use them effectively to achieve optimal solutions. It’s one of my favorite interview questions.

VN:F [1.9.22_1171]
Programming Interview Questions 4: Find Missing Element, 9.6 out of 10 based on 103 ratings
This entry was posted in Programming Interview. Bookmark the permalink.
• http://ahmetalpbalkan.com ahmet alp balkan

Gercekten guzeldi :)

• ege

Arden güzel soru – güzel cevap ancak tüm elemanları toplama fikri daha sade bence
kod yazarken senden sonra bakacak adamın anlayabilmesi de çok önemli bir proje için :)

• Arden

Haklisin Ege okunabilirlik cok onemli tabi, ama interview’da asil amac en optimize sekilde cozmek. Hatta interviewer’in gormedigi orjinal guzel bir cozum uretirsen senden iyisi yok..

• http://ahmetalpbalkan.com ahmet alp balkan

Peki ya xor yerine sum kullanirsak herhangi bir drawback’i olur mu sence?

• Arden

Python’da sorun olmayacaktir. Cunku “Python seamlessly converts a number that becomes too large for an integer to a long. And long integers have unlimited precision”. Ama Python’da uzun listelerde XOR kisa listelerde sum daha hizli calisiyo.

• vs

Hey Arden,

Your solutions to the problem are brilliant! I like how you analyze various approaches and slowly lead the reader to the optimal solution (rather than presenting the reader with the best solution at the beginning).

Keep up the good work and thanks so much for sharing!

• anjali

I agree! Thank you so much!

• s

Hi Arden,

What about an alternative for the “adding up” solution as follows. Instead of summing the first and second array, then subtracting from each other, which can cause overflow, we can interleave between adding and subtracting. That is,

-take the 1st element in the 1st array
– subtract it from the 1st element in the 2nd,
– add to the 2nd element in the 1st array,
– subtract the result from the 2nd element in the 2nd array.
– so on.

We have an O(n) in time and O(1) in space. Let me know what you think. Thanks for your wonderful blog Ardan.

• Arden

Thanks a lot for the comment. This is definitely a better solution than adding up, but it may also cause overflow. Imagine large numbers being in front of array1 and small numbers being in the end. And after shuffling array2 contains small numbers in front and large numbers move to the end. Then while subtracting and adding the sum will continually increase, which may lead to overflow. But this is the extreme worst case of course, and probably very unlikely to happen if we have a good shuffle function. Therefore, your approach is much better than simply adding up two arrays first and then subtracting. Because now the intermediate sum increases much slowly. Thanks for noticing. I hope you enjoy the blog..

• t

just balance the residue around 0, i.e. keeps subtracting until it’s below 0, then adding back until it goes above 0.

• melih

This will add more complexity (additional if checks), I think the usage of XOR is the most brilliant way to solve this problem. Great blog Arden and it is very helpful, Teşekkürler:)

• ANONYMOUS

another interesting question: two numbers are missing.

• Arden

Yes that’s also a good one involving some math. I can write about that as well, thanks for the advice.

• libo

is it possible to post the code for missing two elements? thanks for sharing in advance, and i forget the advanced math after graduation.

Ապրես, շատ սիրուն լուծումներ ես առաջարկում, Արդեն:

• Anonymous

if only one element is missing we can just add all the numbers in array 1 and array 2 and them find the difference , we will get the missing number

this will have O(n) time complexity.

int sumA = a[0];
int sumB = 0;

for(int i =1 ; i< a.length ; i++){
sumA = sumA + a[i];
sumB= sumB + b[i-1];
}

int missingNo = sumA – sumB;

• Arden

I mention this approach after findMissingNumber2 with its problem:
“One possible solution is computing the sum of all the numbers in array1 and array2, and subtracting array2′s sum from array1′s sum. The difference is the missing number in array2. However, this approach is somewhat problematic. What if the arrays are too long, or the numbers are very large. Then overflow will occur while summing up the numbers.”

• Anonymous

// c++/java code
int findMissing(int a[], int b[], int aSize, int bSize) {

int result = 0;
int i=0; int j=0;
for (i=0, j=0; i<aSize; i++) {
result ^= a[i];
if (j<bSize) {
result ^= b[j++];
}
}
return result;
}

• Steve

If you used unsigned integers for the sum then there is no problem with overflow is there? The value just “wraps” and the end result is correct.

I woul like to add a special note for the “XOR” solution: this will work iff the only duplicate value is not zero.

• Kannappan

Yep, better to use the reduce function
return reduce(lambda x, y: x^y, array1 + array2)

• Kannappan

Thought about it again. Even if the duplicate value is zero, the solution provided by the author should work. because, XOR any number with zero returns that same number. So, your statement is wrong

• Harsh

Brilliant !!

• Lucas

Your xor solution actually uses linear space since Python will actually construct the the array array1+array2 in the for loop. You should be able to get around this either by using itertools.chain or by splitting into two for loops.

• Petar

There is another simple O(N) solution. We only need to sum all the elements of the first array, and than subtract the sum of elements of the second array. The result will be the missing number. Only problem here (same as the last solution presented) is that it gives the same result in the case of the missing number 0, and the case when there is no missing number.

• rahul

Sum may hit the overflow condition. so i feel it is better to go with XOR approatch

• Roxanne

no it won’t. You don’t necessarily have to add all from Array1 and then subtract from Array 2. You can pick first number from Array1 and subtract from first number from Array 2, then add second number from Array1 and … so on and so forth. It’ll work because just like XOR, +and – are commutative.

• Derek Chow

int [] Array1= {Integer.MAX_VALUE-1, 2, 3};
int [] Array2= {2, Integer.MAX_VALUE-1};
in the sample input above, your assumption easily breaks(overflow) it would not work since the array is not sorted

• Jinesh

An if condition checking the length of both the arrays at the start will avoid this issue..

• Petar

*0

• Funsi

missing_entry = sum(array1) – sum(array2)

??

• MdT

Quick question, why do you say “If we XOR a number with some number an odd number of times, there will be no effect.”
As you said, n1 XOR’d with n2 twice would give n1. If XOR’d 4 times, it would give n1 again. Then why “odd number of times”?

• Sriram

Arden, How about add all the elements of array1 and call it sum1, next add all the elements of array2 and call it sum2, sum1-sum2 gives the missing element :)

• free

my thought exactly when I read the question…

• vickychijwani

The sum can easily overflow the range of int / long.

• CuriousMe

What if we simply add the two arrays. Since one element is missing in the second array and the array contains all non negative numbers, the difference in the sum would be the number missing. This approach works in the above example, can any one suggest a counter example?

• Rahul

What if one of the elements is Integer.MAX_VALUE

• Maccha

Awesome Maccha!!

• Ashish Thakran

You can find out the missing
number in array by following algorithm:

/* getMissingNumber takes array and size of array as arguments*/

int getMissingNumber (int arr[], int num)

{

int i;

/* For xor of all the elemets in
arary */

int x1 = a[0];

/* For xor of all the elemets
from 1 to n+1 */

int x2 = 1;

for (i = 1; i< n; i++)

x1 = x1^a[i];

for ( i = 2; i <= n+1;
i++)

x2 = x2^i;

return (x1^x2);

}

I also found some more possible solutions. you can check out below link for
more solutions:

http://newtechnobuzzz.blogspot.in/2014/07/find-out-missing-number-in-array-in-java.html

• Ashish Thakran

You can find out the cycle or loop in a single linked list by using two pointer approach.

Below link can be useful to find out the algorithm to find cycle or loop in linked list

Find out loop or cycle in linked list in java

• Ashish Thakran

We can find out the duplicate element by using Set, list or trvaersing using loop on array.

Below link can be useful to find out the algorithm to find duplicate or repeated elements in an array in java

http://newtechnobuzzz.blogspot.in/2014/07/find-out-repeated-or-duplicate-element.html

http://newtechnobuzzz.blogspot.in/2014/07/find-out-repeated-or-duplicate-element.html

Awesome. . . thank u soo much sir. . .

Even easier if you use Sets. Just include everything from first array while reading data. Then while reading second array, just remove those elements from the set. At the you’ll have only 1 element in set, which’d be the one we need :)

• Rajan Kalra

Hi
Can you please tell me why the running time for very first algo given is O(nlogn), we are not searching the second array but we are just iterating the two arrays simuoltaneously and breaking when the number doesn’t matches. I think the running time for time would be O(n). Please tell me where I am going wrong!

• Rajan Kalra

I just got it..the time taken is for sorting. Considering merge sort perhaps.

• anchal dhiman

Nice Example… it help me a lot and visit more java example

• Šimon

Correct me if I am wrong, but overflow is not an unsecified error – it is well defined operation even though it is getting “wrong” outcome. We can sum up set of negative and positive numbers and even when our variable is temorarily in the overflow state, the result will be correct (that I think the reason there was no overflow checking in Java before 8 version).
Therefore we can just do simple sum of the 2 arrays into 2 variables and then iterate over the longer one and try whether adding and element would leed to the same sum as for the second array.

• Architect

``` //Java 8 Streams & Lambdas solution by the Architect https://disqus.com/by/op8rv315/ List array1 = new ArrayList(Arrays.asList(4, 1, 0, 2, 9, 6, 8, 7, 5, 3)); List array2 = Arrays.asList(6, 4, 7, 2, 1, 0, 8, 3, 9);```

``` array1.addAll(array2); Optional missing = array1.stream() .reduce((x, y) -> x^y); ```

```System.out.println(missing.get()); ```

• Rohit Kumar

Unsroted array : O(n)
SortedArray: O(logn)
.

public class FindMissingElementInSortedArray {

public static void checkMissingElement(int[] ar, int[] br) {
if (ar == null || br == null) {
System.out.println(“//throwException”);
return;
}

int left=0,right=ar.length-1;
int mid=0;

while(left>=0 && right<ar.length){
mid=(left+right)/2;
if(ar[mid]==br[mid]){
left=mid+1;
}else if(ar[mid]==br[mid-1]){
right=mid-1;
}else{
System.out.println("Missing Element is "+ar[mid]);
break;
}
}

}

public static void main(String[] args) {

int[] ar = { 2, 4, 6, 8, 9, 12, 14, 17, 18, 21, 25, 28, 29, 30 };
int[] br = { 2, 4, 6, 8, 9, 12, 14, 17, 18, 21, 25, 28, 29, 30 };
checkMissingElement(ar, br);
}

}