Comments on: Programming Interview Questions 1: Array Pair Sum

By: Arden on Wed, 16 Nov 2011 20:20:19 +0000

The worst-case complexity of find in a set can be as bad as O(N), as Ahmet mentioned above. But I think it's safe to use the average-case constant complexity for sets and hashtables during an interview, while mentioning the worst-case behavior. To be technically precise I should write Omega(N), since big-O denotes an upper bound, but these articles are intended to focus on common interview practice. You're right that the worst-case complexity using the C++ STL set is O(NlogN). Still, I don't think interviewers will object to O(N) as long as you mention the worst case; that's my experience at least.
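For concreteness, here is a minimal Python sketch of the set-based approach the thread is discussing; the function name and sample input are mine, not from the original post:

```python
def pair_sum(arr, k):
    """Print pairs (x, y) from arr with x + y == k, using a set.

    Average O(1) membership tests give O(N) expected time overall;
    a degenerate hash distribution can degrade a single lookup to O(N).
    """
    seen = set()
    for num in arr:
        complement = k - num
        if complement in seen:   # average O(1), worst case O(N)
            print(complement, num)
        seen.add(num)            # average O(1), amortized

if __name__ == "__main__":
    pair_sum([3, 4, 5, 1, 6, 2], 7)   # prints the pairs (3, 4), (1, 6), (5, 2)
```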

By: Achal on Mon, 14 Nov 2011 11:16:52 +0000

@Arden
If I am using the C++ STL set:
1. Insert takes logarithmic time (amortized constant only with a correct position hint).
2. Find takes logarithmic time.

So I think it would be O(nlogn) rather than O(n).
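For comparison, Python has no tree-based set in its standard library, but the same O(nlogn) bound Achal describes also falls out of a sort-plus-two-pointers variant, sketched below; the names and sample input are mine:

```python
def pair_sum_sorted(arr, k):
    """Find pairs summing to k in O(n log n): sort, then sweep two pointers."""
    arr = sorted(arr)               # O(n log n), dominates the running time
    lo, hi = 0, len(arr) - 1
    pairs = []
    while lo < hi:                  # O(n) sweep
        s = arr[lo] + arr[hi]
        if s == k:
            pairs.append((arr[lo], arr[hi]))
            lo += 1
            hi -= 1
        elif s < k:
            lo += 1
        else:
            hi -= 1
    return pairs

print(pair_sum_sorted([3, 4, 5, 1, 6, 2], 7))  # [(1, 6), (2, 5), (3, 4)]
```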

By: Arden on Sat, 15 Oct 2011 05:32:05 +0000

You're totally right, Ahmet. As the load factor of the set increases, the worst-case complexity of a single operation becomes linear. But after a certain load factor, Python resizes the set by growing the underlying table. So the average time per operation is still amortized O(1), though a single operation can be O(N) in the worst case, as you said. Still, during an interview I suppose it's safe to assume O(1) for operations on sets and hashtables.

Implementing our own hashtable is a great idea. In my web search course last semester, I remember searching for a hashtable implementation that takes the size as a hint at construction, so that it performs fewer resize operations; I already knew I would insert millions of elements while implementing a search engine. I think the default size of a dictionary in Python is 8, and the load-factor threshold for resizing is 2/3. The table size is multiplied by 4 during resizing unless the hashtable is already big (over 50,000 entries), in which case it only doubles.
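One way to observe this resizing from Python itself is to watch sys.getsizeof as a dict fills up. The exact sizes and thresholds are CPython implementation details and vary by version, so treat the printed numbers as illustrative:

```python
import sys

# Watch the allocated size of a dict jump as the load factor
# crosses the resize threshold. Exact sizes and thresholds are
# CPython implementation details and differ across versions.
d = {}
last = sys.getsizeof(d)
print(0, last)
for i in range(100):
    d[i] = i
    size = sys.getsizeof(d)
    if size != last:        # a resize happened on this insert
        print(i + 1, size)
        last = size
```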

By: Ahmet Alp Balkan on Fri, 14 Oct 2011 22:34:22 +0000

@George I think the complexity is O(n * n/m), where m is the number of buckets and n is the number of elements. Assume m=1: now you have n^2, right?

@Arden, I think the Python set is not always O(1) on find and insert, as documented here: http://wiki.python.org/moin/TimeComplexity. If you could somehow instantiate the set specifying the number of keys, you could choose m=n and achieve expected O(1) operations. However, if you just instantiate it as set(), do not avoid duplicate pairs, and assume (x,y) != (y,x), then the underlying set implementation needs to do bucketing, which can lead to O(n) for find in the worst case, as documented.

I think it is currently not possible to specify the number of keys for the hash table underlying the set implementation. Python can be problematic here, but Java also maintains its hash table with a "load factor". We should definitely implement our own hash table... :) What do you think?
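Along those lines, a toy hash set that takes a capacity hint up front could look like the sketch below: open addressing with linear probing, no resizing or deletion, and all names are mine rather than any real library's API:

```python
class FixedHashSet:
    """Toy open-addressing hash set sized up front (no resizing, no deletes)."""

    def __init__(self, expected_size):
        # ~2x headroom keeps the load factor under 1/2, so probes stay short.
        self._capacity = 2 * expected_size + 1
        self._slots = [None] * self._capacity

    def _probe(self, item):
        # Linear probing: walk from the hash slot to the first
        # empty slot or the slot already holding item.
        i = hash(item) % self._capacity
        while self._slots[i] is not None and self._slots[i] != item:
            i = (i + 1) % self._capacity
        return i

    def add(self, item):
        self._slots[self._probe(item)] = item

    def __contains__(self, item):
        return self._slots[self._probe(item)] == item

s = FixedHashSet(expected_size=6)
for x in [3, 4, 5, 1, 6, 2]:
    s.add(x)
print(4 in s, 7 in s)   # True False
```

Sizing the table to roughly twice the expected element count keeps the load factor low, which is exactly the n/m ratio in the O(n * n/m) figure above.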

By: vs on Tue, 04 Oct 2011 04:28:48 +0000

Arden,

A great solution to a problem that's seen on many interview routes! Well done! Again, I appreciate the way you present the least optimal solutions first and slowly lead toward the optimal one. This is a great interview strategy too. Very nice!

By: George on Tue, 20 Sep 2011 22:35:04 +0000

I came up with the hashtable solution in the first place. The complexity is O(n+m), where n is the input size and m is the number of hashtable keys, which is linear.
Keep up the good work. Nice start.
