So I think it would be O(n log n) rather than O(n).
Implementing our own hash table is a great idea. In my web search course last semester, I remember looking for a hash table implementation that accepts a size hint at construction, so that it would perform fewer resize operations, because I already knew I would insert millions of elements while implementing a search engine. I think the default size of a dictionary in Python is 8 slots, and the load factor threshold for resizing is 2/3. During a resize the size is multiplied by 4, unless the table is already big (over 50,000 entries), in which case it only doubles.
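Since Python’s dict does not take a size hint, one rough way to watch the resizing happen is to track sys.getsizeof() as you insert (a small sketch, assuming CPython, where getsizeof reflects the table’s allocation; the exact thresholds and growth factors are implementation details that vary by version):

```python
import sys

# Watch CPython dict resizing: sys.getsizeof() jumps whenever the
# underlying table grows. Thresholds differ between CPython versions.
d = {}
last = sys.getsizeof(d)
for i in range(100):
    d[i] = None
    size = sys.getsizeof(d)
    if size != last:
        print(f"grew after {i + 1} inserts: {last} -> {size} bytes")
        last = size
```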
O(n * n/m)
where m is the number of buckets and n is the number of elements: each of the n lookups scans a chain whose average length is n/m. Assume m = 1, and now you have n^2, right?
@Arden, I think Python “set” is not always O(1) on find and insert, as documented here. If you could somehow instantiate the set with a hint for the number of buckets, you could choose m = n and achieve expected O(1).
However, if you just instantiate it as set(), do not avoid duplicate pairs, and assume (x,y) != (y,x), then the underlying Python “set” implementation needs to do bucketing, which can lead to O(n) per find in the worst case, as documented.
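To make that worst case concrete, here is a small sketch (the Colliding class is made up for illustration) that defeats the hashing with a constant hash value, so every insert into a plain set() must probe past all existing keys:

```python
import time

class Colliding:
    """Hypothetical key type whose constant hash forces every element
    into the same slot chain, exposing the set's worst-case behavior."""
    def __init__(self, x):
        self.x = x
    def __hash__(self):
        return 42            # every key collides
    def __eq__(self, other):
        return isinstance(other, Colliding) and self.x == other.x

for n in (500, 1000, 2000):
    start = time.perf_counter()
    s = set()
    for i in range(n):
        s.add(Colliding(i))  # each insert probes past every existing key: O(n) per op
    print(f"n={n}: {time.perf_counter() - start:.3f}s")
```

As n doubles, the runtime roughly quadruples, which is the O(n^2) total cost the formula above predicts.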
I think it is currently not possible to specify the number of buckets for the hash table underlying the set implementation. Python can be problematic here; Java also maintains its hash table with a “load factor”, though its HashMap at least lets you pass an initial capacity. We should definitely implement our own hash table… :) What do you think?
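For what it’s worth, here is a minimal sketch of what such a pre-sized table could look like (SizedHashTable and its methods are made-up names, not from any library): a chained table that allocates one bucket per expected element up front, so it never resizes and keeps average chains near length 1.

```python
class SizedHashTable:
    """Minimal chained hash table that takes the expected element count
    as a constructor hint, so it never resizes (illustrative sketch)."""

    def __init__(self, expected_items):
        # One bucket per expected item keeps average chain length near 1.
        self.m = max(8, expected_items)
        self.buckets = [[] for _ in range(self.m)]

    def _bucket(self, key):
        return self.buckets[hash(key) % self.m]

    def insert(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)   # overwrite existing key
                return
        bucket.append((key, value))

    def find(self, key):
        for k, v in self._bucket(key):
            if k == key:
                return v
        raise KeyError(key)

# e.g. pre-size for the millions of pairs we expect to insert
table = SizedHashTable(expected_items=1_000_000)
table.insert(("web", "search"), 1)
assert table.find(("web", "search")) == 1
```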
A great solution to a problem that comes up in many interviews! Well done! Again, I appreciate the way you present the least optimal solutions first and slowly lead towards the optimal one. This is a great interview strategy too. Very nice!