So I think it would be O(n log n) rather than O(n).
Implementing our own hash table is a great idea. In my web search course last semester, I remember looking for a hash table implementation that accepts a size hint at construction, so that it would perform fewer resize operations, because I already knew I would insert millions of elements while implementing a search engine. I think the default size of a dictionary in Python is 8 slots, and the load factor threshold for resizing is 2/3. During a resize the size is multiplied by 4, unless the table is already big (over 50,000 entries), in which case it only doubles.
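Since Python’s dict does not take a size hint, one rough way to watch the resizing happen is to track sys.getsizeof() as you insert (a small sketch, assuming CPython, where getsizeof reflects the table’s allocation; the exact thresholds and growth factors are implementation details that vary by version):

```python
import sys

# Watch CPython dict resizing: sys.getsizeof() jumps whenever the
# underlying table grows. Thresholds differ between CPython versions.
d = {}
last = sys.getsizeof(d)
for i in range(100):
    d[i] = None
    size = sys.getsizeof(d)
    if size != last:
        print(f"grew after {i + 1} inserts: {last} -> {size} bytes")
        last = size
```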
O(n * n/m)
where m is the number of buckets and n is the number of elements: each of the n lookups scans a chain whose average length is n/m. Assume m = 1, and now you have n^2, right?
@Arden, I think Python “set” is not always O(1) on find and insert, as documented here. If you could somehow instantiate the set with a hint for the number of buckets, you could choose m = n and achieve expected O(1).
However, if you just instantiate it as set(), do not avoid duplicate pairs, and assume (x,y) != (y,x), then the underlying Python “set” implementation needs to do bucketing, which can lead to O(n) per find in the worst case, as documented.
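To make that worst case concrete, here is a small sketch (the Colliding class is made up for illustration) that defeats the hashing with a constant hash value, so every insert into a plain set() must probe past all existing keys:

```python
import time

class Colliding:
    """Hypothetical key type whose constant hash forces every element
    into the same slot chain, exposing the set's worst-case behavior."""
    def __init__(self, x):
        self.x = x
    def __hash__(self):
        return 42            # every key collides
    def __eq__(self, other):
        return isinstance(other, Colliding) and self.x == other.x

for n in (500, 1000, 2000):
    start = time.perf_counter()
    s = set()
    for i in range(n):
        s.add(Colliding(i))  # each insert probes past every existing key: O(n) per op
    print(f"n={n}: {time.perf_counter() - start:.3f}s")
```

As n doubles, the runtime roughly quadruples, which is the O(n^2) total cost the formula above predicts.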
I think it is currently not possible to specify the number of buckets for the hash table underlying the set implementation. Python can be problematic here; Java also maintains its hash table with a “load factor”, though its HashMap at least lets you pass an initial capacity. We should definitely implement our own hash table… :) What do you think?
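For what it’s worth, here is a minimal sketch of what such a pre-sized table could look like (SizedHashTable and its methods are made-up names, not from any library): a chained table that allocates one bucket per expected element up front, so it never resizes and keeps average chains near length 1.

```python
class SizedHashTable:
    """Minimal chained hash table that takes the expected element count
    as a constructor hint, so it never resizes (illustrative sketch)."""

    def __init__(self, expected_items):
        # One bucket per expected item keeps average chain length near 1.
        self.m = max(8, expected_items)
        self.buckets = [[] for _ in range(self.m)]

    def _bucket(self, key):
        return self.buckets[hash(key) % self.m]

    def insert(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)   # overwrite existing key
                return
        bucket.append((key, value))

    def find(self, key):
        for k, v in self._bucket(key):
            if k == key:
                return v
        raise KeyError(key)

# e.g. pre-size for the millions of pairs we expect to insert
table = SizedHashTable(expected_items=1_000_000)
table.insert(("web", "search"), 1)
assert table.find(("web", "search")) == 1
```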
A great solution to a problem that comes up in many interviews! Well done! Again, I appreciate the way you present the least optimal solutions first and slowly lead towards the optimal one. This is a great interview strategy too. Very nice!