Telephone problem - willing to pay for help...

aaapittsburgh

New member
Joined
Jan 8, 2007
Messages
3
Hi,

The following is a problem I need help with and I am willing to pay for help if necessary. Any help with it would be greatly appreciated.

Two tables in a database:

Table1 contains a list of phone numbers
Table2 contains a list of phone numbers as well

I would like to create a Table3, in which Table3 contains all numbers
from Table1 that is not in Table2. I am looking for the shortest
runtime possible, keeping in mind that you can use whatever method(s)
you deem necessary.

Table1 contains 30 Million rows,

Table2 contains 2000 rows.

given a regular SQL expression, it will yield Big O(m*n)

Where m = rowcount of Table1
and n = rowcount of Table2

Generate for me, a method in which, runtime will yield Big O (m log2
n).

I don't need code, I want to hear your logic. Table1 is customers, Table2 is a
list of prepaid phone numbers. Table3 is list of people to bill.
 
We have a contributor, Dennis, who is a first rate programmer.
You can only hope that he will answer this problem.
But please know, this is not a problem in mathematics.
There is no way that this problem can be said to be “well defined”.
 
pka said:
We have a contributor, Dennis, who is a first rate programmer.
You can only hope that he will answer this problem.
But please know, this is not a problem in mathematics.
There is no way that this problem can be said to be “well defined”.
I understand. I'm more interested in the logic behind it than a clear cut answer. All the SQL stuff isn't really required.
 
What's in "a row"? The 10 digits of a phone number?

Well, with only 1 in 15,000 (30 million / 2000) being prepaid,
seems to me that the "logic" is to skip all "zero balances owing"
during the "invoicing run".

Anyway, I've never done any "database programming", so can't
make any money off you :shock:
 
At first, I thought the answer was simply:

select phone_no from table1
minus
select phone_no from table2;

But the guy who wrote the problem told be this was incorrect. His exact response to me was:

it's not too simplistic, but it is incorrect. This will still
yield
a big O(m*n).

Give it one more try, you are thinking too much in terms of DB..

Ask yourself, what are the only structures that would yield
BigO(nlog2n)? Answer that, and you will get your answer..


Any ideas?
 
What d'heck is a "BigO" :shock:
(almost BingO!)

To me, it could be the "o" in King K"o"ng :idea:
 
Usually when theres a log of base 2 involved, you'll likely be dealing with a binary tree structure (or similar) or a sort such as merge-sort.

Heres one way I've come up with. Its a modification of QuickSort. Create a pointer table that connects the two tables into a single table. I will now refer to this as "the table" from now on. This is O(1) so we won't count it. Now, choose a value in our new table between 1 and m (to take from the first table's values). This is O(1) also. Now see if your pivot is equal to the (m+1)th value in your table (This is really the first value of table 2, since we concatenated table 2 on the end of table 1). If it is, discard it. This is O(1). If not, temporarily switch your pivot with the (m+1)th spot and quick-sort the values bewteen (m+1) and (m+n) using (m+1) as a pivot. Note this is only one iteration of a QuickSort and we are not actually quicksorting a table, so it takes O(logn). Once in place, check to see if the number to the left or right is equal to the value which was (m+1). This is O(1). If not, you know the value you chose is in table 1 but NOT in table 2. Add this value to table 3 and switch back the pivot. This is also O(1).

Now, we have (O(1)+O(1)+O(1)+O(1)+O(logn)) = O(logn) for one iteration. By progressively doing this for every value in our pointer array (the one concatenating table 1 and 2) between 1 and m, we get \(\displaystyle \L mO(logn) = O(mlogn).\)

How much money do I get? :D :D
-Daon
 
Top