Tuesday, December 5. 2006Steve the Gambler - Caught by the IRSTrackbacks
Trackback specific URI for this entry
No Trackbacks
Comments
Display comments as
(Linear | Threaded)
I wrote a fairly naive algorithm in ocaml which you can see at
http://rafb.net/paste/results/itG6Ak89.html The basic idea is just to recursively go through the list and for each element, consider solving the problem with/without that element in the solution. Despite the naivety of this solution, finding the answer for 9 days takes between 1 and 2 seconds after compiling with the ocaml optimizing compiler. C:\temp>camlprog -days 9 List is: 87 -40 -65 54 -45 -12 -12 25 8 Sum is: 0 An interesting thing to note is that there is a perfect solution (totaling 0) for any number of days from 3 to 28. Only 1, 2, 29, and 30 don't have perfect solutions. C:\temp>camlprog -days 3 List is: -45 38 7 Sum is: 0 ... C:\temp>camlprog -days 28 List is: -50 -21 13 171 14 -58 109 4 -23 -44 -98 -121 101 33 87 -121 -40 -65 43 54 -45 -12 -12 38 25 3 7 8 Sum is: 0
Wouldn't any solution less than zero (a net loss gambling) result in no fine for Steve? Thus -20 dollars would be better than +5 dollars?
Lets imagine that the IRS fine you based on how incorrect your taxes were, in either direction, okay
Used Erlang, found several solutions that worked. Here's one:
[-12,-12,-23,-50,54,25,3,7,8]
here's an erlang solution that runs in around 120 microseconds on my powerbook
http://rafb.net/paste/results/PAoNnd38.html I basically took andrews solution and speed it up by some orders of magnitude by removing the without path when it wasn't necessary to run
Since the url above will expire in 24 hours I figured I better just paste it here...
-module(irs). -compile(export_all). data() -> [-50,-21,13,171,14,-42,-58,109,4,7,-23,-44,-98,-121,101, 33,87,-121,-40,-65,43,54,-45,-12,-12,38,25,3,7,8]. go() -> find(9, data(), 0). find(N, L, _) when N == 0;L == [] -> []; find(N, L, _) when length(L) == N -> L; find(N, [H|T], S) -> With = [H | find(N-1, T, S - H)], case lists:sum(With) of S -> With; WS -> Without = find(N, T, S), case abs(WS - S) < abs(lists:sum(Without) - S) of true -> With; _ -> Without end end.
Cheap, cheap hack that nevertheless found the solution in under two minutes. Should not scale well, though.
-- module IRS where import Random import Control.Monad import Data.List select [] x = [] select (l:ls) x = (x !! l) : (select ls x) sumSelect l x = sum (select l x) getList n = [getStdRandom (randomR (1,9)) >>= return | x>=(putStr . (++" ") . (" "++) . show )) (getList n) main = do { x
Bendick, your solution still has the problem that it goes down paths that are not needed...
take this version for example, it runs in 68k of memory rather than 25mb and is 3 orders of magnitude faster even without the memoization (define (steve days off-by l) (if (zero? days) (list off-by) (if (null? l) #f (let ((first-try (steve (- days 1) (+ off-by (car l)) (cdr l)))) (if (and first-try (= 0 (car first-try))) (cons (car first-try) (cons (car l) (cdr first-try))) (let ((second-try (steve days off-by (cdr l)))) (if (not first-try) second-try (let ((first-fixed (cons (car first-try) (cons (car l) (cdr first-try))))) (if (or (not second-try) (< (abs (car first-try)) (abs (car second-try)))) first-fixed second-try))))))))) (define (answer days list-of-days) (cdr (steve days 0 list-of-days))) (time (answer 9 '(-50 -21 13 171 14 -42 -58 109 4 7 -23 -44 -98 -121 101 33 87 -121 -40 -65 43 54 -45 -12 -12 38 25 3 7 8 ))) 3 ms real time 2 ms cpu time (2 user, 0 system) no collections 68224 bytes allocated no minor faults no major faults
Hi Jason. I thought about checking to see if the first try was exactly correct and only try the second otherwise, but it doesn't change the asymptotic runtime.
Furthermore, without the memoization, your code is exponetial. Try running it on a dataset that doesn't have any solution exactly equal to zero. Try the example dataset with just positive values. (time (answer 9 (map abs '(-50 -21 13 171 14 -42 -58 109 4 7 -23 -44 -98 -121 101 33 87 -121 -40 -65 43 54 -45 -12 -12 38 25 3 7 Let me know if it ever returns. =]
Des, you've gotta nix the mandatory smileys on comments. =]
The erlang solution runs in around 8 seconds with an all positive data set which while slower isn't all that bad.
I'd never done scheme before yesterday so I'm not sure why the other version is so much slower in the worst case I'd ask if you'd prefer a solution that took 2ms for the cases where there is a solution and 8sec when there is not one or 2 seconds for all of them but I suppose you could mix and match and have both worlds.
My Java solution is admittedly rather low-level, even for Java (would look almost the same in C), but with the example data, it takes just 66ms - for 10000 iterations. The worst case is at 500ms for 1 iteration. Six orders of magnitude difference!
public class SteveTheGambler { final int[] monthsTakings = {-50,-21,13,171,14,-42,-58,109,4,7,-23,-44,-98,-121,101, 33,87,-121,-40,-65,43,54,-45,-12,-12,38,25,3,7,8,}; final int n=9; int bestSum = Integer.MAX_VALUE; int[] bestSelection = new int[n]; public void compute() { select(new int[n], 0,0,0); } private void select(int[] selection, int i, int sum, int nSelected) { if(nSelected == n) { if(Math.abs(sum) < Math.abs(bestSum)) { bestSum = sum; System.arraycopy(selection, 0, bestSelection, 0, n); } return; } for( ; i
and here's the second half... (HTML formatting!)
for( ; i <= monthsTakings.length-(n-nSelected); ++i) { int elt = monthsTakings[i]; selection[nSelected] = elt; select(selection, i+1, sum+elt, nSelected+1); if(bestSum == 0) return; }; } }
Just for the record, this problem is known to be NP-complete, http://en.wikipedia.org/wiki/Subset_sum_problem
Bendick's memoization approach disappeared so I didn't get a chance to look at it, but there are some ways to further optimize the search solutions.
For example, by breaking apart the positive and negative lists and sorting them, it restricts the number of options to try at any point. If a running sum is positive, we know the any solution will contain a negative number, and vice versa. If we have a positive sum and no negative numbers, the best thing to do is to just take the smallest remining positive numbers and return immediately. That ends up cutting a lot of branches out of the search space. HTML version of the code: http://www.theaboutbox.com/code/steve-irs You can download the actual code here. http://www.theaboutbox.com/code/steve-irs/steve-irs.ml This algorithm seems to take about the same amount of time whether or not there is a solution. (2ms on my machine, FWIW) Thinking about the problem in the context of a mobile phone network may beg a different solution. If this maps onto some sort of routing or flow control algorithm, I can imagine it being more important to come up with a good solution in a fixed period of time (with a much larger dataset) than converge on the exact solution...
Hi Cameron,
Bendicks code is here: http://destraynor.com/bendick.txt I had to move it as the formatting was causing oddities in the page display. Des
I've posted a solution in perl here: http://www.lazycodemonkey.com/?p=37
It was fairly challenging - maybe because I haven't written anything like this in a while. About 50 ms total time to run. Out of curiosity, how long did it take you guys to write this?
Delphi Implementation takes 219ms to find all 33494 Solutions, 0ms to find the first.
|
About:Switch to Dark on Light!
This website is the online diary of me, Des Traynor, a User Experience Researcher in Dublin, Ireland. I work with Contrast. I usually write on 5 topics: I update about 3-4 times per month. Be sure to subscribe so you don't miss this good stuff. If this is your first time here, check out the archives.My official homepage provides more information about who I am, and what I research. You can contact me at destraynor [at] gmail [dot] com Quicksearch |