Python Development, memoized functions.

Fri Jan 5 19:17:55 CET 2007

There are two basic approaches to memoizing:

1) Keep a table of all values up to some N, and if asked to compute 
a(n) with n>N, evaluate at each value from N+1 through n (updating N to 
n in the process).  This can be extended to a multi-dimensional table 
for functions of more than one argument.

2) Keep a look up table keyed by all function arguments, with the value 
of the function for those arguments.

Each of these has advantages and disadvantages, depending on the 
function and the algorithm.  (1) tends to be the best choice when a(n) 
depends on a(n-1) and perhaps also earlier values; when looking at only 
scattered other values, (2) is likely to be better.  In some cases, the 
recursion may not use descent - that is, for some n, a(n) will depend 
on a(m) for some m > n; in such cases, approach (1) will not work.

On the other hand, if the first call to the function involves deep 
recursion, that recursion will actually occur using approach (2), with 
all the problems that causes.  Further, approach (2) usually involves 
more overhead than approach (1), which can be important.

(Approach (2) also produces some strange dependencies on the details of 
the definition.  Assuming left-to-right evaluation order,
F(n) = if n < 2 then n else F(n-1) + F(n-2)
and making a first call to F(1000) will recurse to depth approximately 
1000, while
F(n) = if n < 2 then n else F(n-2) + F(n-1)
will recurse to depth approximately 500.  The total work required is 
comparable for the two definitions, but the first will require more 
memory.  Of course, if the expression is evaluated right-to-left, this 
will reversed - and there is generally no control over order of 
evaluation.)

How you are going to use the function can also make a difference.  If 
you're going to call it for every value from 1 to some n, approach (1) 
is probably better if it is workable at all.  If you're just going to 
be making a few scattered calls, approach (2) may be superior.  (This 
consideration is secondary, since a library must actually select an 
approach, and a generally good one can usually be identified.)

Further, some functions will work best with variations of one of the 
basic approaches.  For example, if it always holds that a(2n) = a(n), 
it may be better to store a(n) only for odd n, and factor out all 2's 
before looking up the value.

Summary: one size fits all memoization is not sufficient.

Franklin T. Adams-Watters

P.S. A note on names: using something like "definec" instead of 
"define" for a memoized version is not really a good idea.  One can 
glance at this and not notice the memoization - which is fine some of 
the time, but not always.  Better to use a name like "definememoize" 
(or "define_memoize", or maybe "define_list_memo"), so that both the 
similarity to "define" and the difference therefrom are evident.
________________________________________________________________________
Check Out the new free AIM(R) Mail -- 2 GB of storage and 
industry-leading spam and email virus protection.