Python Development, memoized functions.
franktaw at netscape.net
franktaw at netscape.net
Fri Jan 5 19:17:55 CET 2007
There are two basic approaches to memoizing:
1) Keep a table of all values up to some N, and if asked to compute
a(n) with n>N, evaluate at each value from N+1 through n (updating N to
n in the process). This can be extended to a multi-dimensional table
for functions of more than one argument.
2) Keep a look up table keyed by all function arguments, with the value
of the function for those arguments.
Each of these has advantages and disadvantages, depending on the
function and the algorithm. (1) tends to be the best choice when a(n)
depends on a(n-1) and perhaps also earlier values; when looking at only
scattered other values, (2) is likely to be better. In some cases, the
recursion may not use descent - that is, for some n, a(n) will depend
on a(m) for some m > n; in such cases, approach (1) will not work.
On the other hand, if the first call to the function involves deep
recursion, that recursion will actually occur using approach (2), with
all the problems that causes. Further, approach (2) usually involves
more overhead than approach (1), which can be important.
(Approach (2) also produces some strange dependencies on the details of
the definition. Assuming left-to-right evaluation order,
F(n) = if n < 2 then n else F(n-1) + F(n-2)
and making a first call to F(1000) will recurse to depth approximately
1000, while
F(n) = if n < 2 then n else F(n-2) + F(n-1)
will recurse to depth approximately 500. The total work required is
comparable for the two definitions, but the first will require more
memory. Of course, if the expression is evaluated right-to-left, this
will reversed - and there is generally no control over order of
evaluation.)
How you are going to use the function can also make a difference. If
you're going to call it for every value from 1 to some n, approach (1)
is probably better if it is workable at all. If you're just going to
be making a few scattered calls, approach (2) may be superior. (This
consideration is secondary, since a library must actually select an
approach, and a generally good one can usually be identified.)
Further, some functions will work best with variations of one of the
basic approaches. For example, if it always holds that a(2n) = a(n),
it may be better to store a(n) only for odd n, and factor out all 2's
before looking up the value.
Summary: one size fits all memoization is not sufficient.
Franklin T. Adams-Watters
P.S. A note on names: using something like "definec" instead of
"define" for a memoized version is not really a good idea. One can
glance at this and not notice the memoization - which is fine some of
the time, but not always. Better to use a name like "definememoize"
(or "define_memoize", or maybe "define_list_memo"), so that both the
similarity to "define" and the difference therefrom are evident.
________________________________________________________________________
Check Out the new free AIM(R) Mail -- 2 GB of storage and
industry-leading spam and email virus protection.
More information about the SeqFan
mailing list