Python Development, memoized functions.

Antti Karttunen antti.karttunen at gmail.com
Fri Jan 5 02:17:21 CET 2007


Alec Mihailovs wrote:

> From: "Antti Karttunen" <antti.karttunen at gmail.com>
>
>> @M
>> def A000045(n):
>>     if n < 2: return n
>>     else: return A000045(n-1) + A000045(n-2)
>
>
> Another way of doing this is to define A000045_list first (that saves 
> all the calculated values in a list), and then define A000045 as the 
> last element of the list. Such as
>
> def A000045_list(n):
>     result = [0]
>     a, b = 0, 1
>     for i in range(n):
>         result.append(b)
>         a, b = b, a+b
>     return result
>
> def A000045(n): return A000045_list(n)[-1]
>
> That should calculate values faster than the recursive procedure with 
> memoization because that avoids multiple function calls.  
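(For reference: the memoizing decorator `M` assumed in the first snippet is never shown in this thread. A minimal sketch of what it might look like — the name `M` and the dict-based cache are my assumptions, not the original poster's code:

```python
# A minimal sketch of a memoizing decorator like the "M" assumed above;
# the original decorator is not shown, so this is a guess at its shape.
def M(f):
    cache = {}
    def wrapper(n):
        if n not in cache:
            cache[n] = f(n)  # compute once, then reuse on later calls
        return cache[n]
    return wrapper

@M
def A000045(n):
    return n if n < 2 else A000045(n-1) + A000045(n-2)

print(A000045(30))  # 832040
```
)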


What is the speed improvement of the latter list version over the former 
memoized version,
compared to the speed improvement of a version written in assembly 
language over the Python list version above?
And compare the corresponding ratios for legibility and ease of use.
(My point: in any case, Python is slow compared to assembly, C, or even 
Scheme compiled to native code.)

Point against my view ("Advocatus Diaboli"): A one-liner like:

  def A000045(n): return n if n < 2 else A000045(n-1) + A000045(n-2)

is no longer a "program"; it is a formula/definition, which gives no 
more information
than the %F line itself: a(0)=0, a(1)=1, a(n) = a(n-1)+a(n-2).

"Promotor Fidei": On the other hand, if it can be computed efficiently, 
then it's a very good program.
(Brief and very legible at the same time.) Why should we need to 
specify ancillary
information about the allocation of lists et cetera, when that doesn't 
affect the mathematical
definition of the function and only makes the implementation more error-prone?
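A note in support of this point: later versions of Python ship exactly such a memoizer in the standard library (functools.lru_cache, added in Python 3.2, i.e. after this post was written), so the one-line mathematical definition can be kept verbatim:

```python
from functools import lru_cache

@lru_cache(maxsize=None)  # unbounded cache: every computed value is memoized
def A000045(n):
    return n if n < 2 else A000045(n-1) + A000045(n-2)

print(A000045(90))  # 2880067194370816120
```

No list allocation or maximum index appears anywhere; the code is the %F line plus one decorator.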

> Alec Mihailovs wrote:
> From: "Alec Mihailovs" <alec at mihailovs.com>
> To: "Antti Karttunen" <antti.karttunen at gmail.com>
>
>>> What will happen when you call A000045 multiple times, with varying 
>>> n (increasing and decreasing
>>> at times)?
>>
>>
>> In this case, certainly, memoization has an advantage.
>
>> From another point of view, that can also be done by assigning 
>
> A000045 = A000045_list(1000)
> (or with some number other than 1000, greater than the largest index
> needed), and then using A000045[n] instead of A000045(n) - that would 
> be the same as with memoization.

How do you know what the maximum index needed will be?
E.g. if somebody wants to calculate something like 
http://www.research.att.com/~njas/sequences/A001605 ?
Or http://www.research.att.com/~njas/sequences/A003714 ,
which itself might be called from yet another function?
In the latter case the calls to A000045 don't need to occur
in the order A000045(i), A000045(i+1), A000045(i+2), ...
but could happen in any order. (With the Fibonacci numbers we know that
the "memo-vector" will not contain "holes", but this is not the case
in general, with functions that involve recursions like
a(n) = a(floor(n/2)) + ... )
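A sketch of that last situation, using a dict as the memo so that sparse ("holey") index sets cost nothing. The recurrence a(n) = a(floor(n/2)) + n here is just an illustration of the shape mentioned above:

```python
# Dict-based memo for a recurrence of the shape a(n) = a(floor(n/2)) + n.
# Computing a(10**6) only ever touches indices 10**6, 5*10**5, ..., 1, 0,
# so a list-based memo of length 10**6 would be almost entirely "holes".
memo = {0: 0}

def a(n):
    if n not in memo:
        memo[n] = a(n // 2) + n
    return memo[n]

print(a(10**6))   # 1999993
print(len(memo))  # 21 cache entries, not a million
```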


>
> Alec
>

Antti






More information about the SeqFan mailing list