This notebook illustrates amassing a medium-sized dataset (essentially one second of 32-bit float mono audio: a simple sine wave) using the record_history decorator, and then using the stats.history_as_DataFrame attribute to obtain that dataset as a Pandas DataFrame.

After that basic (and naive) example, we compare the performance of this approach to one that uses numpy's ability to vectorize the same computation, and conclude that if you can vectorize, then by all means do so. (The example is naive precisely because no one would call the function f below in a for loop when it's possible to vectorize with numpy universal functions (ufuncs). When that's not an option, however, record_history can come in handy.)

Using the stats.history_as_DataFrame attribute

In [1]:
%matplotlib inline
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from log_calls import record_history
In [2]:
@record_history()
def f(freq, t):
    return np.sin(freq * 2 * np.pi * t)
In [3]:
ran_t = np.arange(0.0, 1.0, 1/44100, dtype=np.float32)
ran_t
Out[3]:
array([  0.00000000e+00,   2.26757365e-05,   4.53514731e-05, ...,
         9.99931931e-01,   9.99954641e-01,   9.99977291e-01], dtype=float32)

Now, naively, call f 44,100 times in a for loop, and obtain its call history as a Pandas DataFrame:

In [4]:
#f.stats.clear_history()    # uncomment if f has already been called
for t in ran_t: 
    f(17, t)
df = f.stats.history_as_DataFrame

Examine and do stuff with it:

In [5]:
df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 44100 entries, 1 to 44100
Data columns (total 8 columns):
freq              44100 non-null int64
t                 44100 non-null float64
retval            44100 non-null float64
elapsed_secs      44100 non-null float64
CPU_secs          44100 non-null float64
timestamp         44100 non-null object
prefixed_fname    44100 non-null object
caller_chain      44100 non-null object
dtypes: float64(4), int64(1), object(3)
memory usage: 3.0 MB
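With the history in hand as a DataFrame, ordinary pandas aggregation applies to the timing columns. A minimal sketch, using a small hand-built stand-in for `f.stats.history_as_DataFrame` (the column names match `df.info()` above; the values are illustrative, not measured):

```python
import pandas as pd

# Hypothetical stand-in for f.stats.history_as_DataFrame: three rows with
# the same timing columns that df.info() reported above.
hist = pd.DataFrame({
    "elapsed_secs": [0.000040, 0.000016, 0.000015],
    "CPU_secs":     [0.000040, 0.000017, 0.000015],
})

total_secs = hist["elapsed_secs"].sum()       # total wall-clock time recorded
mean_us = hist["elapsed_secs"].mean() * 1e6   # mean per-call time, in microseconds
print(total_secs, mean_us)
```

The same `sum()`/`mean()` calls work unchanged on the real 44,100-row history.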

In [6]:
from IPython.display import display
display(df.head())
display(df.tail())
          freq         t    retval  elapsed_secs  CPU_secs                 timestamp prefixed_fname caller_chain
call_num
1           17  0.000000  0.000000      0.000040  0.000040  11/29/14 00:12:24.155204            'f' ['<module>']
2           17  0.000023  0.002422      0.000016  0.000017  11/29/14 00:12:24.155441            'f' ['<module>']
3           17  0.000045  0.004844      0.000015  0.000015  11/29/14 00:12:24.155632            'f' ['<module>']
4           17  0.000068  0.007266      0.000015  0.000015  11/29/14 00:12:24.155802            'f' ['<module>']
5           17  0.000091  0.009688      0.000014  0.000014  11/29/14 00:12:24.155968            'f' ['<module>']

          freq         t    retval  elapsed_secs  CPU_secs                 timestamp prefixed_fname caller_chain
call_num
44096       17  0.999887 -0.012109      0.000014  0.000014  11/29/14 00:12:32.091139            'f' ['<module>']
44097       17  0.999909 -0.009690      0.000014  0.000014  11/29/14 00:12:32.091301            'f' ['<module>']
44098       17  0.999932 -0.007271      0.000014  0.000013  11/29/14 00:12:32.091473            'f' ['<module>']
44099       17  0.999955 -0.004845      0.000014  0.000013  11/29/14 00:12:32.091644            'f' ['<module>']
44100       17  0.999977 -0.002426      0.000014  0.000013  11/29/14 00:12:32.091808            'f' ['<module>']
In [7]:
len(f.stats.history)
Out[7]:
44100
In [8]:
df[['t', 'retval']].head()
Out[8]:
                 t    retval
call_num
1         0.000000  0.000000
2         0.000023  0.002422
3         0.000045  0.004844
4         0.000068  0.007266
5         0.000091  0.009688
In [9]:
plt.plot(df.t, df.retval);

Comparing performance: record_history vs vectorization with numpy ufuncs

(1) No decorator

With no decorator, it's ~2.8 orders of magnitude faster to apply the function to the whole ran_t array at once (letting np.sin do the work as a numpy ufunc) than to call it in a for loop over ran_t.

In [33]:
def g(freq, t):
    return np.sin(freq * 2 * np.pi * t)
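g vectorizes "for free" because np.sin is a ufunc: pass an array for t and the arithmetic broadcasts across every element in compiled code, with no Python-level loop. A quick sanity check (the sample points here are chosen for illustration):

```python
import numpy as np

# np.sin is a ufunc, so one call evaluates the whole array element-wise.
t = np.array([0.0, 0.25, 0.5, 0.75], dtype=np.float32)
y = np.sin(1 * 2 * np.pi * t)   # freq = 1: one full cycle over t in [0, 1)
```

The quarter-cycle points come back as (approximately) 0, 1, 0, -1, confirming the element-wise evaluation.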

Vectorized:

In [34]:
Hz_17 = g(17, ran_t)
nodeco_vectorized_secs = %timeit -o Hz_17 = g(17, ran_t)
nodeco_vectorized_secs.best
1000 loops, best of 3: 445 µs per loop

Out[34]:
0.00044526603399572197
In [35]:
Hz_17
Out[35]:
array([ 0.        ,  0.00242209,  0.00484416, ..., -0.00727302,
       -0.00484692, -0.00242842], dtype=float32)
In [36]:
plt.plot(Hz_17);

With a for loop:

In [37]:
nodeco_loop_secs = %timeit -o for t in ran_t: g(17, t)
nodeco_loop_secs.best
1 loops, best of 3: 255 ms per loop

Out[37]:
0.2549935089991777
In [38]:
def comparison(slower, faster):
    'slower, faster: seconds'
    ratio = slower/faster
    order_of_magnitude = np.log10(ratio)
    return ratio, order_of_magnitude
In [39]:
print("With no decorator:\n"
      "Vectorized approach is %d times (about %.1f orders of magnitude) faster" 
      % comparison(slower=nodeco_loop_secs.best, faster=nodeco_vectorized_secs.best))
With no decorator:
Vectorized approach is 572 times (about 2.8 orders of magnitude) faster

Now let's compare the performance of the record_history-decorated version (f) of the same function.

(2a) record_history disabled with "true bypass" (enabled setting < 0)

With record_history disabled and "true-bypassed", it's 3.6 orders of magnitude faster to apply the function to the whole ran_t array at once than to call it in a for loop over ran_t.

In [40]:
f.stats.clear_history()
f.record_history_settings.enabled = -1

Vectorized:

In [41]:
vectorized_secs_rh_bypass = %timeit -o Hz_17 = f(17, ran_t)
vectorized_secs_rh_bypass.best
1000 loops, best of 3: 493 µs per loop

Out[41]:
0.0004927381759989658

With a for loop:

In [42]:
loop_secs_rh_bypass = %timeit -o for t in ran_t: f(17, t)
loop_secs_rh_bypass.best
1 loops, best of 3: 2.07 s per loop

Out[42]:
2.069394628997543
In [43]:
print("With record_history bypassed:\n"
      "Vectorized approach is %d times (about %.1f orders of magnitude) faster" 
      % comparison(slower=loop_secs_rh_bypass.best, faster=vectorized_secs_rh_bypass.best))
With record_history bypassed:
Vectorized approach is 4199 times (about 3.6 orders of magnitude) faster

Decorator overhead: about 8x (~0.9 orders of magnitude)

In [44]:
print("Called in a for-loop, the no-decorator version is %d times (about %.1f orders of magnitude) faster" 
      % comparison(slower=loop_secs_rh_bypass.best, faster=nodeco_loop_secs.best))
Called in a for-loop, the no-decorator version is 8 times (about 0.9 orders of magnitude) faster
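That 8x works out to a concrete per-call cost. A back-of-the-envelope calculation, using the two loop timings above (these figures are from this notebook's runs and are machine-dependent):

```python
# Per-call overhead of the bypassed decorator: the difference between the
# decorated and undecorated loop timings, spread over the 44,100 calls.
n_calls = 44100
undecorated_loop_secs = 0.2549935089991777   # nodeco_loop_secs.best above
bypassed_loop_secs = 2.069394628997543       # loop_secs_rh_bypass.best above
overhead_us = (bypassed_loop_secs - undecorated_loop_secs) / n_calls * 1e6
print(f"~{overhead_us:.0f} microseconds of overhead per call")
```

So even fully bypassed, the decorated call pays a few tens of microseconds per invocation, which is what a Python-level wrapper typically costs.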

(2b) record_history disabled (enabled == 0, without "true bypass")

With record_history disabled, it's 3.6 orders of magnitude faster to apply the function to the whole ran_t array at once than to call it in a for loop over ran_t.

In [45]:
f.record_history_settings.enabled = False

Vectorized:

In [46]:
vectorized_secs_rh_disabled = %timeit -o Hz_17 = f(17, ran_t)
vectorized_secs_rh_disabled.best
1000 loops, best of 3: 491 µs per loop

Out[46]:
0.000491085498004395

With a for loop:

In [47]:
loop_secs_rh_disabled = %timeit -o for t in ran_t: f(17, t)
loop_secs_rh_disabled.best
1 loops, best of 3: 2.09 s per loop

Out[47]:
2.0938441199978115
In [48]:
print("With record_history disabled:\n"
      "Vectorized approach is %d times (about %.1f orders of magnitude) faster" 
      % comparison(slower=loop_secs_rh_disabled.best, faster=vectorized_secs_rh_disabled.best))
With record_history disabled:
Vectorized approach is 4263 times (about 3.6 orders of magnitude) faster

The loop is 4263x slower than the vectorized call with enabled == 0, vs 4199x with "true bypass" (2.09 s vs 2.07 s), so "true bypass" provides a very slight speedup over merely disabling.

(3) record_history enabled

With record_history enabled, it's 4.1 orders of magnitude faster to apply the function to the whole ran_t array at once than to call it in a for loop over ran_t.

In [49]:
f.record_history_settings.enabled = True

Vectorized:

In [50]:
vectorized_secs_rh_enabled = %timeit -o f.stats.clear_history(); Hz_17 = f(17, ran_t)
vectorized_secs_rh_enabled.best
1000 loops, best of 3: 664 µs per loop

Out[50]:
0.000664062632000423
In [51]:
len(f.stats.history)
Out[51]:
1
In [52]:
def size_of_t_for_row(row):
    return f.stats.history[row].argvals[1].size

size_of_t_for_row(0)
Out[52]:
44100
In [53]:
f.stats.history[0].retval.size
Out[53]:
44100
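Called once on the whole array, f leaves a single history row whose argvals and retval hold the full 44,100-sample arrays. Since f just wraps np.sin, what that row's retval holds can be reproduced directly; the sketch below builds t with an equivalent construction that sidesteps the rounding pitfalls of a float-step arange:

```python
import numpy as np

# 44,100 samples of t in [0, 1), equivalent to the ran_t above.
n = 44100
ran_t = (np.arange(n) / n).astype(np.float32)
Hz_17 = np.sin(17 * 2 * np.pi * ran_t)  # what history[0].retval records
```

One row, two 44,100-element arrays: the history stays tiny while the data does not.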

With a for loop:

In [54]:
f.stats.clear_history()
loop_secs_rh_enabled = %timeit -o for t in ran_t: f(17, t); f.stats.clear_history()
loop_secs_rh_enabled.best
1 loops, best of 3: 8.58 s per loop

Out[54]:
8.57765375800227
In [55]:
print("With record_history enabled:\n"
      "Vectorized approach is %d times (about %.1f orders of magnitude) faster" 
      % comparison(slower=loop_secs_rh_enabled.best, faster=vectorized_secs_rh_enabled.best))
With record_history enabled:
Vectorized approach is 12916 times (about 4.1 orders of magnitude) faster
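To summarize the four configurations, the recorded best times (seconds; machine-dependent, copied from this notebook's runs) give:

```python
import numpy as np

# (loop time, vectorized time) per configuration, from the %timeit results above.
timings = {
    "no decorator":            (0.2549935089991777, 0.00044526603399572197),
    "bypassed (enabled = -1)": (2.069394628997543,  0.0004927381759989658),
    "disabled (enabled = 0)":  (2.0938441199978115, 0.000491085498004395),
    "enabled":                 (8.57765375800227,   0.000664062632000423),
}
ratios = {k: loop / vec for k, (loop, vec) in timings.items()}
for k, r in ratios.items():
    print(f"{k}: vectorized is {int(r)}x faster ({np.log10(r):.1f} orders of magnitude)")
```

Whatever record_history is doing, vectorizing dwarfs it; the decorator only widens a gap that numpy opens on its own.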