*[ Update 8 August 2012: We found that for KissFFT if the size of the input is not a power of 2, 3, and 5, then things really slow down. So now we're padding the input to the next power of 2.]*

*[ Update 6 July 2012: It turns out there's a slight correction needed to what I originally wrote. The correction is described on this page:*

- Literate Musings Blog: Autocorrelation Functions

*I'm fixing the presentation below to reflect the correction. The change is also reflected in the updated Stan code using Eigen, but not the updated Kiss FFT code.]*

Suppose you need to compute all the sample autocorrelations for a sequence of observations

`x = x[0],...,x[N-1]`

The most efficient way to do this is with the discrete fast fourier transform (FFT) and its inverse; it’s versus for the naive approach. That much I knew. I had both experience with Fourier transforms from my speech reco days (think spectograms) and an understanding of the basic functional analysis principles. I didn’t know how to code it given an FFT library. The web turned out to be not much help — the explanations I found were all over my head complex-analysis-wise and I couldn’t find simple code examples.

Matt Hoffman graciously volunteered to give me a tutorial and wrote an initial prototype. It turns out to be really really simple once you know which way ’round the parts all go.

### Autocorrelations via FFT

Conceptually, the input `N`

-vector `x`

is the time vector and the autocorrelations will be the frequency vector. Here’s the algorithm:

- create a centered version of
`x`

by setting`x_cent = x / mean(x)`

; - pad
`x_cent`

at the end with entries of value 0 to get a new vector of length`L = 2^ceil(log2(N))`

; - run
`x_pad`

through a forward discrete fast fourier transform to get an`L`

-vector`z`

of complex values; - replace the entries in
`z`

with their norms (the norm of a complex number is the real number resulting of summing the squared real component and squared imaginary component). - run
`z`

through the inverse discrete FFT to produce an`L`

-vector`acov`

of (unadjusted) autocovariances; - trim
`acov`

to size`N`

; - create a
`L`

-vector named`mask`

consisting of`N`

entries with value`1`

followed by`L-N`

entries with value`0`

; - compute the forward FFT of
`mask`

and put the result in the`L`

-vector`adj`

- to get adjusted autocovariance estimates, divide each entry
`acov[n]`

by`norm(adj[n])`

, where`norm`

is the complex norm defined above; and - to get autocorrelations, set
`acorr[n] = acov[n] / acov[0]`

(`acov[0]`

, the autocovariance at lag 0, is just the variance).

The autocorrelation and autocovariance `N`

-vectors are returned as `acorn`

and `acov`

respectively.

It’s really fast to do all of them in practice, not just in theory.

Depending on the FFT function you use, you may need to normalize the output (see the code sample below for Stan). Run a test case and make sure that you get the right ratios of values out in the end, then you can figure out what the scaling needs to be.

### Eigen and Kiss FFT

For Stan, we started out with a direct implementation based on Kiss FFT.

- Stan’s original Kiss FFT-based source code (C/C++)
*[***Warning:**this function does not have the correction applied; see the current Stan code linked below for an example]

At Ben Goodrich’s suggestion, I reimplemented using the Eigen FFT C++ wrapper for Kiss FFT. Here’s what the Eigen-based version looks like:

- Stan’s autocorrelation source code (C++)

As you can see from this contrast, nice C++ library design can make for very simple work on the front end.

Hat’s off to Matt for the tutorial, Thibauld Nion for the nice blog post on the mask-based correction, Kiss FFT for the C implementation, and Eigen for the C++ wrapper.

June 8, 2012 at 9:44 pm |

Nice point! Using FFTs to work in the frequency domain is also useful if you have to search for the time lag that maximises the correlation of a pair of signals, I think. Padding with zeros is important to avoid wrap-around errors.

June 10, 2012 at 11:43 am |

Yes, Fourier analysis is used to discover time periodicities. That’s sort of what it’s all about! If you think about linguistic issues like tracking formants in the frequency domain, it’s all about discovering maximal time periodicities.

Mitzi’s fascinated by George Church’s lab’s analysis of circadian rhythms of fruit flies — that involved Fourier analysis.

For natural language data, I’d think you could tease apart daily, weekly, monthly and annual cycles using raw data containing frequency of relevant terms like “weekend” and “Saturday” and “New Year’s” (though it does bring up that different cultures have different start of year dates and 24 time zones could be problematic. In fact, it’d make a neat project to do this the other way around. Calculate the autocorrelations of a bunch of terms and figure out which ones are periodic.

June 12, 2012 at 4:58 am

I actually did a simple version of what you just suggested years ago — we had a couple of years worth of newswire text with a date assigned to the day. We looked for words which were strongly auto-correlated with a 7-day lag; sports and financial terms pop out, if I recall correctly. Celebrations, holidays and financial things like taxes jump out if you look for a 365-day lag. We just looked at a few fixed lags (and we also tried correlating with the day of the month, which doesn’t correspond to a fixed lag as different months have different numbers of days).

But of course with an FFT we could have looked across all lags simultaneously — and we probably would have been able to publish on the basis of the cool maths it would have contained!

June 12, 2012 at 1:11 pm |

Oh wow, this certainly brings back some memories. I havent looked at FFT in many years but this technique was known as the fast way to compute autocorrelations in the DSP classes I took…. Take a look at Page 68 of the Optimum Signal Processing textbook.

http://eceweb1.rutgers.edu/~orfanidi/osp2e/

June 13, 2012 at 3:44 pm |

Yes, it’s very well known by the signal processing folks and the computational statisticians. It’s just that when I asked Matt, he said “it’s a consequence of the Wiener-Khinchin theorem,” which wasn’t quite enough for me to get to an algorithm.