*[ Update 8 August 2012: We found that for KissFFT if the size of the input is not a power of 2, 3, and 5, then things really slow down. So now we’re padding the input to the next power of 2.]*

*[ Update 6 July 2012: It turns out there’s a slight correction needed to what I originally wrote. The correction is described on this page:*

- Literate Musings Blog: Autocorrelation Functions

*I’m fixing the presentation below to reflect the correction. The change is also reflected in the updated Stan code using Eigen, but not the updated Kiss FFT code.]*

Suppose you need to compute all the sample autocorrelations for a sequence of observations

`x = x[0],...,x[N-1]`

The most efficient way to do this is with the discrete fast Fourier transform (FFT) and its inverse; it’s `O(N log N)` versus `O(N^2)` for the naive approach. That much I knew. I had both experience with Fourier transforms from my speech recognition days (think spectrograms) and an understanding of the basic functional analysis principles. I didn’t know how to code it given an FFT library. The web turned out to be not much help — the explanations I found were all over my head complex-analysis-wise and I couldn’t find simple code examples.
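For reference, the naive `O(N^2)` approach is easy to write down. Here’s a short Python/NumPy sketch of the standard biased estimator (my own code and names, not Stan’s; it’s handy later as a test oracle for the FFT version):

```python
import numpy as np

def autocorr_naive(x):
    """Naive O(N^2) sample autocorrelation (biased estimator: divide by N)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xc = x - x.mean()  # center the series
    # Autocovariance at lag k is the mean of the N - k overlapping products.
    acov = np.array([np.dot(xc[:n - k], xc[k:]) / n for k in range(n)])
    return acov / acov[0]  # normalize by the lag-0 autocovariance (variance)
```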

Matt Hoffman graciously volunteered to give me a tutorial and wrote an initial prototype. It turns out to be really really simple once you know which way ’round the parts all go.

### Autocorrelations via FFT

Conceptually, the input `N`-vector `x` is the time vector and the autocorrelations will be the frequency vector. Here’s the algorithm:

- create a centered version of `x` by setting `x_cent = x - mean(x)`;
- pad `x_cent` at the end with entries of value 0 to get a new vector `x_pad` of length `L = 2^ceil(log2(N))`;
- run `x_pad` through a forward discrete fast Fourier transform to get an `L`-vector `z` of complex values;
- replace the entries in `z` with their norms (the norm of a complex number is the real number resulting from summing the squared real component and the squared imaginary component);
- run `z` through the inverse discrete FFT to produce an `L`-vector `acov` of (unadjusted) autocovariances;
- trim `acov` to size `N`;
- create an `L`-vector named `mask` consisting of `N` entries with value `1` followed by `L-N` entries with value `0`;
- compute the forward FFT of `mask`, replace each entry with its norm (as defined above), and run the result through the inverse FFT, putting the result in the `L`-vector `adj` (entry `adj[n]` is just the number of product terms contributing at lag `n`);
- to get adjusted autocovariance estimates, divide each entry `acov[n]` by `adj[n]`; and
- to get autocorrelations, set `acorr[n] = acov[n] / acov[0]` (`acov[0]`, the autocovariance at lag 0, is just the variance).

The autocorrelation and autocovariance `N`-vectors are returned as `acorr` and `acov` respectively.
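Those steps can be sketched in Python with NumPy’s FFT (my own sketch, not Stan’s code). One deviation, flagged in the comments: I pad to a power of 2 that is at least `2N` so the FFT’s circular correlation matches the linear sums at every lag; NumPy’s `ifft` already divides by the length, so no extra normalization is needed here.

```python
import numpy as np

def autocorr_fft(x):
    """FFT-based sample autocorrelations with the mask-based adjustment."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    x_cent = x - x.mean()                       # center the series
    # Pad to a power of 2 that is >= 2N (my choice; the post pads to
    # 2^ceil(log2(N))) so circular correlation doesn't wrap around.
    L = 1 << int(np.ceil(np.log2(2 * n)))
    x_pad = np.zeros(L)
    x_pad[:n] = x_cent                          # zero-pad at the end
    z = np.fft.fft(x_pad)                       # forward FFT
    z = z.real ** 2 + z.imag ** 2               # replace entries with norms
    acov = np.fft.ifft(z).real[:n]              # inverse FFT, trimmed to N
    mask = np.zeros(L)                          # N ones followed by zeros
    mask[:n] = 1.0
    w = np.fft.fft(mask)                        # same FFT trick on the mask:
    adj = np.fft.ifft(w.real ** 2 + w.imag ** 2).real[:n]  # adj[k] == N - k
    acov = acov / adj                           # adjusted autocovariances
    return acov / acov[0]                       # acov[0] is the variance
```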

It’s really fast to do all of them in practice, not just in theory.

Depending on the FFT function you use, you may need to normalize the output (see the code sample below for Stan). Run a test case and make sure that you get the right ratios of values out in the end; then you can figure out what the scaling needs to be.
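Here’s the kind of sanity check I mean, as a Python/NumPy sketch (input values and names are my own): compare the raw FFT-based autocovariance sums against direct `O(N^2)` sums on a small input; any library-specific scaling shows up as a constant ratio you can divide out.

```python
import numpy as np

x = np.array([0.2, -1.0, 0.5, 1.3, -0.7])
xc = x - x.mean()
n = len(x)
L = 1 << int(np.ceil(np.log2(2 * n)))       # pad to >= 2N to avoid wraparound
xp = np.zeros(L)
xp[:n] = xc
f = np.fft.fft(xp)
raw_fft = np.fft.ifft(f.real ** 2 + f.imag ** 2).real[:n]

# Direct unnormalized autocovariance sums for comparison.
raw_direct = np.array([np.dot(xc[:n - k], xc[k:]) for k in range(n)])

# With NumPy the ratio is 1 because ifft divides by L; a library with an
# unnormalized inverse transform would give a constant ratio of L instead.
ratio = raw_fft / raw_direct
```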

### Eigen and Kiss FFT

For Stan, we started out with a direct implementation based on Kiss FFT.

- Stan’s original Kiss FFT-based source code (C/C++)
*[**Warning:** this function does not have the correction applied; see the current Stan code linked below for an example.]*

At Ben Goodrich’s suggestion, I reimplemented using the Eigen FFT C++ wrapper for Kiss FFT. Here’s what the Eigen-based version looks like:

- Stan’s autocorrelation source code (C++)

As you can see from this contrast, nice C++ library design can make for very simple work on the front end.

Hats off to Matt for the tutorial, Thibauld Nion for the nice blog post on the mask-based correction, Kiss FFT for the C implementation, and Eigen for the C++ wrapper.