Sometimes we have digitised data at a much higher rate than we need. How can we downsample the data? If I wanted to, say, halve the sample rate, can I just throw away every other data point?
The answer is NO, except in the special case where you know that there is no frequency content above the new Nyquist frequency.
What is the Nyquist frequency?
The Nyquist frequency is simply the frequency which is half the sample rate. If we had an original sample rate of say 20000 samples/second then the Nyquist frequency is 10000Hz. What this says is that the theoretical bandwidth of our signal is from 0 to 10000Hz. In practice, we would have used anti-aliasing filters as part of our digitisation process.
What are anti-aliasing filters?
These are simply low pass analogue filters which limit the frequency content of our signal so that we do not get any aliasing.
So what is aliasing and how is all this related to my need to downsample data?
To answer the second point first: when we decimate we are sampling again (the term often used is downsampling), so we will get aliasing if we just throw away intermediate data points. As for the first question, what is aliasing, this is best illustrated by a simple example. Suppose we have a very simple signal made up of two sinewaves, one at 15Hz and another smaller one at 120Hz, and suppose we have sampled it at, say, 512 samples/second. If we frequency analyse the signal we would get something like that shown in Figure 1.
The peaks at 15Hz and 120Hz are clearly identifiable. Suppose now we decimate by a factor of 4. We can do this in DATS using Copy Section of Dataset which is in the Data Manipulation menu. Our new sampling rate is 128 samples/second and our new Nyquist frequency is 64Hz. The frequency analysis of the (incorrectly) decimated signal is shown below.
We still have the signal at 15Hz just as before but what is the new one at 8Hz?
Well, that is the 120Hz signal, which has aliased to appear at 8Hz. This is a simple alias: the 120Hz component lies between the new Nyquist frequency (64Hz) and the new sample rate (128 samples/second), so it folds back to (new sample rate minus original frequency), which is 128 - 120 = 8Hz. The situation may be much worse as we may have a multiple alias. For instance, if we also had a signal at 300Hz, then after digitisation at 512 samples/second it would appear at 212Hz (512 - 300), and after (incorrect) decimation to 128 samples/second it would appear at 44Hz (256 - 212).
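The folding arithmetic can be written as a small function. This is a minimal sketch in Python, not anything taken from DATS; the name `alias_frequency` is our own and is used purely for illustration.

```python
def alias_frequency(freq_hz, sample_rate):
    """Return the frequency at which a tone appears after sampling.

    The sampled spectrum repeats every `sample_rate` Hz and is mirrored
    about the Nyquist frequency, so the tone is folded back into the
    range 0 to sample_rate / 2.
    """
    folded = freq_hz % sample_rate
    return min(folded, sample_rate - folded)


print(alias_frequency(15, 128))   # 15: below the Nyquist frequency, unchanged
print(alias_frequency(120, 128))  # 8
print(alias_frequency(300, 512))  # 212
print(alias_frequency(300, 128))  # 44
```

Applying the function to the 212Hz alias at 128 samples/second gives the same result as applying it to the original 300Hz tone, which is why a multiple alias can be worked out in one step.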
The answer to the original question of how do we decimate is now, hopefully, obvious. Firstly, we must low pass filter the signal (anti-alias filter) to remove the content above the new Nyquist frequency, and then we can throw away the intermediate points. DATS for Windows module DECIMATE does both operations in one go. If we do this on our original signal and then frequency analyse we get the frequency spectrum as given in Figure 3.
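For readers without DATS to hand, the two steps can be sketched in plain Python. This is an illustration only and makes no claim about how the DECIMATE module works internally. The assumptions are ours: the 120Hz tone is given an amplitude of 0.3, the filter is a 101-tap Hamming-windowed sinc, and the cutoff is set at 80% of the new Nyquist frequency. All function names are hypothetical.

```python
import cmath
import math


def lowpass_taps(cutoff_hz, sample_rate, num_taps=101):
    """Hamming-windowed sinc low-pass FIR taps with unity gain at 0 Hz."""
    centre = (num_taps - 1) / 2
    width = 2 * cutoff_hz / sample_rate  # cutoff as a fraction of Nyquist
    taps = []
    for n in range(num_taps):
        t = n - centre
        ideal = width if t == 0 else math.sin(math.pi * width * t) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [h / total for h in taps]


def fir_filter(x, taps):
    """Convolve x with taps, zero-padding the ends and removing the delay."""
    half = len(taps) // 2
    out = []
    for i in range(len(x)):
        acc = 0.0
        for k, h in enumerate(taps):
            j = i + half - k
            if 0 <= j < len(x):
                acc += h * x[j]
        out.append(acc)
    return out


def decimate(x, factor, sample_rate):
    """Low-pass below the new Nyquist frequency, then keep every factor-th point."""
    cutoff_hz = 0.8 * (sample_rate / factor) / 2  # assumed margin: 80% of new Nyquist
    return fir_filter(x, lowpass_taps(cutoff_hz, sample_rate))[::factor]


def tone_amplitude(x, freq_hz, sample_rate):
    """Amplitude of one frequency component (single-bin DFT)."""
    total = sum(v * cmath.exp(-2j * math.pi * freq_hz * i / sample_rate)
                for i, v in enumerate(x))
    return 2 * abs(total) / len(x)


fs = 512
# 4 seconds of the two-tone signal; the 0.3 amplitude is an assumed value
signal = [math.sin(2 * math.pi * 15 * i / fs) + 0.3 * math.sin(2 * math.pi * 120 * i / fs)
          for i in range(4 * fs)]

naive = signal[::4]               # just throw away three points in four
proper = decimate(signal, 4, fs)  # filter first, then discard
new_fs = fs / 4
middle = slice(128, 384)          # central 2 seconds, away from the filter's end effects

print(tone_amplitude(naive[middle], 8, new_fs))    # about 0.3: the 120Hz tone, aliased
print(tone_amplitude(proper[middle], 8, new_fs))   # should be close to zero
print(tone_amplitude(proper[middle], 15, new_fs))  # should stay close to 1
```

The amplitudes are measured on the middle of the record because a zero-padded filter distorts the first and last few points. A longer filter would give a sharper cutoff at the cost of longer end effects.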
Well actually I am not interested in frequency analysis as I am only doing fatigue analysis. Do I have to bother with all of this anti-aliasing stuff?
Fatigue analysis, which is also available as an option in DATS for Windows, uses the peaks and troughs of signals (the “turning points”). Now when we digitise a signal at a given rate we are also stating that we are not interested in any frequencies above the Nyquist frequency. This may be because we “know” that there are no frequencies above the Nyquist or that there is no physical significance to them. This viewpoint neglects practical effects such as extraneous noise, mains power interference and the like in the transducers and their conditioning. For instance, a common disturbance is the second harmonic of the mains frequency. If you are using a 60Hz supply this is at 120Hz! If we look at our original signal as a time history we can see the 120Hz sinewave combining constructively and destructively with the 15Hz wave. This obviously affects the peaks and troughs. But now suppose we did not look at the original signal but just looked at the incorrectly decimated signal.
The 120Hz interference now looks like a gentle 8Hz component and could appear quite realistic. We could spend a lot of time trying to identify this signal.
Thus even for fatigue calculations, when we are not looking at frequency analysed data, we still have to obey the anti-aliasing rules. In most cases, fatigue data is sampled at 10 times the highest expected frequency so aliasing is minimised. The reason for the oversampling is to allow better definition of the peaks and troughs. But this brings us back full circle, as the reason for decimating was to reduce the oversampling.
This then leads on to the question: do I need to oversample in the first place? It may be surprising to some people, but in principle there is no need to oversample. Provided we have obeyed the Nyquist criterion by using an anti-aliasing filter, then as well as being able to downsample we can also upsample. That is, there is no actual need to oversample by 10 times, because the band-limited signal, including its peaks and troughs, can be reconstructed between the stored samples. This is most useful when archiving, as we can save a significant amount of space.
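To make the upsampling claim concrete, here is a minimal sketch in Python. The numbers are hypothetical: a 15Hz sinewave stored at only 40 samples/second, with its phase chosen so that no stored sample lands on a peak. The interpolator is a Hamming-windowed sinc, which is a truncated approximation to ideal reconstruction, so the recovered values are close to the true ones rather than exact. The function name and the kernel half-width of 32 samples are our own choices.

```python
import math


def sinc_interpolate(samples, position, half_width=32):
    """Estimate the signal at a fractional sample index using a
    Hamming-windowed sinc kernel (an approximation to ideal reconstruction)."""
    centre = math.floor(position)
    total = 0.0
    for n in range(centre - half_width + 1, centre + half_width + 1):
        if 0 <= n < len(samples):
            u = position - n
            sinc = 1.0 if u == 0 else math.sin(math.pi * u) / (math.pi * u)
            window = 0.54 + 0.46 * math.cos(math.pi * u / half_width)
            total += samples[n] * sinc * window
    return total


fs = 40                    # samples/second, so the Nyquist frequency is 20Hz
tone_hz = 15
phase = 2 * math.pi / 16   # chosen so that no stored sample sits on a peak
samples = [math.sin(2 * math.pi * tone_hz * n / fs + phase) for n in range(4 * fs)]

factor = 8                 # upsample to 320 samples/second
# Work on the middle of the record so the full kernel is always available
upsampled = [sinc_interpolate(samples, m / factor)
             for m in range(40 * factor, 120 * factor)]

raw_peak = max(samples[40:120])
rebuilt_peak = max(upsampled)
print(raw_peak)      # about 0.92: the stored samples understate the true peak of 1
print(rebuilt_peak)  # should be close to 1 after upsampling
```

The stored data is eight times smaller than the upsampled version, yet the peak value needed for the turning points can still be recovered from it.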
Chief Signal Processing Analyst (Retired) at Prosig
Dr Colin Mercer was formerly at the Institute of Sound and Vibration Research (ISVR), University of Southampton where he founded the Data Analysis Centre. He then went on to found Prosig in 1977. Colin retired as Chief Signal Processing Analyst at Prosig in December 2016. He is a Chartered Engineer and a Fellow of the British Computer Society.