Interpretation of the Articulation Index

The Articulation Index, or AI, gives a measure of the intelligibility of speech in a given noise environment. The metric was originally developed in 1949 to give a single value that categorised the speech intelligibility of a communication system. The basic interpretation is that the higher the AI value, the easier it is to hear the spoken word. The AI value is expressed either as a factor in the range zero to unity or as a percentage.

The basic method of evaluating AI uses the concept of an ‘idealised speech spectrum’ and the third octave spectrum levels of the background noise. For each band, the difference is the idealised spectrum level minus the background noise level. If the background noise level is above the corresponding idealised spectrum level, so that the difference is negative, then the contribution to AI is zero. If the difference is positive then the band makes a contribution, limited to a maximum of 30dB. Each contribution is multiplied by a weighting factor specific to the particular third octave band. The sum of all the weighted contributions is the AI value. For a single band k this may be expressed as shown below.

Contribution = IdealisedSpectrumdB[k] - NoiseLeveldB[k] 
If (Contribution < 0.0) Contribution = 0.0
If (Contribution > 30.0 ) Contribution = 30.0
Contribution = Contribution * WeightingFactor[k]

The contribution is found for each third octave band in the region 200Hz to 5kHz and summed to give the AI value.
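The per-band rule and the final summation can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name articulation_index and the three-band numbers in the usage lines are made up, and the weights are placeholders rather than the tabulated weighting factors of either AI method.

```python
def articulation_index(ideal_db, noise_db, weights, max_contribution_db=30.0):
    """Sum the clamped, weighted band contributions described in the text."""
    if not (len(ideal_db) == len(noise_db) == len(weights)):
        raise ValueError("all inputs must have one value per band")
    total = 0.0
    for ideal, noise, weight in zip(ideal_db, noise_db, weights):
        contribution = ideal - noise
        # Clamp to the range 0 dB .. 30 dB before weighting.
        contribution = min(max(contribution, 0.0), max_contribution_db)
        total += contribution * weight
    return total


# Made-up three-band example (not data from any standard).
# Placeholder weights chosen so that a 30 dB margin in every band gives 1.0.
ideal = [60.0, 65.0, 55.0]
noise = [70.0, 50.0, 20.0]
weights = [1.0 / 90.0] * 3
print(articulation_index(ideal, noise, weights))
```

In this toy case the first band contributes nothing because the noise is above the idealised level, the second contributes 15dB, and the third is limited to 30dB.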

There is however some confusion as there are two separate approaches for calculating the AI value. One is the strict ANSI S3.5-1969 scheme and the other is generally known as the vehicle AI value. We distinguish between these as $AI_ANSI and $AI_Veh. The fundamental difference in the calculations is that the ANSI scheme takes account of the existing overall noise level to adjust the levels of the idealised spectrum. The idea is that if the background noise level changes then we speak louder or softer as appropriate. That is, the ANSI scheme is strictly concerned with speech intelligibility and is less concerned with the volume or loudness required. The vehicle version of the AI is concerned with assessing sound quality in the interior of a vehicle, so it uses what may be described as a fixed target speech spectrum. In consequence the overall level as well as the spectrum shape affects the metric. By convention the $AI_ANSI value is usually given as an index from zero to unity, whereas $AI_Veh is usually given as a percentage. Figure 1 below shows the ANSI Ideal Speech spectrum, the fixed ‘target’ spectrum for $AI_Veh and a raised version of the ANSI spectrum whose overall level matches that of the vehicle target spectrum.

Figure 1: ANSI Ideal Speech, ‘target’ and raised spectra

The differences between the two principal spectra are obvious. However, comparing the ANSI ‘raised’ spectrum with the vehicle ‘target’ spectrum shows that the vehicle target spectrum is more accommodating at the higher frequencies but less tolerant at the lower frequencies.

The ANSI method uses 65dB as the reference level when adjusting for the overall level of the background noise. If the background noise has an overall level of P dB, then (P - 65) dB is added to each idealised spectrum third octave level. As a result $AI_ANSI is, to a large extent, independent of the overall level. This is not the case for $AI_Veh, which uses a fixed idealised speech spectrum level.
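The level adjustment can be sketched as below. The sketch assumes that the overall level P is the unweighted energy sum of the third octave band levels, which may differ in detail from the standard's own definition. The function names and the band values in the usage lines are hypothetical.

```python
import math

REFERENCE_LEVEL_DB = 65.0  # reference level quoted in the text


def overall_level_db(band_levels_db):
    """Energy sum of band levels, returned in dB (assumed definition of P)."""
    return 10.0 * math.log10(sum(10.0 ** (level / 10.0) for level in band_levels_db))


def shift_ideal_spectrum(ideal_db, noise_db):
    """Add (P - 65) dB to every idealised level, P being the overall noise level."""
    offset = overall_level_db(noise_db) - REFERENCE_LEVEL_DB
    return [level + offset for level in ideal_db]


# Made-up values: two noise bands at 70 dB have an overall level of about 73 dB,
# so every idealised level is raised by about 8 dB.
print(shift_ideal_spectrum([50.0, 55.0], [70.0, 70.0]))
```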

The ANSI scheme also has an absolute ‘maximum tolerable level’ and a ‘threshold level’ for each third octave band. If any adjusted level is above the maximum or below the threshold, then the corresponding limit value is used in the adjusted spectrum. There is also another aspect of the $AI_ANSI calculation for signals with a high overall level. This is an anechoic correction, which reduces the idealised speech spectrum so that the $AI_ANSI value falls with very loud background noise levels. The $AI_Veh calculation does not have these factors.
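The limiting step can be sketched as a per-band clamp. The function name and all limit values below are hypothetical placeholders, not the figures tabulated in the standard. The anechoic correction is left out because its form is not given here.

```python
def limit_adjusted_spectrum(adjusted_db, threshold_db, max_tolerable_db):
    """Replace any adjusted level outside its band limits by the limit value."""
    limited = []
    for level, low, high in zip(adjusted_db, threshold_db, max_tolerable_db):
        limited.append(min(max(level, low), high))
    return limited


# Made-up adjusted levels and limits for three bands.
adjusted = [20.0, 70.0, 130.0]
threshold = [30.0, 30.0, 30.0]
max_tolerable = [120.0, 120.0, 120.0]
print(limit_adjusted_spectrum(adjusted, threshold, max_tolerable))
```

Here the first level is raised to its threshold, the second is unchanged, and the third is cut back to its maximum tolerable level.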

The final difference between the two approaches is that each has its own weighting values, and $AI_Veh uses an extra third octave band at 6.3kHz. Both sets of weighting values are biased towards the 1.6kHz and 2kHz bands, with the $AI_ANSI weights being slightly flatter.

Figure 2 below shows the example third octave background noise spectrum given in the ANSI specification. This has an overall level of 75.2dB and an ANSI Articulation Index of 0.547.

Figure 2: Standard example AI noise spectrum

The $AI_ANSI and $AI_Veh values were calculated for this spectrum and several identically shaped spectra adjusted to different overall levels. The loudness in Sones was also computed. Results are shown in the table below.
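Producing identically shaped spectra at different overall levels amounts to adding the same offset to every band. A minimal sketch follows, assuming the overall level is the unweighted energy sum of the band levels. The function names and the three-band spectrum are made up, and the loudness calculation in Sones is not attempted.

```python
import math


def overall_level_db(band_levels_db):
    """Energy sum of band levels, returned in dB (assumed definition)."""
    return 10.0 * math.log10(sum(10.0 ** (level / 10.0) for level in band_levels_db))


def scale_to_overall_level(band_levels_db, target_overall_db):
    """Shift every band equally so the overall level matches the target."""
    offset = target_overall_db - overall_level_db(band_levels_db)
    return [level + offset for level in band_levels_db]


# Made-up spectrum shifted to an overall level of 75.2 dB.
shape = [60.0, 66.0, 63.0]
scaled = scale_to_overall_level(shape, 75.2)
print(scaled, overall_level_db(scaled))
```

Because every band moves by the same amount, the band-to-band differences, and hence the spectrum shape, are unchanged.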

Overall dB    $AI_ANSI    $AI_Veh (%)    Loudness (Sones)

Note: the $AI_ANSI value is shown as an index from zero to unity, whereas $AI_Veh is shown as a percentage.

From the table it is clear that the ANSI AI is practically independent of the overall level until the anechoic factors take effect at high overall levels. The vehicle AI, however, with its fixed target, does vary with overall level. It has essentially an inverse relationship of some form to loudness.
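This behaviour can be reproduced with a toy calculation. The sketch below is not the ANSI or vehicle procedure: every spectrum and weight is made up, the overall level is assumed to be the unweighted energy sum of the bands, and the threshold, maximum tolerable level and anechoic correction are all omitted. It only contrasts an idealised spectrum that tracks the noise level with one that is fixed.

```python
import math


def overall_level_db(band_levels_db):
    """Energy sum of band levels, returned in dB (assumed definition)."""
    return 10.0 * math.log10(sum(10.0 ** (level / 10.0) for level in band_levels_db))


def weighted_sum(ideal_db, noise_db, weights):
    """Clamp each band difference to 0..30 dB, weight it, and sum."""
    total = 0.0
    for ideal, noise, weight in zip(ideal_db, noise_db, weights):
        total += min(max(ideal - noise, 0.0), 30.0) * weight
    return total


# Hypothetical three-band data; weights are placeholders summing to 1/30.
noise_shape = [58.0, 62.0, 55.0]
ideal_at_65 = [60.0, 66.0, 64.0]   # idealised levels at the 65 dB reference
fixed_target = [60.0, 66.0, 64.0]  # fixed target spectrum
weights = [1.0 / 90.0] * 3

tracking_results = []
fixed_results = []
for shift in (-10.0, 0.0, 10.0):
    noise = [level + shift for level in noise_shape]
    offset = overall_level_db(noise) - 65.0
    tracking_ideal = [level + offset for level in ideal_at_65]
    tracking_results.append(weighted_sum(tracking_ideal, noise, weights))
    fixed_results.append(weighted_sum(fixed_target, noise, weights))

print(tracking_results)  # unchanged by the level shift
print(fixed_results)     # falls as the noise level rises
```

With the level-tracking spectrum the band differences do not change when the whole noise spectrum is shifted, so the index stays the same. With the fixed target the differences shrink as the noise rises, so the index falls.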

Both AI calculations are valid for the purposes for which they were designed. The ANSI version tests speech intelligibility, while the vehicle version tests what may be called normal level speech quality.

Dr Colin Mercer

Chief Signal Processing Analyst at Prosig
Dr Colin Mercer was formerly at the Institute of Sound and Vibration Research (ISVR), University of Southampton where he founded the Data Analysis Centre. He then went on to found Prosig in 1977. Colin retired as Chief Signal Processing Analyst at Prosig in December 2016. He is a Chartered Engineer and a Fellow of the British Computer Society.

One thought on “Interpretation of the Articulation Index”

  1. Marano

    I have a question and would like to know if you could help me.
    There were two options in the OCTAVETBAI box (later version of Prosig) for the Integration Method parameter, where you could choose between Exponential and Block. I would like to know the difference between these two options, because in the new version of Prosig that I have there is no choice (Block is the default). Would it be possible to answer me by e-mail:
    Regards and thanks.
