Modeling the way human ears perceive sound has resulted in microphones and software that can better select a person’s voice while filtering out background noise. The inner ear filters sound waves by shape and frequency. It contains groups of auditory nerve cells, each of which responds to a specific interval of frequencies known as a critical band. Multiple overlapping critical bands combine via auditory masking to form an excitation pattern that the brain perceives as loudness. Frequency-weighting techniques, such as those commonly used in broadcasting, do not account for masking effects and perform poorly at high sound levels. They also cannot be used to calculate the excitation patterns needed for digital compression or for volume control in hearing aids. More elaborate models that imitate the ear have recently been developed, but they rely on large banks of critical band-pass filters and are computationally expensive, which makes them unsuitable for real-time applications.
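To make the cost of a full filter bank concrete, the following is a minimal sketch of an excitation-pattern calculation. It assumes the widely used Glasberg and Moore ERB formulas for critical bandwidth and a simplified, level-independent rounded-exponential (roex) filter shape. The function names, the 50 Hz to 15 kHz range, and the 0.1-ERB detector spacing are illustrative assumptions and are not taken from the ASU work.

```python
import math

def erb_bandwidth(f_hz):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore formula)."""
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)

def erb_number(f_hz):
    """Position of a frequency on the ERB-number (critical band) scale."""
    return 21.4 * math.log10(4.37 * f_hz / 1000.0 + 1.0)

def erb_number_to_hz(n):
    """Inverse of erb_number."""
    return (10.0 ** (n / 21.4) - 1.0) * 1000.0 / 4.37

def detector_centres(f_lo=50.0, f_hi=15000.0, step=0.1):
    """Centre frequencies spaced uniformly on the ERB-number scale (assumed grid)."""
    lo, hi = erb_number(f_lo), erb_number(f_hi)
    count = int(math.floor((hi - lo) / step)) + 1
    return [erb_number_to_hz(lo + i * step) for i in range(count)]

def roex_weight(f_hz, fc_hz):
    """Simplified symmetric roex filter: W(g) = (1 + p*g) * exp(-p*g)."""
    p = 4.0 * fc_hz / erb_bandwidth(fc_hz)
    g = abs(f_hz - fc_hz) / fc_hz
    return (1.0 + p * g) * math.exp(-p * g)

def excitation_pattern(components, centres):
    """components: list of (frequency_hz, power). One output value per detector.

    Every detector visits every component, so the cost grows as
    len(centres) * len(components).
    """
    return [sum(power * roex_weight(f, fc) for f, power in components)
            for fc in centres]

centres = detector_centres()
pattern = excitation_pattern([(1000.0, 1.0)], centres)
print(len(centres), "detectors evaluated for a single 1 kHz component")
```

With a dense detector grid and a full short-time spectrum as input, the nested sum above runs once per audio frame, which is where the computational expense described above comes from.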
Researchers at ASU have developed a method for accurately and efficiently estimating excitation patterns and loudness in real time. The method reduces the filter bank size by iteratively selecting only the perceptually relevant components of sound waves, such that the total neural activity is preserved and the general shape of the excitation pattern is captured first. Critical bands are then chosen based on their relevance to that general shape, resulting in more accurate estimates of excitation patterns and loudness with minimal computational overhead. This method can be applied to loudness measurement systems, perceptual loudness-based volume control, loudness equalization systems, and digital audio encoding schemes.
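The two-stage idea described above can be illustrated with a rough sketch. This is not the ASU algorithm, whose details are not given here. It substitutes two simple stand-ins: merging spectral components that fall in the same ERB-number bin while preserving their summed power (used as a proxy for preserving total activity), and adaptive refinement of a coarse detector grid wherever linear interpolation of the level misses by more than a tolerance. All names, the roex filter shape, and the parameters `bin_width`, `coarse_step`, `min_step`, and `tol_db` are hypothetical.

```python
import math

def erb_bandwidth(f_hz):
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)

def erb_number(f_hz):
    return 21.4 * math.log10(4.37 * f_hz / 1000.0 + 1.0)

def erb_number_to_hz(n):
    return (10.0 ** (n / 21.4) - 1.0) * 1000.0 / 4.37

def level_db(components, n):
    """Excitation level in dB at ERB number n (simplified roex filter, assumed)."""
    fc = erb_number_to_hz(n)
    p = 4.0 * fc / erb_bandwidth(fc)
    total = 0.0
    for f_hz, power in components:
        x = p * abs(f_hz - fc) / fc
        total += power * (1.0 + x) * math.exp(-x)
    return 10.0 * math.log10(max(total, 1e-12))  # floor avoids log of zero

def prune_components(components, bin_width=0.25):
    """Stage 1 stand-in: merge components sharing an ERB-number bin.

    Each merged component carries the summed power of its bin and sits at
    the power-weighted mean frequency, so total power is unchanged.
    """
    bins = {}
    for f_hz, power in components:
        key = math.floor(erb_number(f_hz) / bin_width)
        p_sum, pf_sum = bins.get(key, (0.0, 0.0))
        bins[key] = (p_sum + power, pf_sum + power * f_hz)
    return [(pf / p, p) for _, (p, pf) in sorted(bins.items()) if p > 0.0]

def select_detectors(components, n_lo, n_hi,
                     coarse_step=2.0, min_step=0.1, tol_db=1.0):
    """Stage 2 stand-in: start from a coarse grid, then add a midpoint
    detector only where linear interpolation of the level is off by more
    than tol_db. Returns sorted (erb_number, level_db) pairs.
    """
    steps = int(math.ceil((n_hi - n_lo) / coarse_step))
    grid = [n_lo + (n_hi - n_lo) * i / steps for i in range(steps + 1)]
    selected = [(n, level_db(components, n)) for n in grid]
    pending = list(zip(selected, selected[1:]))
    while pending:
        (n0, l0), (n1, l1) = pending.pop()
        if n1 - n0 < 2.0 * min_step:
            continue  # splitting would go below the finest allowed spacing
        mid = 0.5 * (n0 + n1)
        l_mid = level_db(components, mid)
        if abs(l_mid - 0.5 * (l0 + l1)) > tol_db:
            selected.append((mid, l_mid))
            pending.append(((n0, l0), (mid, l_mid)))
            pending.append(((mid, l_mid), (n1, l1)))
    return sorted(selected)

tone_cluster = [(990.0, 1.0), (1000.0, 2.0), (1010.0, 1.0), (3000.0, 0.5)]
pruned = prune_components(tone_cluster)
selected = select_detectors(pruned, erb_number(50.0), erb_number(15000.0))
print(len(tone_cluster), "->", len(pruned), "components;",
      len(selected), "detectors kept")
```

In this sketch, detectors end up concentrated where the pattern changes shape, while flat regions keep only the coarse grid. A real implementation would also need a rule for interpolating the skipped detectors before integrating loudness.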
Potential Applications
- Audio Coding
- Cloud Voice
- Hearing Aids
- Speech Recognition
- Voice Over IP
Benefits and Advantages
- Accurate – Iterative reevaluation of critical bands results in greater accuracy of excitation patterns and loudness measurements.
- Efficient – Calculates final loudness estimate and intermediate quantities with minimal computational overhead.
- Versatile – Can be used for computationally demanding encoding schemes as well as rapid real-time streaming.
- Saves Power – Ideal for lightweight software processes and portable audio devices.
For more information about the inventor(s) and their research, please see