Psychoacoustical models have been used extensively within audio coding applications over the past decades. Recently, parametric coding techniques have been applied to general audio, and this has created the need for a psychoacoustical model that is specifically suited for sinusoidal modelling of audio signals. In this paper, we present a new perceptual model that predicts masked thresholds for sinusoidal distortions. The model relies on signal detection theory and incorporates more recent insights about spectral and temporal integration in auditory masking.
As a consequence, the model is able to predict the distortion detectability. In fact, the distortion detectability defines a (perceptually relevant) norm on the underlying signal space, which is beneficial for optimisation algorithms such as rate-distortion optimisation or linear predictive coding. We evaluate the merits of the model by combining it with a sinusoidal extraction method and compare the results with those obtained with the ISO MPEG-1 Layer I-II recommended model. Listening tests show a clear preference for the new model.
More specifically, the model presented here leads to a reduction of more than 20% in the number of sinusoids needed to represent signals at a given quality level.