This paper investigates the relevance of using a large number of Mel Frequency Cepstral Coefficients (MFCCs) as descriptors of acoustic signals, and how that number interacts with the frequency band in which the Mel filters are arranged.
This study forms part of the wider field of automatic recognition of acoustic signals, with a particular focus on non-speech signals. We evaluated numbers of MFCCs ranging from 1 to 50, using the octave-band centre frequencies (31.5 Hz to 16000 Hz) as the MFCC calculation frequencies. The approach was applied to the identification of chainsaw sounds among a variety of signals from the forest environment.
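The parameter sweep described above can be sketched as follows. This is only an illustrative sketch, not the procedure used in the study: the list of ten nominal octave-band centre frequencies, the band-edge convention (fc/√2 to fc·√2), the Hz-to-Mel formula and all function names (`hz_to_mel`, `octave_band_edges`, `experiment_grid`) are assumptions introduced here.

```python
import math

# Nominal octave-band centre frequencies between 31.5 Hz and 16000 Hz.
# Assumption: the study used all ten standard nominal centres.
OCTAVE_CENTRES_HZ = [31.5, 63, 125, 250, 500, 1000, 2000, 4000, 8000, 16000]


def hz_to_mel(f_hz):
    """Convert Hz to Mel with a commonly used formula (assumed variant)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def octave_band_edges(fc_hz):
    """Return the (lower, upper) edges of the octave band centred on fc_hz.

    Assumption: edges at fc / sqrt(2) and fc * sqrt(2).
    """
    return fc_hz / math.sqrt(2.0), fc_hz * math.sqrt(2.0)


def experiment_grid(max_mfcc=50):
    """List every (centre frequency, number of MFCCs) pair in the sweep."""
    return [(fc, n) for fc in OCTAVE_CENTRES_HZ for n in range(1, max_mfcc + 1)]


grid = experiment_grid()
print(len(grid))  # number of (centre frequency, MFCC count) configurations

# Band edges of each octave band, expressed on the Mel scale.
for fc in OCTAVE_CENTRES_HZ:
    low, high = octave_band_edges(fc)
    print(fc, round(hz_to_mel(low), 1), round(hz_to_mel(high), 1))
```

Under these assumptions, each configuration in `grid` would correspond to one feature-extraction and classification run, which makes it possible to compare classification rates across both the number of MFCCs and the centre frequency.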
The results revealed a threshold value for the number of MFCCs (LVMFCC) above which classification rates remain constant. An LVMFCC of 39 was common to all centre frequencies, although the LVMFCC specific to each centre frequency lay between 5 and 39 MFCCs. We observed that the notion of an optimal number of MFCCs can therefore appear subjective. The best classification rate, 98.41%, was obtained with the 16000 Hz centre frequency and corresponds to a number of MFCCs between 5 and 50. These results also reveal the need to restructure the.