Above-ground biomass estimation in a Mediterranean sparse coppice oak forest using Sentinel-2 data

Fardin Moradi; Seyed Mohamad Moein Sadeghi; Hadi Beygi Heidarlou; Azade Deljouei; Erfan Boshkar; Stelian Alexandru Borz

doi:10.15287/afr.2022.2390

Research article

BAG-OF-AUDIO-VISUAL WORDS BASED APPROACH FOR SOUND EVENT AND ACOUSTIC SCENE RECOGNITION TASKS FOR INDUSTRIAL MACHINERIES

S Chandrakala, Sreenithi B, G Revathy & R Sathya

Online First: November 21, 2022

Abstract

Sound Event Recognition (SER) and Acoustic Scene Recognition (ASR) tasks are gaining importance due to their applications in personal and public security. Factors that complicate SER and ASR include the quality of the audio recording devices, the number of audio sources in a given environment, and overlapping sound and scene classes. Hence, there is a need to extract different kinds of information from audio in order to learn a more robust representation of sound events and acoustic scenes. This can be achieved by representing sound in multiple forms so as to exploit the complementary information present in sound data. In this paper, we propose a Bag-of-Audio-Visual Words (BoAVW) approach for the sound event and acoustic scene recognition tasks. The proposed approach constructs Bag-of-Audio words from Mel-Frequency Cepstral Coefficient (MFCC) features and Bag-of-Visual words from Scale-Invariant Feature Transform (SIFT), Speeded Up Robust Features (SURF), and moments-based visual features extracted from auditory images. A Support Vector Machine (SVM) classifier is used to classify these representations into sound events and acoustic scenes. The proposed BoAVW approach is evaluated on the benchmark datasets ESC-50 (sound events), DCASE-2016 (sound events), and DCASE-2017 (acoustic scenes), where it achieves accuracies of 66.6%, 93.2%, and 82.58%, respectively, showing improved results compared with a few recent state-of-the-art methods.
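The bag-of-words step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the random arrays stand in for MFCC frames, the codebook size of 8 is arbitrary, and the helper names `build_codebook` and `encode` are hypothetical. A plain k-means codebook is assumed here; the abstract does not state which clustering method is used.

```python
import numpy as np

def build_codebook(frames, k, n_iter=20, seed=0):
    """Plain k-means (Lloyd's algorithm) over frame-level feature vectors."""
    rng = np.random.default_rng(seed)
    # Initialise the codewords from k distinct frames.
    centers = frames[rng.choice(len(frames), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Distance of every frame to every codeword: shape (n_frames, k).
        dist = np.linalg.norm(frames[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = frames[labels == j]
            if len(members):  # leave an empty cluster's codeword unchanged
                centers[j] = members.mean(axis=0)
    return centers

def encode(frames, centers):
    """Normalised histogram of nearest-codeword assignments for one clip."""
    dist = np.linalg.norm(frames[:, None, :] - centers[None, :, :], axis=2)
    counts = np.bincount(dist.argmin(axis=1), minlength=len(centers))
    return counts / counts.sum()

# Synthetic stand-ins for 13-dimensional MFCC frames of two clips (assumption).
rng = np.random.default_rng(1)
clip_a = rng.normal(0.0, 1.0, size=(120, 13))
clip_b = rng.normal(3.0, 1.0, size=(80, 13))

codebook = build_codebook(np.vstack([clip_a, clip_b]), k=8)
hist_a = encode(clip_a, codebook)  # fixed-length vector, whatever the clip length
hist_b = encode(clip_b, codebook)
```

Each clip, regardless of its number of frames, ends up as a vector of length 8, which is the kind of fixed-length representation a classifier such as an SVM can consume. Local visual descriptors extracted from an auditory image could be encoded the same way; how the audio and visual histograms are combined is not specified in this abstract.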

Keywords

Sound Event Recognition (SER), Acoustic Scene Recognition (ASR), Bag-of-Audio-Visual Words (BoAVW), Mel-Frequency Cepstral Coefficients (MFCCs), Auditory image, Spectrogram, Scale-Invariant Feature Transform (SIFT), Speeded Up Robust Features (SURF).

Ann. For. Res. Vol 65, No 1 (2022) Pages 4225-4241
