Music Genre Classification using Masked Conditional Neural Networks

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

The ConditionaL Neural Networks (CLNN) and the Masked ConditionaL Neural Networks (MCLNN) exploit the nature of multi-dimensional temporal signals. The CLNN captures the conditional temporal influence between the frames in a window, and the mask in the MCLNN enforces a systematic sparseness that follows a filterbank-like pattern over the network links. The mask induces the network to learn about time-frequency representations in bands, allowing the network to sustain frequency shifts. Additionally, the mask in the MCLNN automates the exploration of a range of feature combinations, which is usually done through an exhaustive manual search. We have evaluated the MCLNN performance using the Ballroom and Homburg datasets of music genres. The MCLNN achieved accuracies that are competitive with state-of-the-art handcrafted approaches and with models based on Convolutional Neural Networks.
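
The filterbank-like sparseness described above can be pictured with a small NumPy sketch. It is an illustration only, not the authors' implementation: the function names, the `bandwidth` and `overlap` parameters, the shift rule between columns, and the truncation of bands at the last feature are all assumptions made for this example.

```python
import numpy as np


def banded_mask(n_features, n_hidden, bandwidth, overlap):
    """Build a binary mask whose columns each enable one band of input features.

    Assumption for this sketch: the band of each successive hidden unit is
    shifted by (bandwidth - overlap) rows, restarting from row 0 once the
    start position passes the last feature. Bands are truncated at the edge.
    """
    step = bandwidth - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than bandwidth")
    mask = np.zeros((n_features, n_hidden), dtype=np.float32)
    for j in range(n_hidden):
        start = (j * step) % n_features
        stop = min(start + bandwidth, n_features)
        mask[start:stop, j] = 1.0
    return mask


def masked_dense(x, weights, bias, mask):
    """Dense layer whose weights are gated element-wise by the binary mask."""
    return np.tanh(x @ (weights * mask) + bias)


if __name__ == "__main__":
    mask = banded_mask(n_features=6, n_hidden=4, bandwidth=3, overlap=1)
    print(mask)

    rng = np.random.default_rng(0)
    weights = rng.normal(size=(6, 4))
    bias = np.zeros(4)
    frame = rng.normal(size=6)  # one hypothetical spectrogram frame
    print(masked_dense(frame, weights, bias, mask))
```

Because masked links contribute nothing to the product, each hidden unit in this sketch responds only to the features inside its band, which is the sense in which the sparseness resembles a filterbank.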
Original language: English
Title of host publication: Neural Information Processing
Subtitle of host publication: 24th International Conference, ICONIP 2017, Guangzhou, China, November 14-18, 2017, Proceedings, Part II
Publisher: Springer
ISBN (Print): 978-3-319-70095-3
Publication status: Published - 18 Feb 2018
Event: International Conference on Neural Information Processing - Guangzhou, China
Duration: 14 Nov 2017 to 18 Nov 2017

Publication series

Name: Lecture Notes in Computer Science

Conference

Conference: International Conference on Neural Information Processing
Abbreviated title: ICONIP
Country/Territory: China
City: Guangzhou
Period: 14/11/17 to 18/11/17

Keywords

  • MCLNN
  • CLNN
  • RBM
