Hadoop neural network for parallel and distributed feature selection

Research output: Contribution to journal (Article)

Publication details

Journal: Neural Networks
Date: Accepted/In press - 2015
Date: E-pub ahead of print - 5 Sep 2015
Date: Published (current) - 1 Jun 2016
Issue number: Special issue
Volume: 78
Pages (from-to): 24–35
Early online date: 5 Sep 2015
Original language: English

Abstract

In this paper, we introduce a theoretical basis for a Hadoop-based neural network for parallel and distributed feature selection in Big Data sets. It is underpinned by a binary associative-memory neural network, which is highly amenable to parallel and distributed processing and fits the Hadoop paradigm. Many feature selectors are described in the literature, each with its own strengths and weaknesses. We present the implementation details of five feature selection algorithms constructed using our artificial neural network framework embedded in Hadoop YARN. Hadoop allows parallel and distributed processing: each feature selector can be divided into subtasks that are processed in parallel, and multiple feature selectors can also be run simultaneously so that they can be compared. We identify commonalities among the five feature selectors. All can be processed in the framework using a single representation, and the overall processing can be greatly reduced by processing the common aspects of the feature selectors only once and propagating the results to all five feature selectors as necessary. This allows the best feature selector, and the features to select, to be identified for large and high-dimensional data sets by exploiting the efficiency and flexibility of embedding the binary associative-memory neural network in Hadoop.
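
The idea of computing the common aspects once and sharing them across selectors can be illustrated with a minimal Python sketch. It is not the authors' implementation: plain in-memory counters stand in for both the binary associative-memory network and Hadoop, and the two scores used (mutual information and chi-squared) are chosen only for illustration, since the abstract does not name the five selectors. All function names are hypothetical. Each data partition is counted independently (the parallelisable subtask), the partial counts are merged, and both selectors then read the same merged counts.

```python
from collections import Counter
from functools import reduce
from math import log2


def count_partition(rows):
    """Map-style step: count (feature index, value, label) co-occurrences
    in one partition. Each row is a tuple of feature values followed by a label."""
    counts = Counter()
    for *features, label in rows:
        for j, value in enumerate(features):
            counts[(j, value, label)] += 1
    return counts


def merge_counts(a, b):
    """Reduce-style step: merge the counts from two partitions."""
    return a + b


def contingency(counts, j):
    """Pull the value-by-label table for feature j out of the shared counts."""
    return {(v, c): n for (f, v, c), n in counts.items() if f == j}


def marginals(table):
    """Row totals, column totals, and grand total of a contingency table."""
    rows, cols = Counter(), Counter()
    for (v, c), n in table.items():
        rows[v] += n
        cols[c] += n
    return rows, cols, sum(table.values())


def mutual_information(table):
    """Mutual information (in bits) between a feature and the label."""
    rows, cols, total = marginals(table)
    return sum(
        (n / total) * log2(n * total / (rows[v] * cols[c]))
        for (v, c), n in table.items()
        if n
    )


def chi_squared(table):
    """Pearson chi-squared statistic between a feature and the label."""
    rows, cols, total = marginals(table)
    score = 0.0
    for v in rows:
        for c in cols:
            expected = rows[v] * cols[c] / total
            observed = table.get((v, c), 0)
            score += (observed - expected) ** 2 / expected
    return score


def score_features(partitions, n_features):
    """Count once across all partitions, then score every feature
    with both selectors from the same shared counts."""
    counts = reduce(merge_counts, map(count_partition, partitions), Counter())
    scores = {}
    for j in range(n_features):
        table = contingency(counts, j)
        scores[j] = {
            "mutual_information": mutual_information(table),
            "chi_squared": chi_squared(table),
        }
    return scores
```

Calling `score_features(partitions, n_features)` with a list of row lists returns one score per selector for each feature, so the selectors can be compared side by side without recounting the data.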

Research areas

  • Hadoop; MapReduce; Distributed; Parallel; Feature selection; Binary neural network
