Simulated annealing for supervised gene selection

Maurizio Filippone, Francesco Masulli*, Stefano Rovetta

*Corresponding author for this work

Research output: Contribution to journal, article, peer-reviewed

9 Scopus citations

Abstract

Genomic data, and more generally biomedical data, are often characterized by high dimensionality. An input selection procedure can attain the two objectives of highlighting the relevant variables (genes) and possibly improving classification results. In this paper, we propose a wrapper approach to gene selection in classification of gene expression data using simulated annealing along with supervised classification. The proposed approach can perform global combinatorial searches through the space of all possible input subsets, can handle cases with numerical, categorical or mixed inputs, and is able to find (sub-)optimal subsets of inputs giving low classification errors. The method has been tested on publicly available bioinformatics data sets using support vector machines and on a mixed type data set using classification trees. We also propose some heuristics able to speed up the convergence. The experimental results highlight the ability of the method to select minimal sets of relevant features.
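
As an illustration only, the wrapper idea described in the abstract can be sketched in Python as follows. This is a minimal sketch, not the authors' implementation: a nearest-centroid classifier stands in for the support vector machines and classification trees used in the paper, and the toy data, the subset-size penalty, the single-flip move, the geometric cooling schedule, and all function names and parameter values are assumptions made for the example.

```python
import math
import random


def make_toy_data(n_samples=40, n_features=8, informative=(0, 1), seed=1):
    """Hypothetical data: two classes, only `informative` columns carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for f in informative:
            row[f] = 4.0 * label + rng.gauss(0.0, 0.5)
        X.append(row)
        y.append(label)
    return X, y


def loo_error(X, y, subset):
    """Leave-one-out error of a nearest-centroid classifier on the chosen columns."""
    if not subset:
        return 1.0  # assumption: an empty subset counts as always wrong
    cols = sorted(subset)
    errors = 0
    for i in range(len(X)):
        sums, counts = {}, {}
        for j in range(len(X)):
            if j == i:
                continue  # hold out sample i
            acc = sums.setdefault(y[j], [0.0] * len(cols))
            for k, c in enumerate(cols):
                acc[k] += X[j][c]
            counts[y[j]] = counts.get(y[j], 0) + 1
        best_label, best_dist = None, float("inf")
        for label, acc in sums.items():
            dist = sum((X[i][c] - acc[k] / counts[label]) ** 2
                       for k, c in enumerate(cols))
            if dist < best_dist:
                best_label, best_dist = label, dist
        errors += best_label != y[i]
    return errors / len(X)


def cost(X, y, subset, penalty=0.01):
    """Wrapper objective: classification error plus an assumed size penalty."""
    return loo_error(X, y, subset) + penalty * len(subset)


def anneal(X, y, n_features, steps=300, t0=0.5, cooling=0.98, seed=0):
    """Simulated annealing over feature subsets with a single-flip move."""
    rng = random.Random(seed)
    current = {f for f in range(n_features) if rng.random() < 0.5}
    current_cost = cost(X, y, current)
    best, best_cost = set(current), current_cost
    t = t0
    for _ in range(steps):
        candidate = set(current)
        candidate ^= {rng.randrange(n_features)}  # add or drop one feature
        cand_cost = cost(X, y, candidate)
        delta = cand_cost - current_cost
        # Metropolis rule: always accept improvements, sometimes accept worse.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, current_cost = candidate, cand_cost
            if current_cost < best_cost:
                best, best_cost = set(current), current_cost
        t *= cooling  # geometric cooling schedule
    return best, best_cost


X, y = make_toy_data()
best, best_cost = anneal(X, y, n_features=8)
print("selected features:", sorted(best), "cost:", best_cost)
```

The size penalty is one simple way to favor small subsets; the classifier inside `cost` can be swapped for any supervised learner without changing the search loop, which is what makes the approach a wrapper.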

Original language: English (US)
Pages (from-to): 1471-1482
Number of pages: 12
Journal: Soft Computing
Volume: 15
Issue number: 8
State: Published - Aug 2011

Keywords

  • Classification trees
  • DNA microarrays
  • Gene selection
  • Input selection
  • Simulated annealing
  • Support vector machines

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Geometry and Topology
