Microarray datasets

The following microarray datasets were used for rule-based sample classification with the BioHEL evolutionary learning system:

Diffuse large B-cell lymphoma dataset [1.5 MB] - Shipp et al. 2002, 7129 genes, 77 samples
Prostate cancer dataset [1.5 MB] - Singh et al. 2002, 2135 genes, 102 samples
Breast cancer dataset [13 MB] - Naderi et al. 2006, 47293 genes, 128 samples

We applied three feature selection algorithms (CFS, RFS, PLSS) to these datasets under two cross-validation schemes: 10-fold and leave-one-out (LOO). This produced the following final datasets, stored in the commonly used Weka ARFF format:

10-fold [587 kB], LOO [4.5 MB] - diffuse large B-cell lymphoma dataset
10-fold [633 kB], LOO [6.3 MB] - prostate cancer dataset
10-fold [300 kB], LOO [3.6 MB] - breast cancer dataset
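
For readers unfamiliar with the two cross-validation schemes, the sketch below shows how 10-fold and leave-one-out partitions differ for the sample counts listed above. It is only an illustration: the function names k_fold_splits and loo_splits are hypothetical, the folds are contiguous and unshuffled, and this is not necessarily how the partitions in the downloadable files were generated.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    indices = list(range(n_samples))
    # Spread the remainder so that fold sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def loo_splits(n_samples):
    """Leave-one-out is k-fold with k equal to the number of samples."""
    return k_fold_splits(n_samples, n_samples)


# Sample counts taken from the dataset list above.
sample_counts = {"lymphoma": 77, "prostate": 102, "breast": 128}
for name, n in sample_counts.items():
    n_ten_fold = len(list(k_fold_splits(n, 10)))
    n_loo = len(list(loo_splits(n)))
    print(name, n_ten_fold, n_loo)
```

In such a setup, feature selection would typically be run on the training indices of each split only, so that the held-out samples do not influence which genes are kept. Leave-one-out yields one split per sample, whereas 10-fold always yields ten.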