# k-Disjunctive Normal Form functions

The k-Disjunctive Normal Form (k-DNF) functions are synthetic boolean functions that present different challenges for evolutionary learning in terms of number of attributes, size of the training set, overlapping and noise, among others.

In a space of d variables, the k-DNF functions have a following structure:

T_1 vv T_2 vv ... vv T_r

where r is the number of disjunctive terms and each term Tx represents the conjunction of k out of d variables (x1, x2, ..., xd). In boolean k-DNF functions a value of 0 or 1 is associated to each variable xi. The following is an example of k-DNF function in a space of 10 variables (d), with 3 terms/rules (r) using 4 variables/attributes (k) each.

(x_1 ^^ x_5 ^^ not x_7 ^^ x_8) vv (x_1 ^^ not x_3 ^^ not x_6 ^^ x_7) vv (x_1 ^^ x_2 ^^ not x_6 ^^ x_8)

The difficulty of learning a k-DNF function depends on the class imbalance and rule overlapping. The probability of having a negative example follows the following distribution: ## k-DNF Generator

You can download the C++ source code of a k-DNF problem generator. The generator's input are the d, k, r parameters, the amount of noise and a random seed (to be able to reproduce the same sets). Instead of a complete k-DNF problem, the generator can produce just a random sample, which is useful for problems with a large amount of attributes and instances.

## Datasets

Two different k-DNF datasets have been used in our experiments:

The set includes 990 problems, 10 for each configuration with the parameters: d = , k = [2,3,4,5,6,7,8,9,10], r = [1,5,10,15,20,25,30,35,40,45,50]. These datasets have been used in Franco2010a.

The set includes 1800 problems, 5 for each configuration with the parameters: d = [10,20], k = [2,3,4,5,6,7,8,9,10], r = [1,5,10,20,40].
Each problem contains a certain ratio n of noise, where n = [0, 0.01%, 0.05%, 0.10%].

For any comments or questions about k-DNF problems or datasets in this website please send an e-mail to jaume.bacardit.

## Publications

• DOI slides BibTeX
M.A. Franco, N. Krasnogor, J. Bacardit
Analysing bioHEL using challenging boolean functions
in Proceedings of the 12th annual conference comp on Genetic and evolutionary computation - GECCO '10, p.1855, Portland, Oregon, USA, 2010
@INPROCEEDINGS{Franco2010a,
title = {Analysing bioHEL using challenging boolean functions},
author = {Franco, Maria A. and Krasnogor, Natalio and Bacardit, Jaume},
year = 2010,
doi = {10.1145/1830761.1830817},
booktitle = {Proceedings of the 12th annual conference comp on Genetic and evolutionary computation - GECCO '10},
pages = {1855},
address = {Portland, Oregon, USA}
}