The main goal of this work is to study and compare machine learning algorithms for predicting the development of type 2 diabetes mellitus.
Four classification algorithms are considered, and their accuracy in predicting the incidence of type 2 diabetes mellitus seven years in advance is compared. Specifically, the techniques studied are Decision Tree, Random Forest, kNN (k-Nearest Neighbors), and Neural Networks.
The study involves not only a comparison of these techniques but also the tuning of the meta-parameters of each algorithm.
The algorithms have been implemented in the R language.
The database used comes from the nationwide cohort study di@bet.es.
The conclusions will include the accuracy of each algorithm and, therefore, the best technique for this problem. The best meta-parameters for each algorithm will also be provided.