Efficient Deep Learning Model for DNA Forensic Investigations
DOI:
https://doi.org/10.31987/ijict.8.1.246
Keywords:
DNA, STR, Deep learning, Regularization technique
Abstract
Recent advances in genetics have increased the sensitivity and reliability of the forensic sciences, which in turn creates a need for efficient forensic investigation techniques. Deep learning is becoming increasingly important in forensic science because it offers the potential to increase the accuracy and effectiveness of forensic tasks such as paternity testing, missing person identification, and connecting suspects to crime scenes. Despite its strong performance on many problems, a deep learning model may suffer from overfitting, which occurs when the model fits the training dataset too closely and fails to generalize to unseen data. This work presents two deep learning models, a Deep Neural Network (D-DNN) and a Gated Recurrent Neural Network (D-GRU), for human identification using Deoxyribonucleic Acid-Short Tandem Repeat (DNA-STR) data as the input sequence. To obtain the best performance and avoid overfitting, two regularization techniques are introduced: dropout and data augmentation. Two datasets are used, one of size 53530 and another of size 151580; 80% of each is used for training, which corresponds to 42824 in the first dataset and 121264 in the second. The performance of the two models is compared with and without dropout. The results show that D-GRU with dropout performs best, with no visible overfitting: training and testing accuracy equal to 1.0, a loss of 1.5276 × 10⁻⁸, and a validation loss of 6.1561 × 10⁻⁶.
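The split arithmetic and the dropout idea mentioned in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the helper names `split_sizes` and `dropout`, the dropout rate of 0.5, and the toy activation vector are all assumptions made for illustration, and the abstract does not say how the remaining 20% of each dataset is divided between validation and testing.

```python
import random


def split_sizes(n_samples, train_percent=80):
    """Return (train, held_out) counts for a percentage split (hypothetical helper)."""
    n_train = n_samples * train_percent // 100  # integer arithmetic avoids float rounding
    return n_train, n_samples - n_train


def dropout(activations, rate, rng, training=True):
    """Inverted dropout on a list of activations (illustrative only).

    During training each unit is zeroed with probability `rate`, and the
    surviving units are scaled by 1 / (1 - rate) so the expected activation
    stays the same. At inference time the input is returned unchanged.
    """
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


# Dataset sizes taken from the abstract; the 80% figure is also from the abstract.
for n in (53530, 151580):
    n_train, n_held_out = split_sizes(n)
    print(f"size={n}: train={n_train}, held out={n_held_out}")

# Toy hidden-layer output; rate=0.5 is an assumed value, not one reported in the abstract.
rng = random.Random(0)
hidden = [1.0] * 10
print(dropout(hidden, rate=0.5, rng=rng))                  # some units zeroed, others scaled to 2.0
print(dropout(hidden, rate=0.5, rng=rng, training=False))  # unchanged at inference time
```

The first loop reproduces the training-set sizes quoted in the abstract (42824 and 121264). The dropout function shows why the technique acts as a regularizer: because a random subset of units is silenced on every training step, the network cannot rely on any single unit to memorize the training data.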