Genomic-scale analysis of DNA words of arbitrary length by parallel computation.

Yang, X. Y.; Ripoll, A.; Arnau Llombart, Vicente; Marín Lozano, Ignacio; Luque, E.
This document is an article, created in 2006.

In the post-genomic era, one of the main tasks is deciphering the meaning of the DNA sequences of complex organisms. To do so, there is a clear need for biocomputer tools able to extract and order the information in long DNA molecules, such as whole chromosomes or even complete genomes. However, most genomic analyses have concentrated on detecting and counting short words of between 1 and 10 nucleotides. In this paper, we describe parallel algorithms of different complexities that exhaustively determine all words of size k, with k arbitrarily large, in a source DNA sequence. The results show that our algorithms achieve a high degree of scalability, allowing the detection of DNA words of 64 nucleotides in only 800 seconds.
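
As a rough illustration of the task described above, the following Python sketch counts every overlapping word of size k in a sequence and shows one way the work could be divided into independent pieces. It is not the algorithm from the paper, whose details are not given here. The function names (`count_words`, `split_with_overlap`, `count_words_chunked`) and the chunking scheme are assumptions made for this example. Each chunk is extended by k - 1 nucleotides so that words spanning a chunk boundary are counted exactly once.

```python
from collections import Counter


def count_words(seq, k):
    """Count every overlapping word of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def split_with_overlap(seq, k, n_chunks):
    """Split seq into chunks whose word start positions partition the sequence.

    Hypothetical helper: each chunk carries k - 1 extra nucleotides at its
    end, so no word is lost or double-counted at a boundary.
    """
    n_starts = len(seq) - k + 1          # number of word start positions
    if n_starts <= 0 or n_chunks <= 0:
        return []
    step = -(-n_starts // n_chunks)      # ceiling division
    chunks = []
    for start in range(0, n_starts, step):
        end = min(start + step, n_starts)
        chunks.append(seq[start:end + k - 1])
    return chunks


def count_words_chunked(seq, k, n_chunks=4):
    """Count words chunk by chunk and merge the partial counts.

    The loop below runs serially; because the chunks are independent,
    each count_words call could in principle be given to a separate worker.
    """
    total = Counter()
    for chunk in split_with_overlap(seq, k, n_chunks):
        total.update(count_words(chunk, k))
    return total


if __name__ == "__main__":
    sequence = "ACGTACGTAC"
    print(count_words_chunked(sequence, 3, n_chunks=3))
```

Because the partial counts are merged by simple addition, the result does not depend on the number of chunks. That independence is the property a parallel version would rely on.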
