The Laboratory of Computational Biology at VIB KU Leuven recently published an article in Nature on the identification of DNA fragments that are active in a certain type of brain cell. The VSC infrastructure was a crucial factor in this excellent research.
The fly is one of the key model organisms in biological research, lying at a sweet spot between complexity and understandability. Indeed, a fly exhibits a wide range of complex behavior and is even able to learn. To find out how the different cell types of the brain are encoded in the genome, the Laboratory of Computational Biology at VIB KU Leuven, led by Stein Aerts, set out to measure genome accessibility in the whole fly brain. The experiments were carried out on 240,000 cells across brain development and identified 96,000 DNA regions involved in creating the brain.
The next challenges were mostly computational. The resulting matrix of 23 billion elements (240,000 cells by 96,000 regions) is very sparse, because DNA is present in only two copies per cell, so each measurement is 0, 1 or 2. Even so, the matrix occupies a large amount of memory. Therefore, the computational power of the compute nodes in the VSC Genius cluster was used to apply text-mining techniques that identify patterns and cluster the cells into different cell types.
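To make the scale concrete, the matrix size and a rough memory comparison can be worked out in a few lines of Python. This is only a back-of-the-envelope sketch: the cell and region counts come from the text, while the share of nonzero entries (1%), the byte sizes and the compressed sparse row (CSR) layout are illustrative assumptions, not figures or methods reported by the lab.

```python
# Dimensions taken from the article.
N_CELLS = 240_000
N_REGIONS = 96_000

# ASSUMPTION: share of nonzero entries, chosen for illustration only.
ASSUMED_NONZERO_PERCENT = 1


def dense_bytes(n_rows, n_cols, bytes_per_value=1):
    """Memory needed when every element is stored, zeros included."""
    return n_rows * n_cols * bytes_per_value


def csr_bytes(n_rows, nnz, bytes_per_value=1, bytes_per_index=4):
    """Memory for a CSR layout: one value and one column index per
    nonzero entry, plus one 8-byte row pointer per row (and one extra)."""
    return nnz * (bytes_per_value + bytes_per_index) + (n_rows + 1) * 8


n_elements = N_CELLS * N_REGIONS
nnz = n_elements * ASSUMED_NONZERO_PERCENT // 100

dense = dense_bytes(N_CELLS, N_REGIONS)
sparse = csr_bytes(N_CELLS, nnz)

print(f"elements:       {n_elements:,}")
print(f"dense storage:  {dense / 1e9:.2f} GB at 1 byte per value")
print(f"sparse storage: {sparse / 1e9:.2f} GB at the assumed density")
```

Values of 0, 1 or 2 fit in a single byte, which is why one byte per value is used here. With wider numeric types, which many analysis tools default to, the dense figure grows proportionally.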
However, a major problem remained: the so-called futility theorem, which states that the vast majority of predicted transcription factor binding sites in the genome are non-functional. To circumvent this issue, deep learning was applied to the sequences. Convolutional neural networks were trained to assign DNA fragments to cell types using only sequence information, pinpointing the exact nucleotides that are important for DNA accessibility and activity. When fragments selected by this model, called DeepFlyBrain, were cloned back into the fly, the targeted cell types indeed lit up. And when the important nucleotides were mutated, the expression was lost. Thus, the futility theorem was overcome thanks to the power of the neural networks and the VSC Genius GPU nodes used to train them.
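The basic operation behind a convolutional network on DNA can be illustrated with a toy example: the sequence is one-hot encoded and a small filter slides along it, scoring each position. The sketch below is not DeepFlyBrain and makes no claim about its architecture. The sequence, the "TAAT" motif and the hand-written filter are hypothetical, whereas a real network learns many filters from data and combines them in deeper layers.

```python
BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as a list of 4-element rows (A, C, G, T)."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]


def conv1d_scan(encoded, kernel):
    """Slide a (width x 4) kernel along the sequence, one score per position."""
    width = len(kernel)
    scores = []
    for start in range(len(encoded) - width + 1):
        score = 0.0
        for offset in range(width):
            for channel in range(4):
                score += encoded[start + offset][channel] * kernel[offset][channel]
        scores.append(score)
    return scores


# HYPOTHETICAL filter that rewards the motif "TAAT"; a trained network
# would learn its filter weights instead of having them written by hand.
motif_kernel = one_hot("TAAT")

sequence = "GGCTAATCG"   # made-up fragment containing the motif
mutated = "GGCTAGTCG"    # same fragment with one motif nucleotide changed

scores = conv1d_scan(one_hot(sequence), motif_kernel)
mutated_scores = conv1d_scan(one_hot(mutated), motif_kernel)

best = max(range(len(scores)), key=scores.__getitem__)
print("best-scoring position:", best)
print("top score, original:", max(scores))
print("top score, mutated: ", max(mutated_scores))
```

Changing a single nucleotide inside the motif lowers the filter's top score. This is a loose, simplified analogue of the mutation experiment described above, where altering the important nucleotides abolished expression.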
Have a look at the full article, published in the scientific journal Nature, via this link.