A Deep Learning Network Approach to ab initio Protein Secondary Structure Prediction

Published on Feb. 17, 2014

The need for reliable ab initio protein secondary structure prediction is growing along with the demand for accurate tertiary structure prediction. Although recent methods have slightly improved on earlier ones, they rarely surpass 80% accuracy. Developing new tools and methods to improve secondary structure prediction is therefore essential to improving tertiary structure prediction in proteins.

Here we present DNSS, a secondary structure predictor that combines the position-specific scoring matrix generated by PSI-BLAST with deep learning network architectures. Graphics processing units (GPUs) and CUDA software are used to optimize the deep network architecture and to train the deep networks efficiently. Numerous input profiles were tested, varying both the range of consecutive residues and the information included for each residue, and the architecture of the deep learning network itself was also varied. Once optimal parameters for the training process were determined, a workflow comprising three separately trained deep networks was constructed to make refined predictions. This deep learning network approach was used to predict the secondary structure of a fully independent test data set of 198 proteins, achieving a Q₃ accuracy of 80.7%.
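
To make two of the ideas above concrete, the following minimal Python sketch shows how a window of consecutive residues can be turned into a per-residue input vector, and how Q₃ accuracy (the percentage of residues assigned the correct one of three states: helix, strand, or coil) is computed. It is an illustration only, not the DNSS implementation: the function names `window_features` and `q3_accuracy`, the zero padding at the sequence ends, and the toy two-column profile are all assumptions introduced here.

```python
def window_features(profile, half_width, pad_value=0.0):
    """Build one flattened feature vector per residue from a sliding window.

    profile    -- list of per-residue rows (e.g. PSSM scores); toy data here
    half_width -- residues taken on each side of the central residue
    pad_value  -- filler for positions past the sequence ends (an assumption;
                  the abstract does not say how DNSS handles the termini)
    """
    if not profile:
        return []
    pad_row = [pad_value] * len(profile[0])
    padded = [pad_row] * half_width + list(profile) + [pad_row] * half_width
    features = []
    for i in range(len(profile)):
        # Window of 2 * half_width + 1 rows centred on residue i.
        window = padded[i : i + 2 * half_width + 1]
        features.append([value for row in window for value in row])
    return features


def q3_accuracy(predicted, observed):
    """Return the percentage of residues whose predicted state (H, E or C)
    matches the observed state."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("cannot score an empty sequence")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


if __name__ == "__main__":
    toy_profile = [[1, 2], [3, 4], [5, 6]]  # 3 residues, 2 scores each
    print(window_features(toy_profile, half_width=1))
    print(q3_accuracy("HHEC", "HHCC"))
```

Widening `half_width` gives each prediction more sequence context at the cost of a larger input layer, which is the kind of trade-off the input-profile tests described above would explore.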