A Computational Pipeline for Comparative Study of Guide RNAs for CRISPR-Cas Genome Editing Systems

CRISPR-Cas (Clustered Regularly Interspaced Short Palindromic Repeats and CRISPR-associated proteins) systems have recently been developed into genome editing tools. Given their strength in multiplex targeting and their easy programmability, CRISPR-Cas systems are already widely applied in academia and industry. Although existing detection approaches can identify CRISPR arrays based on prior knowledge of sequence features, the predicted arrays may lack the structural features that are essential for RNA processing in the downstream immunity mechanism. Moreover, for efficient design of single guide RNAs, an unmet need may remain to comprehensively characterize the structural features of Cas nuclease/guide RNA complexes. Here, we discuss a two-step seed-and-extend approach, run in a distributed computing environment, for detecting CRISPR arrays. The proposed approach is expected to provide a useful framework for large-scale comparative analysis of CRISPR-Cas structures across species.
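
To make the seed-and-extend idea concrete, the following is a minimal single-process sketch in Python. It is not the pipeline described here: the abstract does not specify how seeds are chosen or how extension is scored, so the function names (find_seeds, extend_seed, detect_arrays), the exact-match criterion, and all parameter defaults (k, min_period, max_period, min_copies) are illustrative assumptions. In a distributed setting, one could presumably run the same logic independently on each genome or contig.

```python
from collections import defaultdict


def find_seeds(seq, k=8, min_period=40, max_period=120, min_copies=3):
    """Seed step (assumed criterion): find k-mers that recur at a regular
    start-to-start spacing, as the repeats of a CRISPR array would."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)

    seeds = []
    for pos in positions.values():
        chain = [pos[0]]
        for p in pos[1:]:
            if min_period <= p - chain[-1] <= max_period:
                chain.append(p)
            else:
                if len(chain) >= min_copies:
                    seeds.append(chain)
                chain = [p]
        if len(chain) >= min_copies:
            seeds.append(chain)
    return seeds


def extend_seed(seq, starts, k):
    """Extend step (assumed criterion): grow the seed left and right while
    every copy shows the same base, without letting copies overlap."""
    period = min(b - a for a, b in zip(starts, starts[1:]))
    left, right = 0, k
    while (starts[0] - left - 1 >= 0 and left + right < period
           and len({seq[s - left - 1] for s in starts}) == 1):
        left += 1
    while (starts[-1] + right < len(seq) and left + right < period
           and len({seq[s + right] for s in starts}) == 1):
        right += 1
    return tuple(s - left for s in starts), left + right


def detect_arrays(seq, k=8, min_period=40, max_period=120, min_copies=3):
    """Return sorted (repeat_start_positions, repeat_length) candidates.
    Seeds from the same repeat extend to the same result and are merged."""
    arrays = set()
    for chain in find_seeds(seq, k, min_period, max_period, min_copies):
        arrays.add(extend_seed(seq, chain, k))
    return sorted(arrays)
```

Exact matching keeps the sketch short; real repeats can carry point mutations, so a practical version would likely tolerate mismatches during extension and would add a separate check for the structural features discussed above.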