Open pipelines for integrated tumor genome profiles reveal differences between pancreatic cancer tumors and cell lines.
We describe open, reproducible pipelines that create an integrated genomic profile of a cancer and use the profile to find mutations associated with disease and potentially useful drugs. These pipelines analyze high-throughput cancer exome and transcriptome sequence data together with public databases to find relevant mutations and drugs. The three pipelines that we have developed are: (1) an exome analysis pipeline, which uses whole or targeted tumor exome sequence data to produce a list of putative variants (no matched normal data are needed); (2) a transcriptome analysis pipeline that processes whole tumor transcriptome sequence (RNA-seq) data to compute gene expression and find potential gene fusions; and (3) an integrated variant analysis pipeline that uses the tumor variants from the exome pipeline and tumor gene expression from the transcriptome pipeline to identify deleterious and druggable mutations in all genes and in highly expressed genes. These pipelines are integrated into the popular Web platform Galaxy at http://usegalaxy.org/cancer to make them accessible and reproducible, thereby providing an approach for doing standardized, distributed analyses in clinical studies. We have used our pipeline to identify similarities and differences between pancreatic adenocarcinoma cancer cell lines and primary tumors.