BiGen: Bioinformatics and Applied Genomics Unit

The Bioinformatics and Applied Genomics (BiGen) Unit was founded at HPI in 2019. Its primary mission is to develop molecular, genomics and bioinformatics applications for sequencing technologies of all generations, together with software and pipelines for the analysis of large-scale datasets (big data). Serving as a Core Facility, BiGen supports all research groups within HPI with sequencing services, including the design and execution of wet-lab assay preparation, sequencing analysis and downstream bioinformatics. The BiGen Unit also provides services to external users in both academia and the pharmaceutical industry.
TECHNOLOGY
NGS Library preparations
BiGen runs both ready-to-use and custom NGS assays:
• Total DNA and RNA extraction and purification for NGS applications
• Ribodepletion / poly-A enrichment / globin removal / DNase treatment
• Whole Genome Sequencing (WGS) library preparation for viruses, bacteria, fungi, parasites, animals
• Whole Exome Sequencing (WES) library preparation, both target-enrichment- and amplicon-based (human)
• Custom Target Enrichment by hybridisation for regions of interest, from any WGS library preparation
• Targeted gene panels (Comprehensive Cancer, Pharmacogenomics, Inherited disease etc) library preparation
• Whole transcriptome, 3’ QuantSeq and single cell RNAseq library preparation
• Whole genome bisulfite sequencing (WGBS) library preparation
• Assay for Transposase-Accessible Chromatin (ATAC-seq) and single cell ATAC-seq library preparation
• Chromatin immunoprecipitation sequencing (ChIP-seq) library preparation
Sequencing technologies
• 1st generation – Sanger sequencing on a compact SeqStudio genetic analyser
• 2nd generation – NGS on Ion Torrent (Thermo) and Illumina Platforms with single-end or paired-end short read sequencing
• 3rd generation – NGS on MinION (Oxford Nanopore Technologies) with single-end 1-D or 2-D ultra-long reads on both standard and low-throughput flowcells
Bioinformatics pipelines in place
The BiGen group constantly updates the available bioinformatics options, with extensive benchmarking and performance assessments of new tools and utilities. Every analysis step is cross-checked by the use of appropriate internal controls ensuring the validity of the results.
Key available bioinformatics pipelines are listed below:
• Full genome reconstruction, de novo assembly and functional annotation (WGS)
• Whole-genome Single Nucleotide Polymorphism (SNP) calling and functional annotation
• Whole-genome large-scale structural variant calling and functional annotation
• Genome Wide Association Studies (GWAS)
• Differential gene expression analysis, Gene Ontologies enrichment analysis and clustering (RNAseq)
• Metagenomics and Metaviromics, microbiome enrichment analysis
• Genome-wide DNA methylation pattern recognition and profile comparison
• ATAC-seq peak calling, peak differential analysis and annotation, motif enrichment, footprinting, and nucleosome position analysis
• Genome-wide mapping of DNA double-strand breaks by Breaks Labeling In Situ and Sequencing (BLISS)
• Epidemiology: Phylogenetics, phylogenomics, phylodynamics, phylogeography
BiGen is equipped with state-of-the-art sequencing platforms with complementary characteristics, providing maximum flexibility in designing NGS experiments of any technology and throughput.
The Illumina NextSeq 2000 Sequencing System uses an integrated cartridge that includes reagents, fluidics, and the waste holder, simplifying library loading and instrument use. It supports mid-to-high-throughput sequencing applications, generating from 100 million up to 2.4 billion reads (30-360 Gb), while offering flexibility in multiplexing and compatibility with after-market reagents. It is suitable for a broad range of methods such as exome sequencing, target enrichment, single-cell profiling, transcriptome sequencing, epigenomics, etc.
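As a rough planning aid, the run outputs quoted above (30 to 360 Gb) can be translated into the number of samples that fit on one run at a chosen sequencing depth. The sketch below is illustrative only: the function names, the genome sizes (about 3.1 Gb for human, 5 Mb for a bacterium) and the target coverages are assumptions rather than BiGen specifications, and the calculation ignores duplicates, index imbalance and off-target reads.

```python
# Illustrative multiplexing estimate.
# All names and example values below are assumptions, not facility specifications.

GB = 1_000_000_000                 # bases per gigabase
HUMAN_GENOME_BP = 3_100_000_000    # approximate human genome size (assumption)
BACTERIAL_GENOME_BP = 5_000_000    # typical bacterial genome size (assumption)


def samples_per_run(run_output_bases: int, genome_size_bp: int, target_coverage: int) -> int:
    """Number of samples reaching target_coverage if the run output is split evenly."""
    if genome_size_bp <= 0 or target_coverage <= 0:
        raise ValueError("genome size and target coverage must be positive")
    bases_per_sample = genome_size_bp * target_coverage
    return run_output_bases // bases_per_sample


def mean_coverage(run_output_bases: int, genome_size_bp: int, n_samples: int) -> float:
    """Mean depth per sample when n_samples share the run output evenly."""
    if genome_size_bp <= 0 or n_samples <= 0:
        raise ValueError("genome size and sample count must be positive")
    return run_output_bases / (genome_size_bp * n_samples)


if __name__ == "__main__":
    # Largest quoted output (360 Gb), human WGS at 30x
    print(samples_per_run(360 * GB, HUMAN_GENOME_BP, 30))
    # Smallest quoted output (30 Gb), bacterial WGS at 100x
    print(samples_per_run(30 * GB, BACTERIAL_GENOME_BP, 100))
    # Depth obtained if three human genomes share a 360 Gb run
    print(round(mean_coverage(360 * GB, HUMAN_GENOME_BP, 3), 1))
```

Under these assumptions, a 360 Gb run holds about three human genomes at 30x, and a 30 Gb run holds about 60 bacterial genomes at 100x. Real runs yield less usable data than the nominal output, so some headroom should be left when planning.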
The Ion S5 next-generation sequencing system (ThermoFisher Scientific) leverages the speed of semiconductor sequencing to produce high-quality sequencing data in as little as 2.5 hours. The Ion S5 is accompanied by the Ion OneTouch 2 System, which performs template amplification and enrichment. The Ion S5 System is simple to use with cartridge-based reagents and offers superior scalability and flexibility to support a broad range of mid-throughput (0.6-30 Gb) sequencing applications such as targeted gene panel sequencing, pathogen WGS, etc.
The MinION Next Generation Sequencer (Oxford Nanopore Technologies) is a low throughput platform but can now generate as much as 30 Gb of DNA sequence data or 7-12 million reads. Ultra-long read lengths are possible (hundreds of kb), making it suitable for structural genomics applications. The MinION streams data in real time so that analysis can be performed during the experiment.
The SeqStudio Genetic Analyzer (Applied Biosystems) is a compact, 4-capillary, fluorescence-based capillary electrophoresis system (Sanger) that utilises an all-in-one reagent cartridge and provides the flexibility to perform both sequencing and fragment analysis in a single run.
The Chromium Controller uses advanced microfluidics to perform single cell partitioning and barcoding in a matter of minutes. The Chromium Controller enables integrated analysis of single cells at massive scale by capturing molecular readouts of cell activity in multiple dimensions, including gene expression, cell surface proteins, immune clonotype, antigen specificity, and chromatin accessibility.
The quality control of the templates and the libraries is performed with a 2100 Bioanalyzer system, an established automated electrophoresis tool. The 2100 Bioanalyzer instrument, together with the 2100 Expert Software, provides highly precise analytical evaluation of various sample types in many workflows. Starting from minimal sample volumes, it delivers digital data in a timely manner, giving an objective assessment of the sizing, quantitation, integrity and purity of DNA, RNA, and proteins.
Quantitative assessment of the libraries is also performed with the Qubit 4 Fluorometer (Invitrogen). Qubit 4 is designed to accurately measure DNA, RNA, and protein quantity, and can also measure RNA integrity and quality.
BiGen has a dedicated bioinformatics office with 4 high-performance Linux workstations as terminals and a central server (2x Intel(R) Xeon(R) CPU E5-2670 v3 @ 2.30GHz, 128 GB RAM, 2x4TB(raw) HDD in RAID, 1x 200GB(raw) SSD) with updated bioinformatics software and pipelines already in place. The HPI server is accessible remotely and is administered and serviced by experienced IT personnel who are responsible for its constant availability. Software can be installed rapidly on demand, and developed pipelines can run seamlessly, without restriction in the allocation of computing resources.