Data analysis

Biogazelle offers comprehensive analysis of your data, using advanced analysis methods, tailored to your needs. These analyses will accelerate your research, turning data into knowledge.


Automated processing and quality control of RNA sequencing data with Cobra

RNA sequencing reads are processed by Cobra, Biogazelle’s in-house developed, cloud-based pipeline for RNA-seq data management and analysis. Cobra processes small RNA, messenger RNA and long non-coding RNA sequencing data with state-of-the-art tools. The pipeline is composed of three main steps: data preparation, read mapping and RNA quantification. Small RNA data processing is built on proprietary code that draws on our long-standing expertise in the field, as supported by our key publications in miRNA research. Cobra incorporates multiple quality control steps along the data processing pipeline. Data is processed quickly (more than 100 samples a day) and securely (protected against data loss and security breaches).

Key aspects of Cobra
  • Strong focus on quality: data quality is controlled at multiple stages. Implemented pipelines are validated according to ISO standards
  • Security: controlled access to data, customer data backup
  • Scalability: high performance RNA-seq data processing
  • Traceability: Detailed record keeping of performed tasks enables traceability months/years after project completion
Our non-coding expertise is embedded in Cobra
  • Processing of small RNAs and miRNAs is built on a proprietary small RNA quantification tool with tailored miRNA QC plots.
  • For quantification of long non-coding RNAs we rely on a reference transcriptome enriched with more than 20,000 lncRNA gene models from LNCipedia. This lncRNA-enriched reference annotation results in significantly higher detection and better quantification of lncRNAs.

Long RNA sequencing reads are annotated using Biogazelle’s in-house established reference transcriptome. This reference transcriptome is built on the latest Ensembl gene catalogue and extended with the latest release of LNCipedia. The result is an enriched gene reference set comprising about 80,000 genes, of which more than 20,000 lncRNA genes are contributed by LNCipedia, enabling a very rich transcript annotation of long non-coding RNA genes.

Advanced gene expression analysis at Biogazelle

Biogazelle’s data analysis capabilities support differential gene expression analysis, time series analysis for patient monitoring, and assessment of response to treatment, among other applications. We implement state-of-the-art tools for pathway analysis of protein coding genes, miRNAs and lncRNAs of interest. Because pathway-level analysis reduces complexity and increases explanatory power, you gain insight into the biology underlying differentially expressed genes.

Our multidisciplinary and experienced team of PhD-level data analysis experts can support dedicated and custom analyses, such as patient subtyping and tissue-specific gene expression.

Variant analysis using RNA sequencing data

RNA sequencing data provides an excellent entry point to go beyond classic RNA abundance analysis and simultaneously exploit the structural information encoded in the transcriptome, such as mutations and fusion genes in cancer cells. Expressed variants are enriched in drivers of malignancy and may result in the formation of neoantigens. Both mutation burden and the number of neoantigens are correlated with patients’ response to immunotherapy. Fusion transcripts are also common drivers in cancer, making them ideal targets for diagnostic and therapeutic purposes.

Biogazelle has developed an RNA sequencing variant analysis pipeline, allowing the detection of single nucleotide variants, small insertions and deletions, and fusion genes. Candidate variants are queried using external resources providing genomic and functional annotation.

Dedicated resources to study non-coding RNA

Our data analysis pipelines include access to dedicated long non-coding RNA and small RNA functional annotation databases, such as decodeRNA and our proprietary LNCarta.

LNCarta is Biogazelle’s proprietary database of predicted lncRNA functions, established through high-throughput perturbation by chemical compounds and silencing of transcription factors. LNCarta assists in:

  1. mapping lncRNAs onto pathways
  2. providing functional context for lncRNAs
  3. identifying upstream regulators of lncRNAs.

decodeRNA is a database providing functional context for human long non-coding RNAs (lncRNAs) and miRNAs based on the guilt-by-association principle: genes that share a similar expression profile are more likely to act in the same pathway or respond to the same upstream regulators. Based on this principle, we can use data from the annotated part of the transcriptome (i.e., the protein coding mRNAs) to derive pathways associated with the unannotated part of the transcriptome (the non-coding RNAs). Several studies have shown that the guilt-by-association principle can be applied to derive functions associated with both miRNAs and long non-coding RNAs. Beyond the public database, we can apply the same principle to your own gene expression data and predict functions using gene set enrichment methods.
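To make the guilt-by-association idea concrete, here is a minimal sketch in Python. It is not the decodeRNA implementation. It assumes Pearson correlation as the similarity measure, and the gene names, expression values and function names are hypothetical. It ranks annotated protein coding genes by how closely their expression profile follows that of a query non-coding RNA.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def rank_by_association(query_profile, coding_profiles):
    """Rank annotated genes by correlation with a query ncRNA, highest first."""
    scores = {gene: pearson(query_profile, profile)
              for gene, profile in coding_profiles.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical toy data: expression measured across five samples.
lncrna = [1.0, 2.0, 3.0, 4.0, 5.0]
coding = {
    "GENE_A": [2.0, 4.1, 6.2, 7.9, 10.0],  # rises with the lncRNA
    "GENE_B": [5.0, 4.0, 3.0, 2.0, 1.0],   # falls as the lncRNA rises
    "GENE_C": [3.0, 1.0, 4.0, 1.0, 5.0],   # only weakly related
}

ranking = rank_by_association(lncrna, coding)
for gene, score in ranking:
    print(f"{gene}\t{score:.3f}")
```

A ranked list of this kind is the typical input for a gene set enrichment method, which then tests whether genes from a given pathway cluster at the top or bottom of the ranking.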

Biomarker development

Including biomarkers in clinical trials markedly increases the likelihood of obtaining approval for a new drug. This benefit is reflected in the steady increase of biomarker usage in clinical trials: in 2015, 28% of the novel new drugs (NNDs) approved by the FDA were precision (personalized) medicines.

Model building for biomarker discovery is performed in partnership with DNAlytics, an expert in the field. DNAlytics offers a tailored and on-demand consultancy service in data mining for pharmaceutical, biotechnology and IVD companies, as well as academic medical, biological or clinical research laboratories.

Quantitative PCR data analysis

qbase+ is our commercially available desktop software for qPCR data analysis on Mac and Windows computers. The user-friendly software is based on peer-reviewed quantification models for PCR efficiency correction, error propagation, inter-run calibration and statistics. Advanced normalization methods and an improved geNorm algorithm for selection of stably expressed reference genes are built into the software.
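As an illustration of what efficiency correction and reference gene normalization involve, the following Python sketch computes a normalized relative quantity for one sample. It is a simplified example and not the qbase+ implementation: it assumes relative quantities of the form E^(calibrator Cq − sample Cq) and normalization to the geometric mean of the reference genes, and it leaves out error propagation and inter-run calibration. All Cq values and function names are hypothetical.

```python
from math import prod

def relative_quantity(cq, calibrator_cq, efficiency=2.0):
    """Convert a Cq value to a relative quantity: E ** (calibrator Cq - sample Cq).

    efficiency is the amplification base (2.0 means perfect doubling per cycle).
    """
    return efficiency ** (calibrator_cq - cq)

def normalized_relative_quantity(target_rq, reference_rqs):
    """Divide the target RQ by the geometric mean of the reference gene RQs."""
    geo_mean = prod(reference_rqs) ** (1.0 / len(reference_rqs))
    return target_rq / geo_mean

# Hypothetical Cq values for one sample versus a calibrator sample.
target = relative_quantity(cq=24.0, calibrator_cq=26.0, efficiency=2.0)
ref1 = relative_quantity(cq=20.0, calibrator_cq=21.0, efficiency=2.0)
ref2 = relative_quantity(cq=18.0, calibrator_cq=19.0, efficiency=2.0)

nrq = normalized_relative_quantity(target, [ref1, ref2])
print(f"Normalized relative quantity: {nrq:.2f}")
```

Using more than one reference gene and taking their geometric mean makes the normalization less sensitive to variation in any single reference, which is why selecting stably expressed reference genes matters.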


Digital PCR data analysis

In digital PCR, we go beyond classic assay validation when testing linearity and accuracy. Further, we use generalized linear mixed models for advanced digital PCR data analysis and reliable error propagation, e.g. when using multiple replicates and reference genes.

