Selected Publications

At Biogazelle we don't just provide industry-leading services to our customers. True to our academic origins, our staff frequently publishes in peer-reviewed journals, contributing to both biomarker knowledge and method development. Below you will find a non-exhaustive list of recent articles authored or co-authored by Biogazelle staff.

RNA profiling has emerged as a powerful tool to investigate the biomarker potential of human biofluids. However, despite enormous interest in extracellular nucleic acids, RNA sequencing methods to quantify the total RNA content outside cells are rare. Here, we evaluate the performance of the SMARTer Stranded Total RNA-Seq method in human platelet-rich plasma, platelet-free plasma, urine, conditioned medium, and extracellular vesicles (EVs) from these biofluids. We found the method to be accurate, precise, compatible with low-input volumes and able to quantify a few thousand genes. We picked up distinct classes of RNA molecules, including mRNA, lncRNA, circRNA, miscRNA and pseudogenes. Notably, the read distribution and gene content drastically differ among biofluids. In conclusion, we are the first to show that the SMARTer method can be used for unbiased unraveling of the complete transcriptome of a wide range of biofluids and their extracellular vesicles.

STUDY QUESTION: Can plasma miRNAs be used for the non-invasive diagnosis of endometriosis in infertile women?

SUMMARY ANSWER: miRNA-based diagnostic models for endometriosis failed the test of independent validation.

WHAT IS KNOWN ALREADY: Circulating miRNAs have been described to be differentially expressed in patients with endometriosis compared with women without endometriosis, suggesting that they could be used for the non-invasive diagnosis of endometriosis. However, these studies have shown limited consistency or conflicting results, and no miRNA-based diagnostic test has been validated in an independent patient cohort.

STUDY DESIGN, SIZE, DURATION: We performed genome-wide miRNA expression profiling by small RNA sequencing to identify a set of plasma miRNAs with discriminative potential between patients with and without endometriosis. Expression of this set of miRNAs was confirmed by RT-qPCR. Diagnostic models were built using multivariate logistic regression with stepwise feature selection. In a final step, the models were tested for validation in an independent patient cohort.

PARTICIPANTS/MATERIALS, SETTINGS, METHODS: Plasma of all patients was available in the biobank of the Leuven Endometriosis Centre of Excellence. Biomarker discovery and model development were performed in a discovery cohort of 120 patients (controls = 38, endometriosis = 82), and models were tested for validation in an independent cohort of 90 patients (controls = 30, endometriosis = 60). RNA was extracted with the miRNeasy Plasma Kit. Genome-wide miRNA expression analysis was done by small RNA sequencing using the NEBNext small RNA library prep kit and the NextSeq 500 System. cDNA synthesis and qPCR were performed using the Qiagen miScript technology.

MAIN RESULTS AND THE ROLE OF CHANCE: We identified a set of 42 miRNAs with discriminative power between patients with and without endometriosis based on genome-wide miRNA expression profiling. Expression of 41 miRNAs was confirmed by RT-qPCR, and 3 diagnostic models were built. Only the model for minimal-mild endometriosis (Model 2: hsa-miR-125b-5p, hsa-miR-28-5p and hsa-miR-29a-3p) had diagnostic power above chance performance in the independent validation (AUC = 60%) with an acceptable sensitivity (78%) but poor specificity (37%).

LIMITATIONS, REASONS FOR CAUTION: The diagnostic models were built and tested for validation in two patient cohorts from a single tertiary endometriosis centre. Further validation tests in large cohorts with patients from multiple endometriosis centres are needed.

WIDER IMPLICATION OF THE FINDINGS: Our study supports a possible biological link between certain miRNAs and endometriosis, but the potential of these miRNAs as clinically useful biomarkers is questionable in women with infertility. Large studies in well-described patient cohorts, with rigorous methodology for miRNA expression analysis, sufficient statistical power and an independent validation step, are necessary to answer the question of whether miRNAs can be used as diagnostic markers for endometriosis.

Cardiovascular disease (CVD) remains the leading cause of death worldwide and, despite continuous advances, better diagnostic and prognostic tools, as well as therapy, are needed. The human transcriptome, which is the set of all RNA produced in a cell, is much more complex than previously thought, and the lack of dialogue between researchers and industry and of consensus on guidelines to generate data makes it harder to compare and reproduce results. This European Cooperation in Science and Technology (COST) Action aims to accelerate the understanding of transcriptomics in CVD and further the translation of experimental data into usable applications to improve personalized medicine in this field by creating an interdisciplinary network. It aims to provide opportunities for collaboration between stakeholders from complementary backgrounds, allowing the functions of different RNAs and their interactions to be more rapidly deciphered in the cardiovascular context for translation into the clinic, thus fostering personalized medicine and meeting a current public health challenge. Thus, this Action will advance studies on cardiovascular transcriptomics, generate innovative projects, and consolidate the leadership of European research groups in the field. COST (European Cooperation in Science and Technology) is a funding organization for research and innovation networks.

For a wide range of diseases, SNPs in the genome are the underlying mechanism of dysfunction. Therefore, targeted detection of these variations is of high importance for early diagnosis and (familial) screenings. While allele-specific PCR has been around for many years, its adoption for SNP genotyping or somatic mutation detection has been hampered by its low discriminating power and high costs. To tackle this, we developed a cost-effective qPCR based method, able to detect SNPs in a robust and specific manner. This study describes how to combine the basic principles of allele-specific PCR (the combination of a wild type and variant primer) with the straightforward readout of DNA-binding dye based qPCR technology. To enhance the robustness and discriminating power, an artificial mismatch in the allele-specific primer was introduced. The resulting method, called double-mismatch allele-specific qPCR (DMAS-qPCR), was successfully validated using 12 SNPs and 15 clinically relevant somatic mutations on 48 cancer cell lines. It is easy to use, does not require labeled probes and is characterized by high analytical sensitivity and specificity. DMAS-qPCR comes with a complimentary online assay design tool, available for the whole scientific community, enabling researchers to design custom assays and implement those as a diagnostic test.

PURPOSE: Patients with oligometastatic prostate cancer (PC) may benefit from metastasis-directed therapy (MDT), delaying disease progression and the start of palliative systemic treatment. However, a significant proportion of oligometastatic PC patients progress to polymetastatic PC within a year following MDT, suggesting an underestimation of the metastatic load by current staging modalities. Molecular markers could help to identify true oligometastatic patients eligible for MDT.

METHODS: Patients with asymptomatic biochemical recurrence following primary PC treatment were classified as oligo- or polymetastatic based on 18F-choline PET/CT imaging. Oligometastatic patients had up to three metastases at baseline and did not progress to more than three lesions following MDT or surveillance within 1 year of diagnosis of metastases. Polymetastatic patients had > 3 metastases at baseline or developed > 3 metastases within 1 year following imaging. A model aiming to prospectively distinguish oligo- and polymetastatic PC patients was trained using clinicopathological parameters and serum-derived microRNA expression profiles from a discovery cohort of 20 oligometastatic and 20 polymetastatic PC patients. To confirm the model's predictive performance, it was applied to biomarker data obtained from an independent validation cohort of 44 patients with oligometastatic and 39 patients with polymetastatic disease.

RESULTS: Oligometastatic PC patients had a more favorable prognosis compared to polymetastatic ones, as defined by a significantly longer median CRPC-free survival (not reached versus 38 months; 95% confidence interval 31-45 months with P < 0.001). Despite the good performance of a predictive model trained on the discovery cohort, with an AUC of 0.833 (0.693-0.973; 95% CI) and a sensitivity of 0.894 (0.714-1.000; 95% CI) for oligometastatic disease, none of the miRNA targets were found to be differentially expressed between oligo- and polymetastatic PC patients in the signature validation cohort. The multivariate model had an AUC of 0.393 (0.534 after cross-validation) and therefore no predictive ability.

CONCLUSIONS: Although PC patients with oligometastatic disease had a more favorable prognosis, no serum-derived biomarkers allowing for prospective discrimination of oligo- and polymetastatic prostate cancer patients could be identified.

On determining the power of digital PCR experiments.

Vynck M et al. Anal Bioanal Chem. 2018 Sep;410(23)

The experimental design that will be carried out to evaluate a nucleic acid quantification hypothesis determines the cost and feasibility of digital polymerase chain reaction (digital PCR) studies. Experiment design involves the calculation of the number of technical measurement replicates and the determination of the characteristics of those replicates, in accordance with the capabilities of the available digital PCR platform. Available digital PCR power analyses suffer from one or more of the following limitations: narrow scope, unrealistic assumptions, insufficient detail for replication, lack of source code and user-friendly software. Here, we discuss the nature of six parameters that affect the statistical power, i.e., desired effect size, total number of partitions, fraction of positive partitions, number of replicate measurements, between-replicate variance, and significance level. We also show to what extent these parameters affect power, and argue that careful design of experiments is needed to achieve the desired power. A web tool, dPowerCalcR, that allows interactive calculation of statistical power and optimization of the experimental design is available.
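
As a rough illustration of how the six parameters listed above can interact, the sketch below approximates power with a two-sided z-test on log concentrations. It is not the dPowerCalcR implementation. The Poisson partition model, the delta-method variance, the normal approximation, and all function and parameter names (`approx_power`, `between_rep_var`, and so on) are assumptions made for this example only.

```python
import math
from statistics import NormalDist


def log_conc_variance(frac_positive, partitions):
    """Delta-method variance of log(lambda_hat) for a single dPCR measurement.

    Assumes lambda = -ln(1 - p) copies per partition (Poisson model).
    """
    if not 0.0 < frac_positive < 1.0:
        raise ValueError("fraction of positive partitions must lie in (0, 1)")
    lam = -math.log(1.0 - frac_positive)
    var_lam = frac_positive / (partitions * (1.0 - frac_positive))
    return var_lam / lam ** 2


def approx_power(fold_change, partitions, frac_positive, replicates,
                 between_rep_var, alpha=0.05):
    """Approximate power to detect `fold_change` between two groups.

    between_rep_var is the assumed between-replicate variance on the
    log-concentration scale; frac_positive refers to the reference group.
    """
    lam_ref = -math.log(1.0 - frac_positive)
    frac_other = 1.0 - math.exp(-lam_ref * fold_change)
    var_ref = (between_rep_var + log_conc_variance(frac_positive, partitions)) / replicates
    var_other = (between_rep_var + log_conc_variance(frac_other, partitions)) / replicates
    se = math.sqrt(var_ref + var_other)

    z = NormalDist()
    crit = z.inv_cdf(1.0 - alpha / 2.0)
    shift = abs(math.log(fold_change)) / se
    # Probability of rejecting in either tail of the two-sided test.
    return z.cdf(shift - crit) + z.cdf(-shift - crit)


# Hypothetical design: 20,000 partitions, 30% positive, 1.2-fold effect.
power_3_reps = approx_power(1.2, 20000, 0.3, 3, between_rep_var=0.01)
power_6_reps = approx_power(1.2, 20000, 0.3, 6, between_rep_var=0.01)
```

Under these assumptions, doubling the number of replicates raises the approximate power, and a fold change of 1 returns the significance level itself, which is a quick sanity check on the calculation.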

Human endogenous retroviruses (HERVs), remnants of ancestral viral genomic insertions, are known to represent 8% of the human genome and are associated with several pathologies. In particular, the envelope protein of HERV-W family (HERV-W-Env) has been implicated in multiple sclerosis pathogenesis. Investigations to detect HERV-W-Env in a few other autoimmune diseases were negative, except in type-1 diabetes (T1D). In patients suffering from T1D, HERV-W-Env protein was detected in 70% of sera, and its corresponding RNA was detected in 57% of peripheral blood mononuclear cells. While studies on human Langerhans islets evidenced the inhibition of insulin secretion by HERV-W-Env, this endogenous protein was found to be expressed by acinar cells in 75% of human T1D pancreata. An extensive immunohistological analysis further revealed a significant correlation between HERV-W-Env expression and macrophage infiltrates in the exocrine part of human pancreata. Such findings were corroborated by in vivo studies on transgenic mice expressing HERV-W-env gene, which displayed hyperglycemia and decreased levels of insulin, along with immune cell infiltrates in their pancreas. Altogether, these results strongly suggest an involvement of HERV-W-Env in T1D pathogenesis. They also provide potentially novel therapeutic perspectives by unveiling a pathogenic target in T1D.

BACKGROUND: Although the sequencing landscape is rapidly evolving and sequencing costs are continuously decreasing, whole genome sequencing is still too expensive for use on a routine basis. Targeted resequencing of only the regions of interest decreases both costs and the complexity of the downstream data-analysis. Various target enrichment strategies are available, but none of them obtain the degree of coverage uniformity, flexibility and specificity of PCR-based enrichment. On the other hand, the biggest limitation of target enrichment by PCR is the need to design large numbers of partially overlapping assays to cover the target.

RESULTS: To overcome the aforementioned hurdles, we have developed primerXL, a state-of-the-art PCR primer design pipeline for targeted resequencing. It uses an optimized design criteria relaxation cascade and a thorough downstream in silico evaluation process to generate high quality singleplex PCR assays, reducing the need for amplicon normalization, and outperforming other target enrichment strategies and similar primer design tools when considering assay quality, coverage uniformity and target coverage. Results of four different sequencing projects with 2348 amplicons in total covering 470 kb are presented. PrimerXL is available as a web tool.

CONCLUSION: PrimerXL is a state-of-the-art, easy-to-use primer design webtool capable of generating high-quality targeted resequencing assays. The workflow is fully customizable to suit every researcher's needs, while an innovative relaxation cascade ensures maximal target coverage.

Reverse transcription quantitative polymerase chain reaction (RT-qPCR) is considered the gold standard for accurate, sensitive, and fast measurement of gene expression. Prior to downstream statistical analysis, RT-qPCR fluorescence amplification curves are summarized into one single value, the quantification cycle (Cq). When RT-qPCR does not reach the limit of detection, the Cq is labeled as "undetermined". Current state-of-the-art qPCR data analysis pipelines acknowledge the importance of normalization for removing non-biological sample-to-sample variation in the Cq values. However, their strategies for handling undetermined Cq values are ad hoc. We show that popular methods for handling undetermined values can have a severe impact on the downstream differential expression analysis. They introduce a considerable bias and suffer from a lower precision. We propose a novel method that unites preprocessing and differential expression analysis in a single statistical model that provides a rigorous way for handling undetermined Cq values. We compare our method with existing approaches in a simulation study and on published microRNA and mRNA gene expression datasets. We show that our method outperforms traditional RT-qPCR differential expression analysis pipelines in the presence of undetermined values, both in terms of accuracy and precision.
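
To make the problem concrete, the sketch below contrasts two ad hoc ways of treating undetermined Cq values: dropping them, or replacing them with the maximum cycle number. The abstract does not say which popular methods were evaluated, so both strategies, the 40-cycle cap, the Cq values, and the function names are assumptions chosen for illustration. The sketch does not implement the unified statistical model proposed in the article.

```python
from statistics import mean

MAX_CYCLES = 40.0  # assumed number of PCR cycles in the run (hypothetical)


def mean_cq_dropping(cqs):
    """Mean Cq after discarding undetermined wells (encoded as None)."""
    determined = [cq for cq in cqs if cq is not None]
    if not determined:
        raise ValueError("no determined Cq values to average")
    return mean(determined)


def mean_cq_imputing_max(cqs, max_cycles=MAX_CYCLES):
    """Mean Cq after replacing undetermined wells with the maximum cycle number."""
    return mean([max_cycles if cq is None else cq for cq in cqs])


# Hypothetical replicate wells for one low-abundance target; None = undetermined.
wells = [31.0, 33.0, None, 35.0, None]

dropped = mean_cq_dropping(wells)      # averages only the three determined wells
imputed = mean_cq_imputing_max(wells)  # treats the two missing wells as Cq 40
```

With these made-up values the two strategies give summaries almost three cycles apart for the same wells, which shows how strongly the choice can influence a downstream comparison.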

Quality control of digital PCR assays and platforms.

Vynck M et al. Anal Bioanal Chem. 2017 Oct;409(25)

Digital polymerase chain reaction (digital PCR, dPCR) is a direct nucleic acid quantification method, thus requiring no standard curves unlike quantitative real-time PCR (qPCR). Nevertheless, evaluation of the linear dynamic range, accuracy, and precision of an assay or platform is recommended, as there are several potential causes of important non-linearity, bias, and imprecision. Ignoring these quality issues may lead to erroneous quantification. This necessitates an approach akin to the construction of standard curves. We study the pitfalls associated with the evaluation of such an experiment, and provide guidelines for the assessment of linearity, accuracy, and precision in dPCR experiments. We present simulation results and a case study supporting the importance of a thorough evaluation. Further, typically presented plots and statistics may not reveal problems with linearity, accuracy, or precision. We find that a robust weighted least-squares approach is highly advisable, yet may also suffer from an inflated false-positive rate. The proposed assessments are also applicable to other analyses, such as the comparison of results obtained from qPCR and dPCR. A web tool for quality evaluation, dPCalibRate, is available.

Although the long non-coding RNA (lncRNA) landscape is expanding rapidly, only a small number of lncRNAs have been functionally annotated. Here, we present decodeRNA, a database providing functional contexts for both human lncRNAs and microRNAs in 29 cancer and 12 normal tissue types. With state-of-the-art data mining and visualization options, easy access to results and a straightforward user interface, decodeRNA aims to be a powerful tool for researchers in the ncRNA field.

AIMS/HYPOTHESIS: Renal fibrosis is a common complication of diabetic nephropathy and is a major cause of end-stage renal disease. Despite the suggested link between renal fibrosis and microRNA (miRNA) dysregulation in diabetic nephropathy, the identification of the specific miRNAs involved is still incomplete. The aim of this study was to investigate miRNA profiles in the diabetic kidney and to identify potential downstream targets implicated in renal fibrosis.

METHODS: miRNA expression profiling was investigated in the kidneys of 8-month-old Zucker diabetic fatty (ZDF) rats during overt nephropathy. Localisation of the most upregulated miRNA was established by in situ hybridisation. The candidate miRNA target was identified by in silico analysis and its expression documented in the diabetic kidney associated with fibrotic markers. Cultured tubule cells served to assess which of the profibrogenic stimuli acted as a trigger for the overexpressed miRNA, and to investigate underlying epigenetic mechanisms.

RESULTS: In ZDF rats, miR-184 showed the strongest differential upregulation compared with lean rats (18-fold). Tubular localisation of miR-184 was associated with reduced expression of lipid phosphate phosphatase 3 (LPP3) and collagen accumulation. Transfection of NRK-52E cells with miR-184 mimic reduced LPP3, promoting a profibrotic phenotype. Albumin was a major trigger of miR-184 expression. Anti-miR-184 counteracted albumin-induced LPP3 downregulation and overexpression of plasminogen activator inhibitor-1. In ZDF rats, ACE-inhibitor treatment limited albuminuria and reduced miR-184, with tubular LPP3 preservation and tubulointerstitial fibrosis amelioration. Albumin-induced miR-184 expression in tubule cells was epigenetically regulated through DNA demethylation and histone lysine acetylation and was accompanied by binding of NF-κB p65 subunit to miR-184 promoter.

CONCLUSIONS/INTERPRETATION: These results suggest that miR-184 may act as a downstream effector of albuminuria through LPP3 to promote tubulointerstitial fibrosis, and offer the rationale to investigate whether targeting miR-184 in association with albuminuria-lowering drugs may be a new strategy to achieve fully anti-fibrotic effects in diabetic nephropathy.

RNA-sequencing has become the gold standard for whole-transcriptome gene expression quantification. Multiple algorithms have been developed to derive gene counts from sequencing reads. While a number of benchmarking studies have been conducted, the question remains how individual methods perform at accurately quantifying gene expression levels from RNA-sequencing reads. We performed an independent benchmarking study using RNA-sequencing data from the well-established MAQCA and MAQCB reference samples. RNA-sequencing reads were processed using five workflows (Tophat-HTSeq, Tophat-Cufflinks, STAR-HTSeq, Kallisto and Salmon) and resulting gene expression measurements were compared to expression data generated by wet-lab validated qPCR assays for all protein-coding genes. All methods showed high gene expression correlations with qPCR data. When comparing gene expression fold changes between MAQCA and MAQCB samples, about 85% of the genes showed consistent results between RNA-sequencing and qPCR data. Of note, each method revealed a small but specific gene set with inconsistent expression measurements. A significant proportion of these method-specific inconsistent genes were reproducibly identified in independent datasets. These genes were typically smaller, had fewer exons, and were expressed at lower levels than genes with consistent expression measurements. We propose that careful validation is warranted when evaluating RNA-seq based expression profiles for this specific gene set.

Standard data analysis pipelines for digital PCR estimate the concentration of a target nucleic acid by digitizing the end-point fluorescence of the parallel micro-PCR reactions, using an automated hard threshold. While it is known that misclassification has a major impact on the concentration estimate and substantially reduces accuracy, the uncertainty of this classification is typically ignored. We introduce a model-based clustering method to estimate the probability that the target is present (absent) in a partition conditional on its observed fluorescence and the distributional shape in no-template control samples. This methodology acknowledges the inherent uncertainty of the classification and provides a natural measure of precision, both at individual partition level and at the level of the global concentration. We illustrate our method on genetically modified organism, inhibition, dynamic range, and mutation detection experiments. We show that our method provides concentration estimates of similar accuracy or better than the current standard, along with a more realistic measure of precision. The individual partition probabilities and diagnostic density plots further allow for some quality control. An R implementation of our method, called Umbrella, is available, providing a more objective and automated data analysis procedure for absolute dPCR quantification.
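
The standard hard-threshold pipeline described above can be sketched as follows, to show why a single ambiguous partition matters. This is a minimal illustration of the conventional approach under a Poisson assumption, not the Umbrella method. The fluorescence values, thresholds, and function names are hypothetical.

```python
import math


def classify_hard(fluorescence, threshold):
    """Label a partition positive (True) when its end-point signal exceeds the threshold."""
    return [value > threshold for value in fluorescence]


def copies_per_partition(labels):
    """Poisson-corrected mean number of target copies per partition."""
    total = len(labels)
    positive = sum(labels)
    if total == 0 or positive == total:
        raise ValueError("need at least one partition and at least one negative partition")
    return -math.log(1.0 - positive / total)


# Hypothetical end-point signals: six clear negatives, three clear positives,
# and one ambiguous partition at 2500.
signals = [100, 120, 90, 110, 105, 95, 5000, 5200, 4900, 2500]

# Threshold below the ambiguous partition: it is counted as positive.
est_low_threshold = copies_per_partition(classify_hard(signals, threshold=2000))
# Threshold above the ambiguous partition: it is counted as negative.
est_high_threshold = copies_per_partition(classify_hard(signals, threshold=3000))
```

In this toy example, moving the threshold across the one ambiguous partition changes the estimate from about 0.51 to about 0.36 copies per partition, and neither figure carries any indication that the classification was uncertain.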
