During the drug discovery phase, mapping the molecular profile induced by candidate compounds can provide insight into the compounds' mechanism of action (MOA) or reveal potential toxicity issues. Molecular profiles can also be used to assess compound similarity and make comparisons with established drugs.
Early stages of drug discovery often depend on relatively simple reporter assays or target gene expression readouts. RNA expression profiling technologies like RNA-sequencing can provide a more comprehensive characterization of compounds at the molecular pathway level. This information can complement phenotypic readouts and can be used to prioritize candidate compounds for further drug development. RNA expression profiling also serves as a generic test that can be applied to any drug development pipeline without the need for target-dependent customization.
To be useful, such a method needs to be applicable in a high-throughput setting, be cost-effective and still provide sufficient transcriptome coverage to infer pathways. To this end, we developed HTPathwaySeq, an RNA-sequencing workflow that can be applied directly to cell lysates from 96-well culture plates.
We typically require 4 replicates per condition and apply a 3’ end-sequencing library prep workflow with shallow sequencing (1 million reads per sample). This results in reproducible detection of around 7,000 genes, to which we apply several data analysis workflows to characterize each condition.
Slide 4 demonstrates the technical performance of HTPathwaySeq. The first plot shows the cumulative distribution of the number of detected genes across 384 samples, with a median of 7,000 genes per sample.
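To make the first plot concrete, here is a minimal sketch of how the number of detected genes per sample and its cumulative distribution could be computed. It assumes a gene counts as detected when it has at least one read. The detection threshold, the function names and the toy counts are illustrative assumptions, not the settings or data of the workflow itself.

```python
from statistics import median


def detected_genes_per_sample(counts, min_count=1):
    """Count genes with at least `min_count` reads in each sample.

    `counts` maps a sample name to a list of per-gene read counts.
    The threshold of 1 read is an assumption made for this sketch.
    """
    return {
        sample: sum(1 for c in gene_counts if c >= min_count)
        for sample, gene_counts in counts.items()
    }


def empirical_cdf(values):
    """Return (value, cumulative fraction of samples) pairs, sorted by value."""
    xs = sorted(values)
    n = len(xs)
    return [(x, (i + 1) / n) for i, x in enumerate(xs)]


# Toy data: 3 samples x 5 genes (purely illustrative).
toy_counts = {
    "sample_1": [0, 3, 12, 0, 1],
    "sample_2": [5, 0, 0, 0, 2],
    "sample_3": [1, 1, 7, 4, 0],
}

detected = detected_genes_per_sample(toy_counts)
cdf = empirical_cdf(detected.values())
median_detected = median(detected.values())
```

On a real plate the same two functions would be applied to the full count table, and the (value, fraction) pairs would be drawn as the cumulative curve.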
You can see from the second plot that the reads are concentrated at the 3’ end of the genes, and the third plot demonstrates that gene expression counts are highly reproducible between technical replicates in the workflow.
From the gene expression data, we identify differential pathways using a gene set enrichment algorithm that interrogates over 4,000 annotated gene sets, representing both canonical pathways and manually curated gene lists from the literature involving various chemical and genetic perturbations.
A recent comparative study found gene set enrichment analysis (GSEA) to be the best-performing approach for detecting differential pathways.
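For readers less familiar with the idea behind gene set enrichment, the following is a minimal sketch of the classic unweighted running-sum enrichment score. It is a simplified illustration and not necessarily what the HTPathwaySeq analysis uses, which may apply weighting, normalization and permutation-based significance testing. The function name and the toy gene names are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score for one gene set.

    `ranked_genes` is a list of unique gene names ordered from most
    upregulated to most downregulated. Walking down the list, the running
    sum goes up at every gene in the set and down at every other gene;
    the score is the largest deviation from zero.
    """
    members = set(gene_set) & set(ranked_genes)
    n_hits = len(members)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")

    running = 0.0
    best = 0.0
    for gene in ranked_genes:
        running += 1.0 / n_hits if gene in members else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Toy ranking of six hypothetical genes, most upregulated first.
toy_ranking = ["G1", "G2", "G3", "G4", "G5", "G6"]
score_top = enrichment_score(toy_ranking, {"G1", "G2"})
score_bottom = enrichment_score(toy_ranking, {"G5", "G6"})
```

A gene set whose members sit at the top of the ranking gets a positive score (activated), and one whose members sit at the bottom gets a negative score (repressed).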
We have also run analyses to demonstrate that differential pathway analysis is largely unaffected when restricted to the 7,000 most abundant genes. In 3 independent datasets, most of the gene sets that were identified when using all expressed genes were also identified when using only the 7,000 most abundant genes.
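The comparison described above can be sketched as follows: keep the most abundant genes, rerun the pathway analysis, and measure what fraction of the original gene sets is recovered. This is a minimal sketch under stated assumptions. Ranking by mean expression, the function names and the toy values are illustrative and not taken from the actual analysis.

```python
def top_abundant_genes(mean_expression, n_top=7000):
    """Return the `n_top` genes with the highest mean expression.

    `mean_expression` maps a gene name to its mean count across samples.
    Ranking by mean count is an assumption made for this sketch.
    """
    ranked = sorted(mean_expression, key=mean_expression.get, reverse=True)
    return ranked[:n_top]


def recovered_fraction(sets_all_genes, sets_top_genes):
    """Fraction of gene sets found with all genes that are also found
    when the analysis is restricted to the most abundant genes."""
    reference = set(sets_all_genes)
    if not reference:
        raise ValueError("reference analysis returned no gene sets")
    return len(reference & set(sets_top_genes)) / len(reference)


# Toy example with hypothetical gene and gene set names.
toy_expression = {"GENE_A": 11.0, "GENE_B": 0.5, "GENE_C": 5.0, "GENE_D": 95.0}
kept = top_abundant_genes(toy_expression, n_top=2)

fraction = recovered_fraction(
    {"SET_1", "SET_2", "SET_3", "SET_4"},   # significant with all genes
    {"SET_1", "SET_2", "SET_3", "SET_9"},   # significant with top genes only
)
```

In the toy example, three of the four reference gene sets are recovered, so the fraction is 0.75.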
To demonstrate that HTPathwaySeq identifies relevant pathways associated with drug MOA, we analysed compounds with known function. Slide 7 shows an example of a screen in HepG2 cells involving several compounds of interest and compounds with known MOA. One of these compounds was trichostatin A (TSA), a well-established HDAC inhibitor. The top 10 activated and repressed pathways and gene sets in cells treated with TSA are shown on the right of slide 7. You can see the strong enrichment of gene sets related to TSA action and HDAC inhibition.
HTPathwaySeq can also reveal dose-dependent effects on pathway activity. Slide 8 shows an example of a screen where different compounds of interest were administered at varying doses. You can clearly observe the dose-dependent effect on pathway enrichment, for both up- and downregulated pathways. These types of profiles can, for instance, be matched with dose-dependent effects on the phenotype to select the most relevant pathways regulated by the compound.
Based on the molecular profile of each individual compound in the screen, you can also define compound similarity. Slide 9 shows an example of a compound similarity matrix, where compounds that induce a similar molecular phenotype cluster together. These results can be correlated with chemical structure or cellular phenotype to better understand differences between compounds and their MOA.
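As an illustration of how such a similarity matrix could be built, here is a minimal sketch that compares compounds by the Pearson correlation of their profiles, for example pathway enrichment scores or gene-level fold changes. The choice of Pearson correlation, the function names and the toy profiles are assumptions made for this example. The matrix on slide 9 may use a different metric and clustering method.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long numeric profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / (sd_x * sd_y)


def similarity_matrix(profiles):
    """Pairwise correlation between all compounds.

    `profiles` maps a compound name to its list of scores, with the same
    pathways (or genes) in the same order for every compound.
    """
    names = list(profiles)
    return {a: {b: pearson(profiles[a], profiles[b]) for b in names}
            for a in names}


def most_similar(matrix, name):
    """Name of the compound with the highest similarity to `name`."""
    others = [other for other in matrix[name] if other != name]
    return max(others, key=matrix[name].get)


# Toy profiles for three hypothetical compounds over four pathways.
toy_profiles = {
    "compound_1": [2.1, -1.0, 0.3, 1.5],
    "compound_2": [1.9, -0.8, 0.1, 1.7],
    "compound_3": [-1.5, 2.0, 0.2, -1.1],
}

sim = similarity_matrix(toy_profiles)
nearest = most_similar(sim, "compound_1")
```

Hierarchical clustering of the resulting matrix would then group compounds with similar profiles, which is what produces the blocks visible in a clustered heatmap.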
You can also look at individual pathways or gene sets to identify compounds of interest. One application is to look at canonical toxicity pathways, such as DNA damage or various stress response pathways, to gain insight into the potential toxicity of compounds.