DrugRepoBank

HELP PAGE

Introduction to DrugRepoBank

Why DrugRepoBank?

The development of new drugs is a time-consuming, costly, and risky process. In recent years, drug repositioning has emerged as an alternative to the traditional drug discovery process, with the potential to reduce development cost and time and, ultimately, to save patients' lives.

DrugRepoBank now comprises the most extensive collection of literature-supported drug repositioning data (2000/1/1 - 2023/7/1), together with predicted drug repositioning candidates computed by 18 algorithms. These algorithms cover similarity-based, signature-based, network-based, and artificial intelligence-based methods, and they draw on heterogeneous drug repositioning information collected from several common databases. An interactive, user-friendly web interface lets users access, search, filter, and export the data. It also provides further functionality, such as visualization of the drug-target-disease-side effect network and pathway analysis of disease signatures and drug signatures.

Users of DrugRepoBank

This database can help users including but not limited to:

(1) Pharmacologists. The database can help pharmacologists shorten the preliminary research work for drug repositioning by providing drug candidates, thereby accelerating research and development.

(2) Biologists. The database can help biologists discover new mechanisms of action of drugs, such as new targets and new therapies, thereby accelerating research on the mechanism of drug action.

(3) Computational biologists. The literature-supported drug repositioning data with experimental evidence can provide computational biologists with high-quality raw data and help them develop more efficient drug repositioning algorithms. Moreover, the database contains text related to drug repositioning, which can be used as source data for text mining or Natural Language Processing (NLP) to extract drug repositioning information.

Prediction

The "Prediction" module aims to achieve drug repositioning by identifying new relationships between drugs and diseases or between drugs and targets. It does so through multiple algorithms (similarity-based, artificial intelligence-based, signature-based, and network-based methods) built on three search engines (Drug Search, Target Search, and Disease Search).

(1)Drug search: Drug Search not only provides basic information about drugs and known associations with targets and diseases, but also offers information on new targets and diseases for the drug through drug-drug similarity-based methods (Figure 1A left), target-target similarity-based method (Figure 1A middle), and AI-based methods (Figure 1A right).

(2)Target search: It not only offers basic information about targets and known associations with drugs and diseases, but also provides predictions of new drugs binding with the target to treat new diseases through drug-drug similarity-based methods (Figure 1B left), target-target similarity-based method (Figure 1B middle), and AI-based methods (Figure 1B right).

(3)Disease search: It not only provides differential gene expression and functional analysis information for disease signatures but can also discover disease-drug associations through signature-based (Figure 1C up) and network-based methods (Figure 1C down) to facilitate drug repositioning.

Figure 1. An overview of the algorithm types used by the three search engines in DrugRepoBank's Prediction Module.

Drug Search

In the "Drug Search" field, users can find drug information by searching for a drug name, DrugBank ID, PubChem Compound ID, or synonym of a drug.

Example: to search for "Bivalirudin", enter "Bivalirudin", "DB00006", or "16129704" in the search box (①) and then click the "Search" button (②).

Drug Basic Information

You can browse the basic information about the queried drug, including drug structure, drug type, identifiers, and so on. There are "Target Information", "Disease Information" and "Side Effect Information" tables at the bottom.

In the "Target Information" part:

① Click "Target search" to view more information about the target in DrugRepoBank.

② "UniprotID" links to the page for the selected target in the UniProt database.

③ "Gene" lists target-related genes and links to the HGNC database.

④ "Source" links to the TTD database, which contains information about drug-target interactions.

Drug-Drug Similarity Table

The "Drug-drug Similarity Prediction" subpage provides a similarity table that compares the queried drug with other drugs. Similarity is calculated by 6 methods: structure similarity, target similarity, pathway similarity, and three GO similarities (GO_cellular component, GO_molecular function, and GO_biological process).

Each similarity value, as well as "RowMeans", lies between 0 and 1. The larger the value, the more similar the drug is to the queried drug.

You can click a drug name, such as "Dabigatran etexilate", to jump to the details page of that drug.

Explanation of the header:

① Structure similarity: Chemical structure similarity between queried drug and other small molecule drugs, which is estimated by Tanimoto coefficient.

② Target similarity: Target similarity means target protein sequence-based similarity. Pairwise target protein sequence comparison is calculated based on the Needleman-Wunsch dynamic programming algorithm and the percentage of pairwise sequence identity. Since a drug may have multiple targets, the "best-match average" is used to calculate the similarity between two drugs based on target sequence similarity.

③ Pathway similarity: Drug-induced pathway similarity is evaluated based on the similarity of constituent genes.

④ GO_CC similarity: The target protein functional similarity between drug pairs is estimated based on the semantic similarity of associated GO cellular component terms of each drug.

⑤ GO_MF similarity: The target protein functional similarity between drug pairs is estimated based on the semantic similarity of associated GO molecular function terms of each drug.

⑥ GO_BP similarity: The target protein functional similarity between drug pairs is estimated based on the semantic similarity of associated GO biological process terms of each drug.

⑦ RowMeans: The average of six types of similarity.

Azad AKM, Dinarvand M, Nematollahi A, Swift J, Lutze-Mann L, Vafaee F. A comprehensive integrated drug similarity resource for in-silico drug repositioning and beyond. Brief Bioinform. 2021 May 20;22(3):bbaa126.[PubMed]
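As a rough illustration of how the scores in this table could be computed, the following Python sketch implements the Tanimoto coefficient on toy fingerprints, one common form of the best-match average, and a row mean. The fingerprints, target names, similarity values, and function names are hypothetical and are not taken from DrugRepoBank; the database's own implementation may differ.

```python
from statistics import mean


def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bit positions."""
    union = len(bits_a | bits_b)
    return len(bits_a & bits_b) / union if union else 0.0


def best_match_average(targets_a, targets_b, sim):
    """One common form of the best-match average (an assumption here).

    Each target is paired with its most similar target in the other drug's
    target set, and all best-match scores are averaged.
    """
    best_a = [max(sim(s, t) for t in targets_b) for s in targets_a]
    best_b = [max(sim(s, t) for s in targets_a) for t in targets_b]
    return (sum(best_a) + sum(best_b)) / (len(best_a) + len(best_b))


# Hypothetical pairwise target similarities, scaled to the range 0-1.
TOY_TARGET_SIM = {("T1", "T3"): 0.8, ("T2", "T3"): 0.4}


def toy_sim(s, t):
    """Look up the toy similarity between a target of drug A and one of drug B."""
    return TOY_TARGET_SIM[(s, t)]


# Two toy fingerprints sharing 2 of 4 distinct bits.
structure = tanimoto({1, 2, 3}, {2, 3, 4})
# Drug A has targets T1 and T2; drug B has target T3.
target = best_match_average(["T1", "T2"], ["T3"], toy_sim)

# RowMeans is the plain average of the similarity scores; only two of the
# six scores are computed in this toy example.
row_mean = mean([structure, target])
```

The best-match average keeps a drug with many targets from being penalized simply because most of its targets have no close counterpart in the other drug.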

Drug-Target-Disease-Side effect Network

We provide a force-directed network to help researchers take in this information at a glance. Different colors represent different aspects of the queried drug; blue circles represent similar drugs. The queried drug sits at the center and is drawn as the largest circle. The surrounding circles represent the 20 drugs with the highest "RowMeans" in the "Drug-drug Similarity Prediction" table, that is, the drugs most similar overall to the queried drug. Line widths run from thick to thin as "RowMeans" goes from high to low.
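The selection and styling described above can be sketched as follows. This is only an illustration: the function name, the pixel range for line widths, and the linear scaling are assumptions, not the rendering code used by DrugRepoBank.

```python
def top_similar_edges(row_means, n=20, min_width=1.0, max_width=6.0):
    """Pick the n drugs with the highest RowMeans and assign each a line width.

    row_means maps drug name -> RowMeans score. Widths are scaled linearly
    between min_width and max_width (a hypothetical styling choice).
    """
    ranked = sorted(row_means.items(), key=lambda kv: kv[1], reverse=True)[:n]
    if not ranked:
        return []
    highest, lowest = ranked[0][1], ranked[-1][1]
    span = highest - lowest
    edges = []
    for name, score in ranked:
        # When all selected scores are equal, draw every edge at full width.
        fraction = (score - lowest) / span if span else 1.0
        edges.append((name, score, min_width + fraction * (max_width - min_width)))
    return edges


# Toy scores for three hypothetical drugs, keeping only the top two.
edges = top_similar_edges({"A": 0.9, "B": 0.5, "C": 0.7}, n=2)
```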

Users can click each drug to jump to the webpage for drug details.

Users can click each target to jump to the webpage for target details.

Drug Details

Drug-Target Interaction Prediction Table

DrugRepoBank integrates 3 AI-based methods (CPI_Prediction, TransformerCPI, and CapBM-DTI) to calculate prediction scores and statuses. For each method, the status is "1" if the method predicts an association between the drug and the protein and "0" if it does not. The prediction score ranges from 0 to 1; the closer it is to 1, the more likely the interaction. "Status mean" is the mean of the status values of the 3 AI-based methods, and "Score mean" is the mean of their prediction scores.
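The two summary columns can be reproduced from the per-method values as in the short Python sketch below. The numbers are invented for illustration, and the dictionary layout is an assumption rather than the database's export format.

```python
from statistics import mean


def summarize(predictions):
    """Return (status mean, score mean) over the per-method predictions."""
    status_mean = mean(p["status"] for p in predictions.values())
    score_mean = mean(p["score"] for p in predictions.values())
    return status_mean, score_mean


# Hypothetical predictions for one drug-protein pair.
toy_predictions = {
    "CPI_Prediction": {"status": 1, "score": 0.91},
    "TransformerCPI": {"status": 0, "score": 0.42},
    "CapBM-DTI": {"status": 1, "score": 0.77},
}

status_mean, score_mean = summarize(toy_predictions)
```

Here two of the three methods predict an interaction, so "Status mean" is about 0.67, while "Score mean" is about 0.70.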

You can click a protein name to jump to the details page of that target.

Tsubaki M, Tomii K, Sese J. Compound-protein interaction prediction with end-to-end learning of neural networks for graphs and sequences. Bioinformatics. 2019;35:309-318.[PubMed]

Chen L, Tan X, Wang D et al. TransformerCPI: improving compound-protein interaction prediction by sequence-based deep learning with self-attention mechanism and label reversal experiments. Bioinformatics. 2020;36:4406-4414.[PubMed]

Huang Y, Huang HY, Chen Y, et al. A Robust Drug-Target Interaction Prediction Framework with Capsule Network and Transfer Learning. Int J Mol Sci. 2023;24(18):14061.[PubMed]

Network

DrugRepoBank provides a network to display the association between the drug and proteins. We select the top 10 proteins according to "Score mean". Line widths run from thick to thin as "Score mean" goes from high to low. We further provide the ICD-11 codes of the disease indications of each protein. By binding to these targets, the queried drug may be usable for treating target-related diseases. Hovering the mouse over a disease reveals its name.

Target Search

In the "Target Search" field, users can find target information by searching for a target name, target gene name, UniProt ID, or synonym of a target.

Example: to search for "EGFR", you can enter "EGFR_HUMAN", "Receptor tyrosine-protein kinase erbB-1", "HER1", "Epidermal growth factor receptor (EGFR)", or "EGFR" in the search box (①), and then click the "Search" button (②).

Because multiple targets can share the same gene name, such as "EGFR", entering a target gene name returns a table containing multiple targets. Users can then choose the target they want from this table.

Target Basic Information

Users can browse the basic information about the queried target.

Select different protein structures by clicking the PDB ID buttons, such as "6B3S".

In the "Drug Information" part:

① Users can click "Name" (only contains basic information of drug) or "Drug search" (contains prediction functionalities of drug) to view more information about the drug.

② "Source" provides links to TTD database which contains information about drug-target interactions.

In the "Pathway" part:

Users can choose different pathways, including KEGG Pathway, WikiPathways, PathWhizPathway, Reactome Pathway, NetPath, PANTHER, and Pathway Interaction. For each pathway, you can click the Pathway ID to jump to the original website.

Target-Target Similarity Table

The "Target-target Similarity Prediction" subpage provides a similarity table that compares the queried target with other targets. Similarity is calculated by pairwise sequence alignment.

Needleman SB, Wunsch CD. A general method applicable to the search for similarities in the amino acid sequence of two proteins. J Mol Biol. 1970 Mar;48(3):443-53.[PubMed]

Raghava GP, Barton GJ. Quantification of the variation in percentage identity for protein sequence alignments. BMC Bioinformatics. 2006 Sep 19;7:415.[PubMed]

Wang JZ, Du Z, Payattakool R, Yu PS, Chen CF. A new method to measure the semantic similarity of GO terms. Bioinformatics. 2007 May 15;23(10):1274-81.[PubMed]
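For readers unfamiliar with pairwise alignment, the sketch below shows a minimal Needleman-Wunsch global alignment followed by a percent-identity calculation. The scoring values (match 1, mismatch -1, linear gap -1) and the choice of alignment length as the identity denominator are assumptions made for illustration; DrugRepoBank's actual parameters are not stated here, and real protein alignments normally use a substitution matrix.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align sequences a and b; return the two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    aligned_a, aligned_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            step = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + step:
                aligned_a.append(a[i - 1])
                aligned_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            aligned_a.append(a[i - 1])
            aligned_b.append("-")
            i -= 1
        else:
            aligned_a.append("-")
            aligned_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(aligned_a)), "".join(reversed(aligned_b))


def percent_identity(aligned_a, aligned_b):
    """Identical aligned positions as a percentage of the alignment length."""
    if not aligned_a:
        return 0.0
    matches = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


# Toy sequences, not real protein sequences.
toy_a, toy_b = needleman_wunsch("ACGT", "AGT")
toy_identity = percent_identity(toy_a, toy_b)
```

Percent identity depends on the denominator chosen (alignment length, shorter sequence, and so on), which is the variation examined in the Raghava and Barton reference above.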

Network

We provide a force-directed network to help researchers take in this information at a glance. Drugs that interact with the queried target may bind to targets similar to the queried target, thereby treating diseases related to these targets. Hovering the mouse over a disease or a drug reveals its name.

Target-Drug Interaction Prediction Table

Tsubaki M, Tomii K, Sese J. Compound-protein interaction prediction with end-to-end learning of neural networks for graphs and sequences. Bioinformatics. 2019;35:309-318.[PubMed]

Huang Y, Huang HY, Chen Y, et al. A Robust Drug-Target Interaction Prediction Framework with Capsule Network and Transfer Learning. Int J Mol Sci. 2023;24(18):14061.[PubMed]

Network

DrugRepoBank provides a network to display the association between the target and drugs. We select the top 10 drugs according to "Score mean". Line widths run from thick to thin as "Score mean" goes from high to low. Drugs that interact with the central target may be able to treat diseases related to the central target.

Disease Search

For Disease Search, you can choose the disease type directly from the drop-down menu.

Disease Signature

You can view the basic information about the disease signature, including the number of genes and samples.

There is an interactive volcano plot. Users can hover over each dot to check its logFC, -log10(adj.PVal), and gene name, which makes the up-regulation and down-regulation of genes easy to see.

The table includes the whole gene expression profile of the disease signature.

Explanation of the header:

① logFC: The log fold change. A logFC greater than 0 means the gene is up-regulated; a logFC less than 0 means it is down-regulated.

② AveExpr: The average expression value of the probe across all samples.

③ P value: The P value of the empirical Bayes moderated test.

④ adj.P.value: The P value adjusted for multiple testing, which is a more reliable basis for selecting genes than the raw P value.
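The columns above can be combined to label a gene as up-regulated, down-regulated, or not significant. The Python sketch below uses the cutoffs given in the Terminology section (adjusted P value of 0.05 and fold change of 2) and assumes that logFC is a base-2 logarithm, so a fold change of 2 corresponds to |logFC| of 1. The function name and example values are hypothetical.

```python
import math


def classify_gene(log_fc, adj_p, p_cutoff=0.05, fc_cutoff=2.0):
    """Label a gene from its logFC (assumed log2) and adjusted P value."""
    if adj_p >= p_cutoff or abs(log_fc) < math.log2(fc_cutoff):
        return "not significant"
    return "up" if log_fc > 0 else "down"


# Invented (logFC, adj.P.value) pairs for illustration.
labels = [
    classify_gene(1.5, 0.01),
    classify_gene(-2.0, 0.001),
    classify_gene(0.5, 0.001),
    classify_gene(3.0, 0.2),
]
```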

Enrichment Analysis

Enrichment Analysis includes GO analysis and KEGG analysis, and each type has 3 subpages based on up-regulated genes, down-regulated genes, and combined genes. For the result of pathway analysis, we provide a table at the bottom of the page, a bar chart of the top 20 pathways, and a bubble diagram of the top 10 pathways.

Signature-based drug repositioning

Explanation of the header:

① GSEA weight0 Score: GSEAweight0 weights each step of the running-sum statistic by |FC|^0 when calculating the enrichment score (ES), so the degree of gene expression change is ignored and every gene contributes equally.

② GSEA weight1 Score: GSEAweight1 weights each step of the running-sum statistic by |FC|^1 when calculating the enrichment score (ES), so each gene contributes in proportion to its degree of expression change.

③ GSEA weight2 Score: GSEAweight2 weights each step of the running-sum statistic by |FC|^2 when calculating the enrichment score (ES), which further emphasizes genes with large expression changes.

④ ZhangScore: This method sets rank-based weights to all genes in one gene signature.

⑤ KS Score: A nonparametric, rank-based Kolmogorov-Smirnov (KS) statistic is used to connect disease gene expression signatures to drug expression profiles.

⑥ XSum score: It is calculated based on the top genes ranked by fold changes of gene expression.

Lin K, Li L, Dai Y, Wang H, Teng S, Bao X, Lu ZJ, Wang D. A comprehensive evaluation of connectivity methods for L1000 data. Brief Bioinform. 2020 Dec 1;21(6):2194-2205.[PubMed]
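To make the role of the exponent in the three GSEA scores concrete, the sketch below computes a GSEA-style running-sum enrichment score with a configurable weight p. It is a generic, simplified version written for illustration: gene names and fold changes are invented, and the way DrugRepoBank combines the scores of up-regulated and down-regulated gene sets into a final connectivity score is not shown.

```python
def enrichment_score(ranked_genes, fold_change, gene_set, p=1.0):
    """GSEA-style enrichment score of gene_set in a ranked gene list.

    ranked_genes: genes ordered from most up- to most down-regulated.
    fold_change:  dict gene -> fold-change-like value for ranking.
    p:            exponent applied to |FC| (0, 1, or 2 in the table above).
    """
    hits = [g in gene_set for g in ranked_genes]
    n_miss = len(ranked_genes) - sum(hits)
    weights = [abs(fold_change[g]) ** p if h else 0.0 for g, h in zip(ranked_genes, hits)]
    total = sum(weights)
    if total == 0 or n_miss == 0:
        return 0.0
    running, best = 0.0, 0.0
    for weight, is_hit in zip(weights, hits):
        # Step up at set members (weighted), step down at all other genes.
        running += weight / total if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical drug profile with four genes.
toy_rank = ["g1", "g2", "g3", "g4"]
toy_fc = {"g1": 2.0, "g2": 1.0, "g3": -1.0, "g4": -2.0}

es_top_unweighted = enrichment_score(toy_rank, toy_fc, {"g1", "g2"}, p=0)
es_bottom_unweighted = enrichment_score(toy_rank, toy_fc, {"g3", "g4"}, p=0)
es_top_weighted = enrichment_score(toy_rank, toy_fc, {"g1", "g2"}, p=1)
```

With p = 0 the score reduces to an unweighted KS-like statistic. A gene set concentrated at the top of the ranking gives a positive score and one concentrated at the bottom gives a negative score.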

Drug Signature Details

DRviaSPCN Algorithm

We repurpose drugs via a subpathway crosstalk network using the DRviaSPCN package in R. The method constructs a disease-induced subpathway-subpathway network, then calculates the eigenvector centrality score and p-value of the subpathways induced by each drug to reflect the influence of subpathway crosstalk.

To evaluate drug-disease reverse association and identify candidate drugs, a drug-disease reverse association score is used to reflect how far a drug may treat the disease at the subpathway level. The more negative the score, the greater the extent to which the drug is expected to treat the disease. However, a drug may be applied under different conditions for the same disease. A drug enrichment score (DES) is therefore calculated from the reverse association scores of all instances of the drug. A negative DES indicates that the drug may have a consistent treatment effect on the disease across its instances.
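The eigenvector centrality mentioned above can be illustrated with a small power-iteration sketch in Python. This is a generic implementation on a toy three-node network, not the DRviaSPCN code; the node names, edge weights, and convergence settings are assumptions, and the sketch expects a connected network with non-negative weights.

```python
def eigenvector_centrality(adjacency, max_iter=200, tol=1e-10):
    """Eigenvector centrality by power iteration, scaled so the maximum is 1.

    adjacency maps node -> {neighbor: weight} for an undirected network.
    Each step adds the node's own score (iterating on A + I), which leaves
    the leading eigenvector unchanged but avoids oscillation on bipartite
    networks.
    """
    scores = {node: 1.0 for node in adjacency}
    for _ in range(max_iter):
        updated = {
            node: scores[node] + sum(w * scores[nb] for nb, w in neighbors.items())
            for node, neighbors in adjacency.items()
        }
        largest = max(updated.values())
        updated = {node: value / largest for node, value in updated.items()}
        if max(abs(updated[n] - scores[n]) for n in adjacency) < tol:
            return updated
        scores = updated
    return scores


# Hypothetical subpathway network: one hub connected to two other subpathways.
toy_network = {
    "hub": {"sp1": 1.0, "sp2": 1.0},
    "sp1": {"hub": 1.0},
    "sp2": {"hub": 1.0},
}

centrality = eigenvector_centrality(toy_network)
```

In this toy network the hub receives the highest score, reflecting that a subpathway connected to many other subpathways has more influence on crosstalk.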

You can click a drug name to obtain more details about the drug.

Explanation of the header:

① SubPathID: The ID of the subpathway. You can use the plotSPW function to plot the network graph of the subpathway in R.

② SubPathway: The description of the subpathway.

③ Size: The size of the subpathway.

④ Gene: The genes in the subpathway.

⑤ Centralscore: A score to reflect the influence of subpathway crosstalk.

Wu J, Li X, Wang Q, Han J. DRviaSPCN: a software package for drug repurposing in cancer via a subpathway crosstalk network. Bioinformatics. 2022 Oct 31;38(21):4975-4977.[PubMed]

DrugSim2DR Algorithm

Wu, J., Li, J., He, Y., et al. DrugSim2DR: systematic prediction of drug functional similarities in the context of specific disease for drug repurposing. GigaScience. 2022 Dec 28;12:giad104.[PubMed]

Literature

The "Literature" module contains experimentally validated drug repositioning data manually curated from the PubMed database, with the aim of providing literature support for existing repositioned drugs. It stores experimentally validated repositioned drugs and the corresponding meta-information, including old indication (the original medical condition or disease for which a drug was initially developed and approved), old target (a protein that the drug was originally designed to interact with or modulate for its initial therapeutic purpose), new indication (a different medical condition or therapeutic use for a drug that was originally developed and approved for a specific purpose), new direct target (the novel or alternative molecular target that the drug affects in the context of treating a different disease or medical condition), new indirect target (a component of a pathway or mechanism that may not be the primary intended target of the drug but is relevant to its efficacy in treating a different disease or medical condition), experimental evidence at four distinct levels (in silico experiments, in vitro experiments, in vivo experiments, and clinical trials), and the supporting sentences from the literature.

In the "Literature" module, users can browse the whole literature table. Each record contains key information about one publication. You can click "PMID" to read the original article or click "Details" to view more information that we extracted from it. If you want to learn more about a drug you are interested in, click the drug name.

(1) For experimental biologists, the "Literature" module assists in locating new direct targets, new indirect targets, new diseases, and experimental evidence for existing repositioned drugs, with the aim of providing literature support for these drugs and uncovering patterns in the discovery of repurposed drugs.

(2) For computational biologists, this module provides high-quality raw data and helps them develop more efficient drug repositioning algorithms. Moreover, the literature can be used as source data for text mining or Natural Language Processing (NLP) to extract drug repositioning information.

Details

Case study 1: Identify similar drugs to treat disease through similarity-based drug repositioning

Heparin-induced thrombocytopenia (HIT) is a severe immune-mediated response that substantially amplifies the risk of clotting complications within both arteries and veins. To manage HIT, injectable medications like bivalirudin are commonly employed. The 2018 American Society of Hematology (ASH) guideline suggests incorporating oral anticoagulants to optimize cost-effectiveness and procedural simplicity for HIT [1]. A recent study [2] has demonstrated that dabigatran could be considered a safe and effective agent in the management of HIT. Hence, we used DrugRepoBank to investigate the similarity between dabigatran and bivalirudin as a case study, providing evidence that DrugRepoBank can identify similar drugs for disease treatment through similarity-based drug repositioning (Figure a). We searched for bivalirudin in the "Drug Search" of the "Prediction" module (Figure b). In the drug-drug similarity table presented in the "Similarity-based drug repositioning" module (Figure c), dabigatran etexilate exhibits a high similarity score (row means = 0.841856, ranking = 1st) when compared with bivalirudin, which supports the effectiveness of similarity-based drug repositioning.

[1] Cuker A, Arepally GM, Chong BH et al. American Society of Hematology 2018 guidelines for management of venous thromboembolism: heparin-induced thrombocytopenia. Blood Adv. 2018;2:3360-3392.[PubMed]

[2] Nasiripour S, Saif M, Farasatinasab M et al. Dabigatran as a Treatment Option for Heparin-Induced Thrombocytopenia. The Journal of Clinical Pharmacology. 2019;59:107-111.[PubMed]

Case study 2: Identify new drug targets to treat new diseases through artificial intelligence-based drug repositioning

Sildenafil is a medication for the treatment of penile erectile dysfunction that acts by inhibiting phosphodiesterase type 5 (PDE5), the enzyme that degrades cyclic guanosine monophosphate, in the corpus cavernosum [1]. Current research suggests that sildenafil may serve as a potential agent for the treatment of lung cancer by inhibiting CYP3A4 [2]. To demonstrate the effectiveness of AI-based methods in identifying new drug targets for treating new diseases, we investigated, as a case study, whether sildenafil could bind to CYP3A4 and thus serve as a potential treatment for lung cancer (Figure a). We therefore entered sildenafil into the "Drug Search" of the "Prediction" module. As shown in Figure b, we sorted the results of "AI-based drug repositioning" by "Score mean". We identified CYP3A4 as a potential target protein with a relatively high score (score mean = 0.732, ranking = 4th), showing that the AI-based methods can predict new drug targets and thereby facilitate drug repositioning.

[1] Moreland RB, Goldstein I, Traish A. Sildenafil, a novel inhibitor of phosphodiesterase type 5 in human corpus cavernosum smooth muscle cells. Life Sci. 1998;62:PL309-PL318.[PubMed]

[2] Keats T, Rosengren RJ, Ashton JC. The Rationale for Repurposing Sildenafil for Lung Cancer Treatment. Anticancer Agents Med Chem. 2018;18:367-374.[PubMed]

Case study 3: Discover new drugs for treating diseases through signature-based and network-based drug repositioning

Lung cancer ranks as the third most prevalent form of cancer and stands as the second leading cause of cancer-related fatalities [1]. Therefore, it is essential to explore drug repositioning strategies for treating lung cancer. Panobinostat exhibited a marked reduction in tumor growth, accompanied by inhibition of the cell cycle pathway and an increase in cell cycle regulators like CDKN1A (Figure a1) [2]. Verteporfin has been demonstrated to effectively restrain the proliferation and migration of lung cancer cells through the modulation of the Hippo signaling pathway and insulin secretion (Figure a2) [3]. To discover new drugs for treating diseases through signature-based and network-based drug repositioning, we explored the effectiveness of these two algorithms using the above two examples in DrugRepoBank.

We undertook a drug repositioning study for lung cancer using DrugRepoBank by searching "LUNG" in the "Disease Search" of the "Prediction" module (Figure b). Through bioinformatics analysis (DEG analysis and GO and KEGG pathway enrichment analysis) of the disease signature (Figure c), we found that CDKN1A, a cell cycle regulator, is down-regulated and that the cell cycle pathway is enhanced in lung cancer patients. Employing both signature-based (Figure d1) and network-based approaches (Figure d2) within our database, we identified two potential candidates, panobinostat (Rowmean = -3.4225, ranking = 30th) and verteporfin (DES = -0.9635, ranking = 7th), for the treatment of lung cancer. From the bioinformatics analysis, it can be inferred that panobinostat inhibits lung cancer growth by suppressing the cell cycle pathway and upregulating CDKN1A, which aligns with reported literature [2]. Subpathway analysis indicates that verteporfin may influence lung cancer by modulating the Hippo signaling pathway and insulin secretion, consistent with literature findings [3].

[1] Bade BC, Dela Cruz CS. Lung Cancer 2020: Epidemiology, Etiology, and Prevention. Clin Chest Med. 2020;41:1-24.[PubMed]

[2] Crisanti MC, Wallace AF, Kapoor V et al. The HDAC inhibitor panobinostat (LBH589) inhibits mesothelioma and lung cancer cells in vitro and in vivo with particular efficacy for small cell lung cancer. Mol Cancer Ther. 2009;8:2221-2231.[PubMed]

[3] Wei C, Li X. The Role of Photoactivated and Non-Photoactivated Verteporfin on Tumor. Front Pharmacol. 2020;11.[PubMed]

Terminology

Prediction module: The "Prediction" module aims to achieve the goal of drug repositioning by identifying new relationships between drugs and diseases or drugs and targets through multiple algorithms (similarity-based, artificial intelligence-based, signature-based, and network-based methods) based on 3 search engines (Drug Search, Target Search, and Disease Search).

1. Drug search: It can find similar drugs through similarity-based methods and then predict new targets for similar drugs to treat new diseases. In addition, the AI-based methods can find the drug-target interactions, which can find the known or unknown targets of drugs.

2. Target search: It can find similar targets through similarity-based methods and then predict new targets for the corresponding drugs to treat new diseases. The AI-based methods can find the target-drug interactions, which can identify the known drug of the target.

3. Disease search: It can find disease-drug associations through signature-based and network-based methods. Some of these drugs are known (connected to the "Literature" module or basic information of the drug), while unknown drugs can be found to achieve new use of old drugs.

(1) Disease signature: Disease (cancer) signatures were identified by conducting differential expression analysis for each cancer type between the cancer group and the normal group using the limma package (version 3.56.2). Genes meeting the specified cutoff criteria (an adjusted p-value below 0.05 and a fold change of at least 2) were identified as differentially expressed genes (DEGs).

(2) Signature-based drug repositioning: The hypothesis associated with the signature-based method involves the selection of a drug that has a reversal effect on the disease signature genes. Briefly, if the pattern of the gene expression induced by a drug (drug signatures) is in contrast to that induced by a disease (disease signatures), the drug will demonstrate a therapeutic value for the disease.

① sig id: A CMap unique identification number assigned to each signature generated from L1000 data.

② distil id: An ID of an individual replicate profile, referred to as level 4 / z-score data, that is used in creating the signature from replicates assayed together on an L1000 plate. The signature is referred to as level 5 / aggregated z-score data.

(3) Network-based drug repositioning: The core principle of network-based drug repositioning is that the drug candidate (Drug A) should restore the pathway signature of the disease (Disease A). The pathway signature is constructed by the GSEA (Gene Set Enrichment Analysis) algorithm, and a pathway network is constructed to calculate the centrality scores of each pathway in the network.

① DES: Drug enrichment score (DES) is computed by the Kolmogorov-Smirnov statistic based on the instance set of a given drug. If the instance set of the drug enriches at the negative region of the drug-disease reverse association score list, the DES will be strongly negative, indicating the drug in different instances may have a consistent treatment effect on the disease.

Literature module: The "Literature" module is an experimentally validated drug repositioning database built by manual curation of the PubMed database. It stores experimentally validated repositioned drugs and the corresponding meta-information.

1. Old indication: It refers to the original medical condition or disease for which a drug was initially developed and approved.

2. Old target: It refers to a protein that the drug was originally designed to interact with or modulate for its initial therapeutic purpose.

3. New indication: It refers to a different medical condition or therapeutic use for a drug that was originally developed and approved for a specific purpose.

4. New direct target: The novel or alternative molecular target that the drug affects in the context of treating a different disease or medical condition.

5. New indirect target: A component of a pathway or mechanism that may not be the primary intended target of the drug but is relevant to its efficacy in treating a different disease or medical condition.

For the terms used in the "Literature" module, please refer to this article: https://doi.org/10.36303/SAJAA.2022.28.5.2904