# Bibliography of Computer-Aided Drug Design

Updated on 7/18/2014. Currently 2130 references

## Screening / Reviews

2013 / 2012 / 2011 / 2010 / 2009 / 2008 / 2007 / 2006 / 2004 / 2003 / 2002 / 1998

## 2013

• The future of virtual compound screening.
Heikamp, Kathrin and Bajorath, Jürgen
Chemical biology & drug design, 2013, 81(1), 33-40
PMID: 23253129     doi: 10.1111/cbdd.12054

We provide a future perspective of the virtual screening field. We highlight a number of challenges that virtual screening will likely face as compound data continue to grow at or beyond current rates and as much more target information becomes available. These challenges go beyond computational efficiency issues (that will of course also play a critical role). For example, for structure-based approaches, the accuracy of scoring functions and energy calculations will need to be improved. For ligand-based approaches, the compound class-dependence of similarity methods needs to be further explored and relationships between molecular similarity and activity similarity need to be established. We also comment on the current and future value of virtual screening. Opportunities for further development in a postgenome era are also discussed. It is hoped that some of the views and hypotheses we articulate might stimulate further discussion about the virtual screening field going forward.

• Cheminformatics aspects of high throughput screening: from robots to models: symposium summary
Tseng, Y Jane and Martin, Eric and Bologa, Cristian and Shelat, Anang A
Journal of computer-aided molecular design, 2013, 27(5), 443-453
PMID: 23636795     doi: 10.1007/s10822-013-9646-6

The "Cheminformatics aspects of high throughput screening (HTS): from robots to models" symposium was part of the computers in chemistry technical program at the American Chemical Society National Meeting in Denver, Colorado during the fall of 2011. This symposium brought together researchers from high throughput screening centers and molecular modelers from academia and industry to discuss the integration of currently available high throughput screening data and assays with computational analysis. The topics discussed at this symposium covered the data-infrastructure at various academic, hospital, and National Institutes of Health-funded high throughput screening centers, the cheminformatics and molecular modeling methods used in real world examples to guide screening and hit-finding, and how academic and non-profit organizations can benefit from current high throughput screening cheminformatics resources. Specifically, this article also covers the remarks and discussions in the open panel discussion of the symposium and summarizes the following talks on "Accurate Kinase virtual screening: biochemical, cellular and selectivity", "Selective, privileged and promiscuous chemical patterns in high-throughput screening" and "Visualizing and exploring relationships among HTS hits using network graphs".

## 2012

• Accessible high-throughput virtual screening molecular docking software for students and educators.
Jacob, Reed B. and Andersen, Tim and McDougal, Owen M.
PLoS computational biology, 2012, 8(5), e1002499
PMID: 22693435     doi: 10.1371/journal.pcbi.1002499

We survey low-cost high-throughput virtual screening (HTVS) computer programs for instructors who wish to demonstrate molecular docking in their courses. Since HTVS programs are a useful adjunct to the time-consuming and expensive wet bench experiments necessary to discover new drug therapies, the topic of molecular docking is core to the instruction of biochemistry and molecular biology. The availability of HTVS programs, coupled with decreasing costs and advances in computer hardware, has made computational approaches to drug discovery possible at institutional and non-profit budgets. This paper focuses on HTVS programs with graphical user interfaces (GUIs) that use either DOCK or AutoDock for docking prediction, namely DockoMatic, PyRx, DockingServer, and MOLA, since their utility has been proven by the research community, they are free or affordable, and the programs operate on a range of computer platforms.

• Structure-based drug screening for G-protein-coupled receptors.
Shoichet, Brian K and Kobilka, Brian K
Trends in pharmacological sciences, 2012, 33(5), 268-272
PMID: 22503476     doi: 10.1016/j.tips.2012.03.007

G-protein-coupled receptors (GPCRs) represent a large family of signaling proteins that includes many therapeutic targets; however, progress in identifying new small molecule drugs has been disappointing. The past 4 years have seen remarkable progress in the structural biology of GPCRs, raising the possibility of applying structure-based approaches to GPCR drug discovery efforts. Of the various structure-based approaches that have been applied to soluble protein targets, such as proteases and kinases, in silico docking is among the most readily applicable to GPCRs. Early studies suggest that GPCR binding pockets are well suited to docking, and docking screens have identified potent and novel compounds for these targets. This review will focus on the current state of in silico docking for GPCRs.

• In silico design of small molecules.
Bernardo, Paul H and Tong, Joo Chuan
Methods in molecular biology (Clifton, N.J.), 2012, 800, 25-31
PMID: 21964780     doi: 10.1007/978-1-61779-349-3_3

Computational methods now play an integral role in modern drug discovery, and include the design and management of small molecule libraries, initial hit identification through virtual screening, optimization of the affinity and selectivity of hits, and improving the physicochemical properties of the lead compounds. In this chapter, we survey the most important data sources for the discovery of new molecular entities, and discuss the key considerations and guidelines for virtual chemical library design.

• Recent Trends and Applications in 3D Virtual Screening.
Ghemtio, Léo and Pérez-Nueno, Violeta I and Leroux, Vincent and Asses, Yasmine and Souchet, Michel and Mavridis, Lazaros and Maigret, Bernard and Ritchie, David W
Combinatorial chemistry & high throughput screening, 2012, 15(9), 749-769
PMID: 22934947

Virtual screening (VS) is becoming an increasingly important approach for identifying and selecting biologically active molecules against specific pharmaceutically relevant targets. Compared to conventional high throughput screening techniques, in silico screening is fast and inexpensive, and is increasing in popularity in early-stage drug discovery endeavours. This paper reviews and discusses recent trends and developments in three-dimensional (3D) receptor-based and ligand-based VS methodologies. First, we describe the concept of accessible chemical space and its exploration. We then describe 3D structural ligand-based VS techniques, hybrid approaches, and new approaches to exploit additional knowledge that can now be found in large chemogenomic databases. We also briefly discuss some potential issues relating to pharmacokinetics, toxicity profiling, target identification and validation, inverse docking, scaffold-hopping and drug re-purposing. We propose that the best way to advance the state of the art in 3D VS is to integrate complementary strategies in a single drug discovery pipeline, rather than to focus only on theoretical or computational improvements of individual techniques. Two recent 3D VS case studies concerning the LXR-β receptor and the CCR5/CXCR4 HIV co-receptors are presented as examples, which implement some of the complementary methods and strategies that are reviewed here.

• Structure-based virtual screening for drug discovery: a problem-centric review.
Cheng, Tiejun and Li, Qingliang and Zhou, Zhigang and Wang, Yanli and Bryant, Stephen H
The AAPS journal, 2012, 14(1), 133-141
PMID: 22281989     doi: 10.1208/s12248-012-9322-0

Structure-based virtual screening (SBVS) has been widely applied in early-stage drug discovery. From a problem-centric perspective, we reviewed the recent advances and applications in SBVS with a special focus on docking-based virtual screening. We emphasized the researchers' practical efforts in real projects by understanding the ligand-target binding interactions as a premise. We also highlighted the recent progress in developing target-biased scoring functions by optimizing current generic scoring functions toward certain target classes, as well as in developing novel ones by means of machine learning techniques.

• Recognizing Pitfalls in Virtual Screening: A Critical Review.
Scior, Thomas and Bender, Andreas and Tresadern, Gary and Medina-Franco, José L and Martínez-Mayorga, Karina and Langer, Thierry and Cuanalo-Contreras, Karina and Agrafiotis, Dimitris K
Journal of chemical information and modeling, 2012, 52(4), 867-881
PMID: 22435959     doi: 10.1021/ci200528d

The aim of virtual screening (VS) is to identify bioactive compounds through computational means, by employing knowledge about the protein target (structure-based VS) or known bioactive ligands (ligand-based VS). In VS, a large number of molecules are ranked according to their likelihood to be bioactive compounds, with the aim to enrich the top fraction of the resulting list (which can be tested in bioassays afterward). At its core, VS attempts to improve the odds of identifying bioactive molecules by maximizing the true positive rate, that is, by ranking the truly active molecules as high as possible (and, correspondingly, the truly inactive ones as low as possible). In choosing the right approach, the researcher is faced with many questions: where does the optimal balance between efficiency and accuracy lie when evaluating a particular algorithm; do some methods perform better than others and in what particular situations; and what do retrospective results tell us about the prospective utility of a particular method? Given the multitude of settings, parameters, and data sets the practitioner can choose from, there are many pitfalls that lurk along the way which might render VS less efficient or downright useless. This review attempts to catalogue published and unpublished problems, shortcomings, failures, and technical traps of VS methods with the aim to avoid pitfalls by making the user aware of them in the first place.
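
The ranking-and-enrichment idea described in this abstract can be illustrated with a small sketch. The function below is not taken from the review; `enrichment_factor`, its parameters, and the toy data in the usage note are hypothetical names chosen for illustration. It assumes that higher scores mean "predicted more active" and that labels are 1 for actives and 0 for inactives.

```python
def enrichment_factor(scores, labels, fraction=0.01):
    """Enrichment factor of a ranked list at a given top fraction.

    Assumptions (illustrative only): higher score = predicted more active,
    labels are 1 (active) or 0 (inactive).
    """
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and equal in length")
    total_actives = sum(labels)
    if total_actives == 0:
        raise ValueError("at least one active compound is required")

    # Number of compounds in the selected top fraction (at least one).
    n_top = max(1, round(len(scores) * fraction))

    # Rank compounds from best to worst score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits_in_top = sum(label for _, label in ranked[:n_top])

    # Hit rate in the top fraction divided by the hit rate of the whole set.
    return (hits_in_top / n_top) / (total_actives / len(scores))
```

For example, with ten compounds of which the two best-scored are the only actives, `enrichment_factor(scores, labels, fraction=0.2)` gives 5.0: the top 20% is five times richer in actives than the library as a whole. A value near 1 would indicate a ranking no better than random selection.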

• Flexibility and binding affinity in protein-ligand, protein-protein and multi-component protein interactions: limitations of current computational approaches.
Tuffery, Pierre and Derreumaux, Philippe
Journal of the Royal Society, Interface / the Royal Society, 2012, 9(66), 20-33
PMID: 21993006     doi: 10.1098/rsif.2011.0584

The recognition process between a protein and a partner represents a significant theoretical challenge. In silico structure-based drug design carried out with nothing more than the three-dimensional structure of the protein has led to the introduction of many compounds into clinical trials and numerous drug approvals. Central to guiding the discovery process is the ability to distinguish active from inactive compounds. While large-scale computer simulations of compounds taken from a library (virtual screening) or designed de novo are highly desirable in the post-genomic era, many technical problems remain to be adequately addressed. This article presents an overview and discusses the limits of current computational methods for predicting the correct binding pose and accurate binding affinity. It also presents the performances of the most popular algorithms for exploring binary and multi-body protein interactions.

## 2011

• Pharmacophore-based virtual screening.
Horvath, Dragos
Methods in molecular biology (Clifton, N.J.), 2011, 672, 261-298
PMID: 20838973     doi: 10.1007/978-1-60761-839-3_11

This chapter is a review of the most recent developments in the field of pharmacophore modeling, covering both methodology and application. Pharmacophore-based virtual screening is nowadays a mature technology, very well accepted in the medicinal chemistry laboratory. Nevertheless, like any empirical approach, it has specific limitations and efforts to improve the methodology are still ongoing. Fundamentally, the core idea of "stripping" functional groups of their actual chemical nature in order to classify them into very few pharmacophore types, according to their dominant physico-chemical features, is both the main advantage and the main drawback of pharmacophore modeling. The advantage is one of simplicity - the complex nature of noncovalent ligand binding interactions is rendered intuitive and comprehensible by the human mind. Although computers are much better suited for comparisons of pharmacophore patterns, a chemist's intuition is primarily scaffold-oriented. Its underlying simplifications render pharmacophore modeling unable to provide perfect predictions of ligand binding propensities, even if all its remaining technical problems were solved. Each step in pharmacophore modeling and exploitation has specific drawbacks: from insufficient or inaccurate conformational sampling to ambiguities in pharmacophore typing (mainly due to uncertainty regarding the tautomeric/protonation status of compounds), to computer time limitations in complex molecular overlay calculations, and to the choice of inappropriate anchoring points in active sites when ligand cocrystal structures are not available. Yet, imperfections notwithstanding, the approach is accurate enough to be practically useful and is in fact the most used virtual screening technique in medicinal chemistry - notably for "scaffold hopping" approaches, allowing the discovery of new chemical classes that carry a desired biological activity.

• Graph-based similarity concepts in virtual screening.
Hutter, Michael C
Future medicinal chemistry, 2011, 3(4), 485-501
PMID: 21452983     doi: 10.4155/fmc.11.3

Applying similarity for finding new promising compounds is a key issue in drug design. Conversely, quantifying similarity between molecules has remained a difficult task despite the numerous approaches. Here, some general aspects along with recent developments regarding similarity criteria are collected. For the purpose of virtual screening, the compounds have to be encoded into a computer-readable format that permits a comparison, according to given similarity criteria, comprising the use of the 3D structure, fingerprints, graph-based and alignment-based approaches. Whereas finding the most common substructures is the most obvious method, more recent approaches take into account chemical modifications that appear throughout existing drugs, from various therapeutic categories and targets.

## 2010

• Modeling approaches for ligand-based 3D similarity.
Future medicinal chemistry, 2010, 2(10), 1547-1561
PMID: 21426148     doi: 10.4155/fmc.10.244

3D ligand-based similarity approaches are widely used in the early phases of drug discovery for tasks such as hit finding by virtual screening or compound design with quantitative structure-activity relationships. Herein we review widely used software for performing such tasks. Some techniques are based on relatively mature technology, shape-based similarity for instance. Typically, these methods remained in the realm of the expert user, the experienced modeler. However, advances in implementation and speed have improved usability and allow these methods to be applied to databases comprising millions of compounds. There are now many reports of such methods impacting drug-discovery projects. As such, the medicinal chemistry community has become the intended market for some of these new tools, yet they may consider the wide array and choice of approaches somewhat disconcerting. Each method has subtle differences and is better suited to certain tasks than others. In this article we review some of the widely used computational methods via application, provide straightforward background on the underlying theory and provide examples for the interested reader to pursue in more detail. In the new era of preclinical drug discovery there will be ever more pressure to move faster and more efficiently, and computational approaches based on 3D ligand similarity will play an increasing role in this process.

• Molecular shape technologies in drug discovery: methods and applications.
Ebalunode, Jerry O and Zheng, Weifan
Current topics in medicinal chemistry, 2010, 10(6), 669-679
PMID: 20337591

Shape complementarity is a critically important factor in molecular recognition among drugs and their biological receptors. The notion that molecules with similar 3D shapes tend to have similar biological activity has been recognized and implemented in computational drug discovery tools for decades. But the low computational efficiency and the lack of widely accessible software tools limited the use of early shape-matching algorithms. However, recent development of fast and accurate shape comparison tools has changed the landscape, and facilitated the widespread use of both the ligand-based and receptor-based shape-matching technologies in drug discovery. In this article, we summarize some of the well-known shape algorithms. We first describe the computational principles for both the superposition-based and the superposition-free shape-matching methods. These include ROCS (Rapid Overlay of Compound Structures), SQ, and the CatShape method in the former category; and the shape signatures algorithm and USR (Ultrafast Shape Recognition) that belong to the latter category. We then highlight some recent validation studies and practical applications of various shape technologies. Because of the rapid development of modern shape-matching algorithms, and the increasingly affordable computational resources and software tools, we anticipate much broader use of the molecular shape technologies in future drug discovery. They will be especially useful in chemogenomics research, where large scale associations between small molecules and protein targets are studied. Thus, molecular shape technologies, together with well-defined pharmacophore constraints, can afford both efficient and effective means for drug discovery and chemical genomics research.

• Library screening by fragment-based docking.
Huang, Danzhi and Caflisch, Amedeo
Journal of molecular recognition : JMR, 2010, 23(2), 183-193
PMID: 19718684     doi: 10.1002/jmr.981

We review our computational tools for high-throughput screening by fragment-based docking of large collections of small molecules. Applications to six different enzymes (four proteases and two protein kinases) are presented. Remarkably, several low-micromolar inhibitors were discovered in each of the high-throughput docking campaigns. Probable reasons for the lack of submicromolar inhibitors are the tiny fraction of chemical space covered by the libraries of available compounds, as well as the approximations in the methods employed for scoring, and the use of a rigid conformation of the target protein.

• Virtual Screening with AutoDock: Theory and Practice.
Cosconati, Sandro and Forli, Stefano and Perryman, Alex L and Harris, Rodney and Goodsell, David S and Olson, Arthur J
Expert opinion on drug discovery, 2010, 5(6), 597-607
PMID: 21532931     doi: 10.1517/17460441.2010.484460

IMPORTANCE TO THE FIELD: Virtual screening is a computer-based technique for identifying promising compounds to bind to a target molecule of known structure. Given the rapidly increasing number of protein and nucleic acid structures, virtual screening continues to grow as an effective method for the discovery of new inhibitors and drug molecules. AREAS COVERED IN THIS REVIEW: We describe virtual screening methods that are available in the AutoDock suite of programs, and several of our successes in using AutoDock virtual screening in pharmaceutical lead discovery. WHAT THE READER WILL GAIN: A general overview of the challenges of virtual screening is presented, along with the tools available in the AutoDock suite of programs for addressing these challenges. TAKE HOME MESSAGE: Virtual screening is an effective tool for the discovery of compounds for use as leads in drug discovery, and the free, open source program AutoDock is an effective tool for virtual screening.

• Advances in 2D fingerprint similarity searching.
Geppert, Hanna and Bajorath, Jürgen
Expert opinion on drug discovery, 2010, 5(6), 529-542
PMID: 22823165     doi: 10.1517/17460441.2010.486830

Importance to the field: Similarity searching is one of the premier approaches for computational hit identification. Fingerprints are bit string representations of molecular structure and properties and rather simplistic search tools. Nevertheless, they are widely used and often surprisingly successful in drug discovery applications. Areas covered in this review: Herein we discuss recent research efforts that have helped to better understand fingerprint search performance, design new fingerprints and search strategies, or modify standard fingerprints for specific applications. Key publications of the past ∼ 20 years are covered and major emphasis is put on reviewing fingerprint studies published during the past 5 years. What the reader will gain: The reader is provided with an overview of the state-of-the-art fingerprint design and search strategies developed. It will be possible to rationalize opportunities and limitations of 2D fingerprint similarity searching. Take home messages: Fingerprint search calculations are more complex than it might appear at first glance and susceptible to complications that are often overlooked in practical applications. Fingerprint search performance typically only depends on relatively small subsets of bit positions. Recently, different fingerprint engineering strategies have been applied to 'tune' existing fingerprints in a compound class-directed manner. Fingerprints have substantial scaffold hopping potential, despite the simplicity of their design.
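
As a rough illustration of fingerprint similarity searching as described in this abstract, the sketch below compares bit-string fingerprints with the Tanimoto coefficient, a commonly used similarity measure for this purpose. It is not taken from the review: fingerprints are represented simply as sets of "on" bit positions, and the names `tanimoto` and `similarity_search` and the toy library in the usage note are hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints.

    Each fingerprint is given as an iterable of the bit positions set to 1
    (an illustrative representation, not a specific toolkit's format).
    """
    a, b = set(fp_a), set(fp_b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    # Shared "on" bits divided by all "on" bits in either fingerprint.
    return len(a & b) / len(a | b)


def similarity_search(query_fp, library, top_n=3):
    """Rank library compounds by Tanimoto similarity to the query.

    `library` maps a compound name to its fingerprint (set of on-bits).
    Returns the top_n (name, similarity) pairs, most similar first.
    """
    scored = [(name, tanimoto(query_fp, fp)) for name, fp in library.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]
```

For instance, `tanimoto({1, 2, 3}, {2, 3, 4})` is 0.5, because two bits are shared out of four bits set in total. Calling `similarity_search` with a query fingerprint and a small dictionary of compounds returns the compounds ordered by decreasing similarity.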

• Computational methodologies for compound database searching that utilize experimental protein-ligand interaction information.
Tan, Lu and Batista, José and Bajorath, Jürgen
Chemical biology & drug design, 2010, 76(3), 191-200
PMID: 20636330     doi: 10.1111/j.1747-0285.2010.01007.x

Ligand- and target structure-based methods are widely used in virtual screening, but there is currently no methodology available that fully integrates these different approaches. Herein, we provide an overview of various attempts that have been made to combine ligand- and structure-based computational screening methods. We then review different types of approaches that utilize protein-ligand interaction information for database screening and filtering. Interaction-based approaches make use of a variety of methodological concepts including pharmacophore modeling and direct or indirect encoding of protein-ligand interactions in fingerprint formats. These interaction-based methods have been successfully applied to tackle different tasks related to virtual screening including postprocessing of docking poses, prioritization of binding modes, selectivity analysis, or similarity searching. Furthermore, we discuss the recently developed interacting fragment approach that indirectly incorporates 3D interaction information into 2D similarity searching and bridges between ligand- and structure-based methods.

• Quo vadis, virtual screening? A comprehensive survey of prospective applications.
Ripphausen, Peter and Nisius, Britta and Peltason, Lisa and Bajorath, Jürgen
Journal of medicinal chemistry, 2010, 53(24), 8461-8467
PMID: 20929257     doi: 10.1021/jm101020z

• Current trends in ligand-based virtual screening: molecular representations, data mining methods, new application areas, and performance evaluation.
Geppert, Hanna and Vogt, Martin and Bajorath, Jürgen
Journal of chemical information and modeling, 2010, 50(2), 205-216
PMID: 20088575     doi: 10.1021/ci900419k

• Virtual screening: an endless staircase?
Schneider, Gisbert
Nature reviews. Drug discovery, 2010, 9(4), 273-276
PMID: 20357802     doi: 10.1038/nrd3139

Computational chemistry - in particular, virtual screening - can provide valuable contributions in hit- and lead-compound discovery. Numerous software tools have been developed for this purpose. However, despite the applicability of virtual screening technology being well established, it seems that there are relatively few examples of drug discovery projects in which virtual screening has been the key contributor. Has virtual screening reached its peak? If not, what aspects are limiting its potential at present, and how can significant progress be made in the future?

## 2009

• Docking-based virtual screening: recent developments.
Tuccinardi, Tiziano
Combinatorial chemistry & high throughput screening, 2009, 12(3), 303-314
PMID: 19275536

Virtual (database) screening (VS) of molecules promises to accelerate the discovery of new drugs and reduce costs by identifying molecules with high probabilities of binding to a target receptor. The large amount of available protein X-ray crystal structures, together with the development of more effective homology modelling techniques, has led recently to a steep increase in docking-based VS studies. This approach needs computational fitting of molecules into a receptor active site using advanced algorithms, followed by the scoring and ranking of these molecules to identify potential leads. In this review, the main published docking-based VS studies developed over the last eight years are investigated, and details are provided about the software used, the results achieved and the novel methods employed.

• Docking Screens: Right for the Right Reasons?
Kolb, Peter and Irwin, John J
Current topics in medicinal chemistry, 2009, 9(9), 755-770

Whereas docking screens have emerged as the most practical way to use protein structure for ligand discovery, an inconsistent track record raises questions about how well docking actually works. In its favor, a growing number of publications report the successful discovery of new ligands, often supported by experimental affinity data and controls for artifacts. Few reports, however, actually test the underlying structural hypotheses that docking makes. To be successful and not just lucky, prospective docking must not only rank a true ligand among the top scoring compounds, it must also correctly orient the ligand so the score it receives is biophysically sound. If the correct binding pose is not predicted, a skeptic might well infer that the discovery was serendipitous. Surveying over 15 years of the docking literature, we were surprised to discover how rarely sufficient evidence is presented to establish whether docking actually worked for the right reasons. The paucity of experimental tests of theoretically predicted poses undermines confidence in a technique that has otherwise become widely accepted. Of course, solving a crystal structure is not always possible, and even when it is, it can be a lot of work, and is not readily accessible to all groups. Even when a structure can be determined, investigators may prefer to gloss over an erroneous structural prediction to better focus on their discovery. Still, the absence of a direct test of theory by experiment is a loss for method developers seeking to understand and improve docking methods. We hope this review will motivate investigators to solve structures and compare them with their predictions whenever possible, to advance the field.

• Docking and chemoinformatic screens for new ligands and targets
Kolb, Peter and Ferreira, Rafaela S and Irwin, John J and Shoichet, Brian K
Current Opinion in Biotechnology, 2009, 20(4), 429-436
doi: 10.1016/j.copbio.2009.08.003

... rate of 24% [19 * ] (Figure 3). Intriguingly, five of these were inverse agonists, as was the ligand bound in the X-ray structure, carazolol, against which the screen occurred. ... This is borne out in a community-wide, blind assessment (GPCR Dock 2008 [41]) of the prediction of the ...

• Structure-based drug screening and ligand-based drug screening with machine learning.
Fukunishi, Yoshifumi
Combinatorial chemistry & high throughput screening, 2009, 12(4), 397-408
PMID: 19442067

The initial stage of drug development is the hit (active) compound search from a pool of millions of compounds; for this process, in silico (virtual) screening has been successfully applied. One of the problems of in silico screening, however, is the low hit ratio in relation to the high computational cost and the long CPU time. This problem becomes serious in structure-based in silico screening. The major reason is the low accuracy of the estimation of protein-compound binding free energy. The problem of ligand-based in silico screening is that the conventional quantitative structure-activity relationship (QSAR) approach is not effective at predicting new hit compounds with new scaffolds. Recently, machine-learning approaches have been applied to in silico drug screening to overcome the above problems. We review here machine-learning approaches for both structure-based and ligand-based drug screening. Machine learning is used to improve database enrichment in two ways, namely by improving the docking score calculated by the protein-compound docking program and by calculating the optimal distance between the feature vectors of active and inactive compounds. Both approaches require compounds that are known to be active with respect to the target protein. In structure-based screening, the former approach is mainly used with a protein-compound affinity matrix. In ligand-based screening, both the former and latter approaches are used, and the latter approach can be applied to various kinds of descriptors, such as 1D/2D descriptors/fingerprints and the affinity fingerprint given by the protein-compound affinity matrix.

• Performance of machine learning methods for ligand-based virtual screening.
Plewczynski, Dariusz and Spieser, Stéphane A H and Koch, Uwe
Combinatorial chemistry & high throughput screening, 2009, 12(4), 358-368
PMID: 19442065

Computational screening of compound databases has become increasingly popular in pharmaceutical research. This review focuses on the evaluation of ligand-based virtual screening using active compounds as templates in the context of drug discovery. Ligand-based screening techniques are based on comparative molecular similarity analysis of compounds with known and unknown activity. We provide an overview of publications that have evaluated different machine learning methods, such as support vector machines, decision trees, ensemble methods such as boosting, bagging and random forests, clustering methods, neural networks, naïve Bayesian classifiers, data fusion methods and others.

• Structure-Based Virtual Ligand Screening: Recent Success Stories
Villoutreix, Bruno O. and Eudes, Richard and Miteva, Maria A.
Combinatorial chemistry & high throughput screening, 2009, 12(10), 1000-1016
doi: 10.2174/138620709789824682

Today, computational methods are commonly used in all areas of health science research. Among these methods, virtual ligand screening has become an established technique for hit discovery and optimization. In this review, we first introduce structure-based virtual ligand screening and briefly comment on compound collections and target preparations. We also provide the readers with a list of resources, from chemoinformatics packages to compound collections, which could be helpful to implement a structure-based virtual screening platform. Then we discuss seventeen recent success stories obtained with various receptor-based in silico methods, performed on experimental structures (X-ray crystallography, 12 cases) or homology models (5 cases) and concerning different target classes, from the design of catalytic site inhibitors to drug-like compounds impeding macromolecular interactions. In light of these results, some suggestions are made about areas that present opportunities for improvements.

• Docking, virtual high throughput screening and in silico fragment-based drug design.
Zoete, Vincent and Grosdidier, Aurélien and Michielin, Olivier
Journal of cellular and molecular medicine, 2009, 13(2), 238-248
PMID: 19183238     doi: 10.1111/j.1582-4934.2008.00665.x

The drug discovery process has been profoundly changed recently by the adoption of computational methods helping the design of new drug candidates more rapidly and at lower costs. In silico drug design consists of a collection of tools helping to make rational decisions at the different steps of the drug discovery process, such as the identification of a biomolecular target of therapeutical interest, the selection or the design of new lead compounds and their modification to obtain better affinities, as well as pharmacokinetic and pharmacodynamic properties. Among the different tools available, a particular emphasis is placed in this review on molecular docking, virtual high-throughput screening and fragment-based ligand design.

## 2008

• Evaluation of the performance of 3D virtual screening protocols: RMSD comparisons, enrichment assessments, and decoy selection-what can we learn from earlier mistakes?
Kirchmair, Johannes and Markt, Patrick and Distinto, Simona and Wolber, Gerhard and Langer, Thierry
Journal of computer-aided molecular design, 2008, 22(3-4), 213-228
PMID: 18196462     doi: 10.1007/s10822-007-9163-6

Within the last few years a considerable amount of evaluative studies has been published that investigate the performance of 3D virtual screening approaches. Thereby, in particular assessments of protein-ligand docking are facing remarkable interest in the scientific community. However, comparing virtual screening approaches is a non-trivial task. Several publications, especially in the field of molecular docking, suffer from shortcomings that are likely to affect the significance of the results considerably. These quality issues often arise from poor study design, biasing, by using improper or inexpressive enrichment descriptors, and from errors in interpretation of the data output. In this review we analyze recent literature evaluating 3D virtual screening methods, with focus on molecular docking. We highlight problematic issues and provide guidelines on how to improve the quality of computational studies. Since 3D virtual screening protocols are in general assessed by their ability to discriminate between active and inactive compounds, we summarize the impact of the composition and preparation of test sets on the outcome of evaluations. Moreover, we investigate the significance of both classic enrichment parameters and advanced descriptors for the performance of 3D virtual screening methods. Furthermore, we review the significance and suitability of RMSD as a measure for the accuracy of protein-ligand docking algorithms and of conformational space sub sampling algorithms.

• Ligand-based approaches in virtual screening
Douguet, Dominique
Current computer-aided drug design, 2008, 4(3), 180-190

Although there are many more receptor structures than there were in the 1970s and 1980s, drug discovery remains dominated by empirical screening and substrate-based drug design. Computer-aided drug design methods have become value-adding disciplines that now contribute to the early stage of the drug discovery process [1, 2]. Computational methods encompass all aspects of drug discovery from target assessment to lead optimization. The computational strategy varies from case to case and can be influenced by several situational variables: lead hunting or lead optimization, requirement for a novel lead class, type of biological assay, structural information available, known classes of ligands, allocated chemistry resources. Today, drug discovery is still a complex and approximate science. Thus, incorporating knowledge-based approaches like ligand-based screenings may bias the process towards success. This review describes these strategies with practical applications and presents future perspectives of ligand-based screening.

• Virtual screening for the discovery of bioactive natural products.
Rollinger, Judith M and Stuppner, Hermann and Langer, Thierry
Progress in drug research. Fortschritte der Arzneimittelforschung. Progrès des recherches pharmaceutiques, 2008, 65, 211, 213-249
PMID: 18084917

In this survey the impact of the virtual screening concept is discussed in the field of drug discovery from nature. Confronted by a steadily increasing number of secondary metabolites and a growing number of molecular targets relevant in the therapy of human disorders, the huge amount of information needs to be handled. Virtual screening filtering experiments already showed great promise for dealing with large libraries of potential bioactive molecules. It can be utilized for browsing databases for molecules fitting either an established pharmacophore model or a three dimensional (3D) structure of a macromolecular target. However, for the discovery of natural lead candidates the application of this in silico tool has so far almost been neglected. There are several reasons for that. One concerns the scarce availability of natural product (NP) 3D databases in contrast to synthetic libraries; another reason is the problematic compatibility of NPs with modern robotized high throughput screening (HTS) technologies. Further arguments deal with the incalculable availability of pure natural compounds and their often too complex chemistry. Thus research in this field is time-consuming, highly complex, expensive and ineffective. Nevertheless, naturally derived compounds are among the most favorable source of drug candidates. A more rational and economic search for new lead structures from nature must therefore be a priority in order to overcome these problems. Here we demonstrate some basic principles, requirements and limitations of virtual screening strategies and support their applicability in NP research with already performed studies. A sensible exploitation of the molecular diversity of secondary metabolites however asks for virtual screening concepts that are interfaced with well-established strategies from classical pharmacognosy that are used in an effort to maximize their efficacy in drug discovery. Such integrated virtual screening workflows are outlined here and shall help to motivate NP researchers to dare a step towards this powerful in silico tool.

• Synergies of Virtual Screening Approaches
Muegge, Ingo
Mini Reviews in Medicinal Chemistry, 2008, 8(9), 927-933
doi: 10.2174/138955708785132792

Virtual screening is a knowledge driven approach. Therefore, synergies between different virtual screening methods using information about the drug target as well as about known ligands in combination promise the best results. Finding novel active scaffolds is often a more important success criterion than hit rates of virtual screens. Novelty should also be considered in balance with often weaker activities of virtual screening hits. Virtual screening is most effective if performed in iterations following up on weak primary hits of interest through testing of structural analogs and additional synthesis of compounds.

• Essential factors for successful virtual screening
Seifert, MHJ
Mini Reviews in Medicinal Chemistry, 2008, 8(1), 63-72

Virtual high-throughput screening (vHTS) is a powerful technique for identifying hit molecules as starting points for medicinal chemistry. Numerous successful applications of vHTS have been published using a large variety of methodologies. This review attempts to identify the essential factors for successful virtual screening in the hit identification phase.

• Towards improving compound selection in structure-based virtual screening
Waszkowycz, B
Drug discovery today, 2008, 13(5/6), 219-226

Structure-based virtual screening is now an established technology for supporting hit finding and lead optimisation in drug discovery. Recent validation studies have highlighted the poor performance of currently used scoring functions in estimating binding affinity and hence in ranking large datasets of docked ligands. Progress in the analysis of large datasets can be made through the use of appropriate data mining techniques and the derivation of a broader range of descriptors relevant to receptor-ligand binding. In addition, simple scoring functions can be supplemented by simulation-based scoring protocols. Developments in workflow design allow the automation of repetitive tasks, and also encourage the routine use of simulation-based methods and the rapid prototyping of novel modelling and analysis procedures.

## 2007

• Molecular similarity analysis in virtual screening: foundations, limitations and novel approaches.
Eckert, Hanna and Bajorath, Jürgen
Drug discovery today, 2007, 12(5-6), 225-233
PMID: 17331887     doi: 10.1016/j.drudis.2007.01.011

The success of ligand-based virtual-screening calculations is influenced highly by the nature of target-specific structure-activity relationships. This might pose severe constraints on the ability to recognize diverse structures with similar activity. Accordingly, the performance of similarity-based methods strongly depends on the class of compound that is studied, and approaches of different design and complexity often produce, overall, equally good (or bad) results. However, it is also found that there is often little overlap in the similarity relationships detected by different approaches, which rationalizes the need to develop alternative similarity methods. Among others, these include novel algorithms to navigate high-dimensional chemical spaces, train similarity calculations on specific compound classes, and detect remote similarity relationships.

• Processing of small molecule databases for automated docking.
Cummings, Maxwell D and Gibbs, Alan C and Desjarlais, Renee L
Medicinal chemistry, 2007, 3(1), 107-113
PMID: 17266630

Virtual screening involves the mining of small molecule databases from various sources. The small molecule databases used in virtual screening are typically processed, from simple 2D representations, to maximize their information content and to optimize them for input to the particular virtual screening technology being used. Processing interprets or adds molecular information related to connectivity, stereochemistry, protonation, tautomers and conformation. For virtual screening with an automated docking protocol, a technique that relies on specific intermolecular atom-atom contacts for ranking molecules, it is expected that the pre-processing protocol can affect the results of the docking experiment. The possible effects of processing on docking results have not been extensively studied, and this topic has only recently emerged as a significant aspect of the docking-based virtual screening process. One recent report highlights significant effects of different processing procedures on docking enrichment, while another outlines a general ligand preparation strategy. Here we survey and comment on recent practice in the field.

• Shapes of things: computer modeling of molecular shape in drug discovery.
Putta, Santosh and Beroza, Paul
Current topics in medicinal chemistry, 2007, 7(15), 1514-1524
PMID: 17897038

We review recent advances in computer modeling of molecular shape in drug discovery. We summarize the ways of representing shape computationally, discuss the various means of aligning molecules and shapes, consider the various ways of scoring similarity of shapes, and describe the ways in which these shapes can be used to construct molecular descriptors. Finally, we evaluate the success of these methods to date, suggest when they are best applied, and provide our recommendations for the direction of future work.

• Ligand docking and structure-based virtual screening in drug discovery.
Cavasotto, Claudio N and Orry, Andrew J W
Current topics in medicinal chemistry, 2007, 7(10), 1006-1014
PMID: 17508934

Ligand-docking-based methods are starting to play a critical role in lead discovery and optimization, thus resulting in new 'drug-candidates'. They offer the possibility to go beyond the pool of existing active compounds, and thus find novel chemotypes. A brief tutorial on ligand docking and structure-based virtual screening is presented highlighting current problems and limitations, together with the most recent methodological and algorithmic developments in the field. Recent successful applications of docking-based tools for hit discovery, lead optimization and target-biased library design are also presented. Special consideration is devoted to ongoing efforts to account for protein flexibility in structure-based virtual screening.

• Virtual screening in drug discovery - a computational perspective.
A Srinivas Reddy and S Priyadarshini Pati and P Praveen Kumar and H.N. Pradeep and G Narahari Sastry
Current Protein & Peptide Science, 2007, 8(4), 329-351
PMID: 17696867     doi: 10.2174/138920307781369427

Virtual screening emerged as an important tool in our quest to access novel drug-like compounds. There are a wide range of comparable and contrasting methodological protocols available in screening databases for the lead compounds. The number of methods and software packages which employ the target and ligand based virtual screening are increasing at a rapid pace. However, the general understanding on the applicability and limitations of these methodologies is not emerging as fast as the developments of various methods. Therefore, it is extremely important to compare and contrast various protocols with practical examples to gauge the strength and applicability of various methods. The review provides a comprehensive appraisal on several of the available virtual screening methods to date. Recent developments of the docking and similarity based methods have been discussed besides the descriptor selection and pharmacophore based searching. The review touches upon the application of statistical, graph theory based methods and machine learning tools in virtual screening and combinatorial library design. Finally, several case studies are undertaken where the virtual screening technology has been applied successfully. A critical analysis of these case studies provides a good platform to estimate the applicability of various virtual screening methods in the new lead identification and optimization.

• Virtual screening strategies in drug discovery
McInnes, C
Current opinion in chemical biology, 2007, 11, 494-502

The identification of novel therapeutic targets and characterization of their 3D structures is increasing at a dramatic rate. Computational screening methods continue to be developed and improved as credible and complementary alternatives to high-throughput biochemical compound screening (HTS). While the majority of drug candidates currently being developed have been found using HTS methods, high-throughput docking and pharmacophore-based searching algorithms are gaining acceptance and becoming a major source of lead molecules in drug discovery. Refinements and optimization of high-throughput docking methods have led to improvements in reproducing experimental data and in hit rates obtained, validating their use in hit identification. In parallel with virtual screening methods, concomitant developments in cheminformatics including identification, design and manipulation of drug-like small molecule libraries have been achieved. Herein, currently used in silico screening techniques and their utility on a comparative and target dependent basis are discussed.

## 2006

• Virtual ligand screening: strategies, perspectives and limitations
Klebe, Gerhard
Drug discovery today, 2006, 11(13-14), 580-594
doi: 10.1016/j.drudis.2006.05.012

... The expression 'virtual screening' (VS) was coined in the late 1990s; however, the techniques involved are ... In an effort to show that searching for lead candidates using a computer is a ... their binding to a macromolecular target using computer programs (in drug discovery, the term ...

• Scoring functions for protein-ligand docking.
Jain, Ajay N
Current Protein & Peptide Science, 2006, 7(5), 407-420
PMID: 17073693

Virtual screening by molecular docking has become established as a method for drug lead discovery and optimization. All docking algorithms make use of a scoring function in combination with a method of search. Two theoretical aspects of scoring function performance dominate operational performance. The first is the degree to which a scoring function has a global extremum within the ligand pose landscape at the proper location. The second is the degree to which the magnitude of the function at the extremum is accurate. Presuming adequate search strategies, a scoring function's location performance will dominate behavior with respect to docking accuracy: the degree to which a predicted pose of a ligand matches experimental observation. A scoring function's magnitude performance will dominate behavior with respect to screening utility: enrichment of true ligands over non-ligands. Magnitude estimation also controls pure scoring accuracy: the degree to which bona fide ligands of a particular protein may be correctly ranked. Approaches to the development of scoring functions have varied widely, with a number of functions yielding similarly high levels of performance relating to the location issue. However, even among functions performing equally well on location, widely varying performance is observed on the question of magnitude. In many cases, performance is good enough to yield high enrichments of true ligands versus non-ligands in screening across a wide variety of protein types. Generally, performance is not good enough to correctly rank among true ligands. Strategies for improvement are discussed.

• Molecular descriptors and methods for ligand based virtual high throughput screening in drug discovery.
Pozzan, Alfonso
Current pharmaceutical design, 2006, 12(17), 2099-2110
PMID: 16796558

The aim of virtual high throughput screening is the identification of biologically relevant molecules amongst either tangible or virtual (large) collections of compounds. Amongst the various virtual screening approaches, those that are ligand based are becoming very popular due to the possibility to screen millions of molecules in a timely way. Descriptors and methods are briefly introduced and reviewed with more emphasis on those approaches that are based on fingerprint descriptors and that seem to be more widely used during the drug discovery process.

• Similarity-based virtual screening using 2D fingerprints
Willett, Peter
Drug discovery today, 2006, 11(23-24), 1046-1053
PMID: 17129822     doi: 10.1016/j.drudis.2006.10.005

... screening system: the popular structure-based approaches, such as docking and de ... Examples of ligand-based approaches include: pharmacophore methods, which involve the identification ... containing known active and known inactive molecules; and the similarity methods that ...

• Virtual Screening: Are We There Yet?
Jalaie, M
Mini Reviews in Medicinal Chemistry, 2006, 6(10), 1159-1167

The cost of pharmaceutical development has increased dramatically in recent years, and many assorted approaches have been developed to decrease both the time and costs associated with bringing a drug to the market. Among these methods is the use of in silico screening of compound databases for potential new lead compounds, commonly referred to as virtual screening (VS). Virtual screening has become an integral part of the early discovery process in pharmaceutical development, readily observed by the large number of methodologies that have been published to date. Other reviews have been published detailing the various types of virtual screening methods in use. This work will review some of the virtual screening approaches and strategies that have been attempted to identify compounds to launch medicinal chemistry campaigns. Understanding trends and drivers in VS should help to set expectations about how and when VS could be used and what it can and cannot deliver and how it can be integrated in a successful screening campaign and used in a complementary fashion to HTS.

## 2004

• Docking and scoring in virtual screening for drug discovery: methods and applications.
Kitchen, Douglas B and Decornez, Hélène and Furr, John R and Bajorath, Jürgen
Nature reviews. Drug discovery, 2004, 3(11), 935-949
PMID: 15520816     doi: 10.1038/nrd1549

Computational approaches that 'dock' small molecules into the structures of macromolecular targets and 'score' their potential complementarity to binding sites are widely used in hit identification and lead optimization. Indeed, there are now a number of drugs whose development was heavily influenced by or based on structure-based design and screening strategies, such as HIV protease inhibitors. Nevertheless, there remain significant challenges in the application of these approaches, in particular in relation to current scoring schemes. Here, we review key concepts and specific features of small-molecule-protein docking methods, highlight selected applications and discuss recent advances that aim to address the acknowledged limitations of established approaches.

• Virtual screening of chemical libraries
Shoichet, Brian K
Nature, 2004, 432(7019), 862-865
PMID: 15602552     doi: 10.1038/nature03197

Virtual screening uses computer-based methods to discover new ligands on the basis of biological structures. Although widely heralded in the 1970s and 1980s, the technique has since struggled to meet its initial promise, and drug discovery remains dominated by ...

## 2003

• Hit and lead generation: beyond high-throughput screening.
Bleicher, Konrad H and Böhm, Hans-Joachim and Müller, Klaus and Alanine, Alexander I
Nature reviews. Drug discovery, 2003, 2(5), 369-378
PMID: 12750740     doi: 10.1038/nrd1086

The identification of small-molecule modulators of protein function, and the process of transforming these into high-content lead series, are key activities in modern drug discovery. The decisions taken during this process have far-reaching consequences for success later in lead optimization and even more crucially in clinical development. Recently, there has been an increased focus on these activities due to escalating downstream costs resulting from high clinical failure rates. In addition, the vast emerging opportunities from efforts in functional genomics and proteomics demands a departure from the linear process of identification, evaluation and refinement activities towards a more integrated parallel process. This calls for flexible, fast and cost-effective strategies to meet the demands of producing high-content lead series with improved prospects for clinical success.

## 2002

• Virtual screening and fast automated docking methods
Schneider, Gisbert and Böhm, Hans-Joachim
Drug discovery today, 2002, 7(1), 64-70
doi: 10.1016/S1359-6446(01)02091-8

... molecules which were identified, optimized or designed using virtual screening methods a. Molecular structure, Activity, Method, Refs. Ca2+ antagonist (T-channel blocker), Pharmacophore similarity searching, [51]. K+ channel (Kv1.5) blocker, Fragment based evolutionary de novo ...

• Structure-based virtual screening: an overview
Lyne, Paul D
Drug discovery today, 2002, 7(20), 1047-1055
doi: 10.1016/S1359-6446(02)02483-2

... Typically for docking, the physical-based scoring functions (e.g. Dock [29] and QXP [23]) employ force-fields in a minimalistic manner on a grid with no ... Empirical-based scoring functions based on physicochemical properties such as hydrogen-bond counts (e.g. ...

• Protein flexibility and drug design: how to hit a moving target
Carlson, HA
Current opinion in chemical biology, 2002, 6, 447-452

The most advanced methods for computer-aided drug design and database mining incorporate protein flexibility. Such techniques are not only needed to obtain proper results; they are also critical for dealing with the growing body of information from structural genomics.

## 1998

• Virtual screening-an overview
Walters, WP and Stahl, MT
Drug discovery today, 1998, 3(4), 160-178

Recent advances in combinatorial chemistry and high-throughput screening have made it possible for chemists to synthesize large numbers of compounds. However, this is still a small percentage of the total number that could be synthesized. Virtual screening encompasses a variety of computational techniques that allow chemists to reduce a huge virtual library to a more manageable size. This review presents the current state of the art in virtual screening and discusses approaches that will allow the evaluation of larger numbers of compounds.