An Open Drug Discovery Competition

ABSTRACT: The Open Source Malaria (OSM) consortium is developing compounds that kill the human malaria parasite, Plasmodium falciparum, by targeting PfATP4, an essential ion pump on the parasite surface. The structure of PfATP4 has not been determined. Here, we describe a public competition created to develop a predictive model for the identification of PfATP4 inhibitors, thereby reducing project costs associated with the synthesis of inactive compounds. Competition participants could see all entries as they were submitted. In the final round, featuring private sector entrants specializing in machine learning methods, the best-performing models were used to predict novel inhibitors, of which several were synthesized and evaluated against the parasite. Half possessed biological activity, with one featuring a motif that the human chemists familiar with this series would have dismissed as “ill-advised”. Since all data and participant interactions remain in the public domain, this research project “lives” and may be improved by others.


■ INTRODUCTION
Efficiency in the early stages of the drug discovery pipeline, from hit identification to lead optimization, is key to the development of new drugs. The initial identification of a hit compound is typically carried out using one of two approaches. In target-based drug discovery, the molecular target of interest is known. 1 With this knowledge, libraries containing many compounds are screened (experimentally or computationally) against the known target to identify promising candidates or chemical scaffolds for further development. Through testing these chemicals, the key binding interactions may be identified and more directed structure−activity relationship (SAR) studies can be conducted to optimize activity.
Alternatively, if the biological target is not known, phenotypic drug discovery may be undertaken. 2 This process involves the initial identification of potent compounds that give rise to the desired effect (e.g., inhibition of cell growth), with target determination performed thereafter. The lead-optimization phase in this type of drug discovery is less streamlined than that in the former method as it is conducted without guidance from target binding interactions and often relies upon the intuition of the medicinal chemist to design and synthesize compounds to explore the SAR. There are a number of obvious limitations to this approach, including the personal bias/imagination of the scientist or the availability/cost of resources. As a result, good hypotheses or key insights may be overlooked, which can lengthen the time taken to identify a lead candidate and increase costs associated with synthesizing complex molecules that are later revealed to be inactive. Nevertheless, the advantage of phenotypic drug discovery, which underpins its popularity, is that hit or lead compounds are already known to be effective in their overall role (e.g., the killing of a pathogen).
To aid this latter approach and overcome the absence of knowledge of the target or its structure, computational models may be developed using artificial intelligence (AI) and machine learning (ML). 3,4 Such approaches allow the activities of new compounds in a phenotypic-screening program to be predicted. For instance, matched molecular pair analysis 5 and quantitative structure−activity relationship (QSAR) 6 models are commonly used in medicinal chemistry campaigns to determine the relationships between the physical and biological properties of a series of compounds. This information can then be used to guide the design of new active compounds. In those cases in which a target has been identified, but its structure is not yet determined, a structural model may be developed based on a known close homolog of the target. 7 This method allows for docking studies to be conducted to examine potential binding interactions that may occur in the actual target, thus guiding the lead-optimization process more effectively. −11 For instance, there have been successes in the in silico target prediction of small molecules with activity against Mycobacterium tuberculosis. 12,13
In the case of the malaria parasite, the development of resistance to frontline treatments is an ever-present problem. Since the isolation of artemisinin from the plant Artemisia annua in 1971 by Tu Youyou and colleagues, 14 this natural product and its derivatives have been used in some of the most effective treatments for malaria. The artemisinin-based combination therapies (ACTs) utilize a short-acting artemisinin derivative in combination with one or more complementary antimalarials that are long acting and possess a different mechanism of action (MoA). The use of these combinations has, in part, been responsible for the slow development of resistance to ACTs; yet in recent years, increasing numbers of cases of reduced efficacy have emerged. 15 There is an urgent need for new medicines that possess novel MoAs. 16
One promising biological target in the human malaria parasite, Plasmodium falciparum, is the essential P-type ATPase PfATP4, which localizes to the plasma membrane of the intraerythrocytic parasite and exports Na+ while importing H+ equivalents. 17,18 The structure of this membrane-bound protein remains unsolved. Evidence for the involvement of PfATP4 in the mechanism of action of a wide range of antiplasmodial compounds identified in phenotypic screens comes from several sources, including analysis of mutations in resistant lines and a range of physiological and biochemical assays (measurement of parasite cytosolic Na+ concentration ([Na+]) and pH, as well as parasite volume and Na+-ATPase activity). PfATP4 has been implicated as the target for the spiroindolone cipargamin 17,19 (currently in Phase III clinical development), the dihydroisoquinolone (+)-SJ733, 20 and 28 compounds from the Medicines for Malaria Venture (MMV) Malaria Box 21 as well as 11 compounds from the MMV Pathogen Box. 22 These compounds represent a strikingly diverse range of chemotypes (Figure 1). 23 A homology model of PfATP4 was developed using crystal structures from the closest mammalian homolog, a sarco/endoplasmic reticulum Ca2+-ATPase (SERCA). 20 However, in the absence of a solved structure of PfATP4, ideally bound to small-molecule inhibitors, it remains unclear how such a diverse range of molecules might share the same target. Indeed, a challenge to understanding such data is that structurally different molecules generating the same phenotype may be interacting with the biological target differently.
Since 2011, contributors to Open Source Malaria (OSM) have been evaluating several series of compounds originating from high-throughput screens (HTS) performed by pharmaceutical companies. 25 The recent focus of OSM has been on a class of triazolopyrazine-based compounds ("Series 4") that emerged from a screen carried out at Pfizer. There are currently more than 200 compounds in Series 4, with in vitro potencies against P. falciparum ranging from single-digit nanomolar to inactive. The highly promising nature of this series derives from several members having been found to be effective in the in vivo mouse model of the disease. 26 Based on preliminary investigations against PfATP4-resistant mutant strains (generated from the parent Dd2 strain by exposure to hits from the Malaria Box against PfATP4 21 ), Series 4 compounds are thought to target PfATP4. 27 The intraseries similarity of their structures ought to imply a similarity in the way that the compounds interact with the target, but the interaction may differ from other compounds with the same phenotype.

Figure 1. Examples of diverse chemotypes (colored) that have been linked to PfATP4. Each of the compounds gives rise to effects on the parasite's internal Na+ concentration and pH that are consistent with PfATP4 inhibition. 20,21,24
The OSM Series 4 project is at the lead-optimization stage, with minor structural modifications being made in the search for improved solubility, potency, and metabolic clearance. As is typical in such a search, analogues are being made that possess low potency, and these represent expensive "failures" (ca. $2K per compound for one postdoc-week per analogue). Better predictions of compound potency would save valuable resources and accelerate the science, so a predictive model was high on the list of priorities for the OSM consortium.
We maintained an open mind as to the best means of developing such a model. Available to us was a data set of analogues with their associated activities, whether against the parasite or derived from biochemical ([Na+], pH, and/or ATPase) assays. Many of these compounds were from OSM Series 4, and there were also candidate antimalarials from other, structurally unrelated, series. It was possible to include "presumed inactives": randomly selected molecules from commercial catalogues that were unlikely to display activity. There is a rich history of QSAR-based approaches that might be called upon. A homology model (vide supra) was available that might permit a more target-based approach. Acknowledging these varied resources, we opted not to prescribe the approach to be taken and instead, in 2014, approached the scientific community simply with the need for a model that would allow us to predict the activity of hypothetical compounds. All data from OSM research projects are freely available to anyone online, representing an ideal starting point for such an open competition.
Between then and now, there has been an explosion of interest in machine learning and AI methods in drug discovery. 28,29 While these new methods have the potential to be game changing, there is the ever-present challenge in this sector of hype, in the sense that the actual capabilities of some of the newer technologies, outside of marketing statements, are sometimes not clear. In OSM, the openness extends to the research process itself, allowing contributors to share what they are doing, rather than what they have done. The use of competitions to progress scientific research is not novel in itself, with previous examples of this in data analysis for drug discovery, 30 but it is uncommon for competitions to be accompanied by the next crucial step: benchmarking by chemical synthesis and biological evaluation of predicted molecules. It is rarer still for science competitions to run completely openly, where everyone can see, and potentially incorporate, other entrants' solutions as they are submitted. We felt we could achieve two things by running this competition with OSM's open source ethos, in which those submitting entries would reveal their predictions in real time and, ideally, provide full methods (within the boundaries of commercial sensitivities). We would be able to approach the scientific problem along multiple paths, and we would also be able to provide a clear case study of the current effectiveness of predictive modeling in phenotypic drug discovery.

Journal of Medicinal Chemistry
■ RESULTS AND DISCUSSION
Round 0. An initial attempt by a single OSM contributor to develop a pharmacophore model was based around the known PfATP4 active compounds from the MMV Malaria Box. 31,32 Using Discovery Studio from Accelrys (now BIOVIA) to process 28 active compounds with the Common Feature Pharmacophore Generation protocol, 10 four-feature models were produced. These were then narrowed down based on poses and score to one model that was developed further (Figure 2A).
The 28 active compounds were mapped to the model and a shape feature was created (Figure 2B). It was thought that this could give a general idea of the shape of the compound binding site (Figure 2C). Exclusion features were next added in areas where high scoring, inactive ligands penetrated outside of the shape figure. Unfortunately, when this model was applied in 2014 to a set of compounds that were evaluated for their ability to dysregulate ion homeostasis, the predictions were found to correlate poorly with the experimental potency results (Figure 3). The test set was selected to be structurally diverse, including features known to be associated with inactivity (e.g., transposition of the northwest pendant) but also features where minor variations were known to be important for activity (aromatic substituents in the pendant amide). It was suggested that the lack of correlation could be due to factors not being taken into account by this first model (overlapping binding sites and compound chirality); a pharmacophore model explains aspects of the geometry of the interaction but not the details of the thermodynamics of the protein−small molecule contacts.
This model was also used to screen 32 the Maybridge library of compounds 33 to identify a small and diverse selection of molecules to evaluate in biochemical assays. The results were filtered manually to give a final selection of 18 compounds that were subsequently evaluated for their effects on the parasite's cytosolic Na+ concentration (at 1 μM) and pH (at 5 μM). None of the compounds were found to increase the parasite's cytosolic [Na+] or pH, which confirmed that the model required further optimization and led to the start of a crowdsourced attempt to solve this challenge.
Figure 3. Poor correlation was seen between the first model's predictions and experimental data. While there is an excellent correlation between in vitro parasite killing potency and the ability to dysregulate parasite ion homeostasis, the majority of the model predictions did not correlate well with the experimental data. The compounds were tested for their effects on cytosolic [Na+] and pH in isolated parasites (Dd2 strain), at 1 and 5 μM, respectively. "Yes" indicates that the compound gave rise to an increase in cytosolic [Na+] and a cytosolic alkalinization similar to that seen on addition of a 50 nM concentration of the PfATP4 inhibitor cipargamin. "No" indicates that the compound did not affect the resting cytosolic [Na+] or pH. "Moderate" indicates that the compound gave rise to an increase in cytosolic [Na+] and pH that was less than that observed on the addition of 50 nM cipargamin.

Round 1. The first full round of the predictive modeling competition was run between 2016 and 2017 and was intended to elicit the participation of members of the wider scientific community with expertise in computational chemistry. 34 The competition adhered to the open science principles underpinning the OSM consortium. Specifically, all participants were required to work openly for the duration of the competition, with working and data posted on open Electronic Laboratory Notebooks (ELN) that were made publicly available. 35 The participants were tasked with developing a predictive model using data provided by OSM that included a list of compounds with activity data for both in vitro whole cell potency and PfATP4 ion assays, 36 along with the entire data set of OSM compounds from previous series (inactives, most of them presumed). Once the models were developed and deposited, the participants were provided with the molecular identifiers (e.g., SMILES strings) for the 400 compounds contained within the MMV Pathogen Box and were required to rank them in order of predicted activity in the ion assays. The compounds were at the same time screened for their effects on parasite cytosolic [Na+] and pH, and the data were held back until the models had been submitted. A small cash prize inducement was employed to stimulate interest, despite the risk this brings of making the intrinsic reward for participation more extrinsic. 37
Six diverse, fully fledged entries were submitted from individuals working in both public and private sectors, with all working shared online (Table 1). 38 The submissions were reviewed by a panel of four judges (Prof. Matthew Todd, A/Prof. Alice Motion (University of Sydney), Dr. Murray Robertson (University of Strathclyde, creator of the previous model in Round 0), and Prof. Alexander Tropsha (University of North Carolina, Chapel Hill)) who evaluated the top 20 ranked compounds from each model against the undisclosed Pathogen Box data. Two entrants developed models that were able to predict correctly two active compounds within their top 20 rankings, with a further model a close third place. 39
While this first round of the competition was successful in demonstrating the capabilities of the community to work openly and provide quality data, the models, though obtained with diverse methods, were not yet highly predictive. A possible reason for this was the dissimilarity of the structures in the OSM Series 4 data set and the contents of the MMV Pathogen Box. Of note was, again, the striking diversity of chemotypes (A−K, Table 1) sharing a target. Interestingly, opinions of the performance of the models in this round differed between laboratory chemists (who regarded the 2/20 hit rate as not being practically helpful) and cheminformatics-based entrants and judges (who regarded the 2/20 hit rate, from a structurally diverse set of 400 compounds that was not strongly correlated with the training set, as a respectable outcome).
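The Round 1 judging step, counting how many of a model's top 20 ranked compounds turned out to be experimentally active, can be sketched as below. This is a minimal illustration under stated assumptions and not the judges' actual procedure: the function name `top_k_hits` and the compound identifiers are hypothetical placeholders, not Pathogen Box data.

```python
from typing import Iterable, Sequence, Set


def top_k_hits(ranking: Sequence[str], actives: Iterable[str], k: int = 20) -> int:
    """Count experimentally active compounds among the first k entries of a ranking."""
    active_set: Set[str] = set(actives)
    return sum(1 for compound in ranking[:k] if compound in active_set)


# Hypothetical ranking of 400 identifiers, best predicted activity first.
ranking = [f"CPD-{i:03d}" for i in range(1, 401)]

# Hypothetical set of 11 actives: two ranked early, nine ranked late.
actives = {"CPD-005", "CPD-017"} | {f"CPD-{i:03d}" for i in range(300, 309)}

hits = top_k_hits(ranking, actives, k=20)
print(f"{hits}/20 of the top-ranked compounds are active")
```

A count of this kind is sensitive to how many actives exist in the screened set at all, which is one way to read the differing views of the 2/20 result described above.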
Round 2. Given the diverse, spontaneous inputs from the initial round of the open competition, and the high quality of the associated dialogue that had taken place on the relevant project website, GitHub, it was decided that a second round would be run in 2019 since "expensive failure analogues" were still arising in the experimental program. The aim for this round was not only to allow the entrants from Round 1 to improve upon the original models, but also for new participants to get involved, with inputs from larger companies that specialized in artificial intelligence and machine learning (AI/ML) approaches. Since the series had moved on in the interim (with further compounds being evaluated), the community had access to an expanded data set, including all of the data used as the test set for the previous round. 22
The competition's second round was launched in July 2019. 41 In this new phase of the competition, the intention was to use the best-performing models to perform the most important task of all: to predict new chemical matter that would be active (rather than merely look at the fit of retrospective data). Synthesis and evaluation of these predictions would then serve as model validation in a "real" case. A small, new data set of activity from recently synthesized analogues was kept back to serve as the basis for judging model fitness.
By the conclusion of Round 2 (a period of ∼10 weeks), 10 entries had been submitted, five of which were from returning participants (Table 2). In a similar fashion, submissions were reviewed by a panel of four judges (Prof. Matthew Todd, Dr. Edwin Tse (UCL), Dr. Murray Robertson (Strathclyde), and Prof. Robert Glen (Cambridge)) who compared the predicted potencies against the experimentally derived blood-stage potency values for 34 compounds. The precision of each model was calculated according to: precision = x/(x + y), where x is the number of correct predictions (active and inactive combined) and y is the number of false-positive predictions. 42
It was originally intended for each of the four winning entrants (first and second place winners) to generate two new structures that were predicted to be active using their models: one possessing the Series 4 triazolopyrazine core and the other being structurally distinct. This would give a total of eight molecules to be synthesized and validated experimentally. In addition to optimizing potency, model generators were tasked with keeping good solubility in mind as a design criterion. It became evident that certain suggested compounds were synthetically inaccessible or would take major resources to pursue, and these were triaged with some minor human inputs from the computational and synthetic teams; these inputs varied from team to team and typically involved selecting between the highest-scoring compounds. Synthetic tractability is often an issue when predictive models do not take into account known synthetic pathways, though there is significant activity at present to improve the incorporation of synthetic planning into library suggestion (Figure 4). 49,50
The initial list was narrowed to focus on six predicted triazolopyrazine compounds (Figure 5). The six compounds were successfully synthesized and subsequently evaluated for in vitro (growth inhibition) activity against P. falciparum along with the previously reported positive control for the series. 51 In addition to the standard potency (in vitro growth) assay, these compounds were evaluated for their ability to inhibit PfATP4 in biochemical (cytosolic [Na+]) assays to confirm that the MoA had not changed following these structural changes.
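The precision measure defined in the text for Round 2, precision = x/(x + y), is easy to reproduce. The sketch below is an illustration only: the function name and the example counts are hypothetical and are not taken from Table 2. Note that, as defined in the text, x counts correct predictions of both actives and inactives, which differs from the more common definition of precision based on true positives alone.

```python
def precision(correct: int, false_positives: int) -> float:
    """Precision as defined in the text: x / (x + y).

    correct         -- x, correct predictions (active and inactive combined)
    false_positives -- y, compounds predicted active that were inactive
    """
    total = correct + false_positives
    if total == 0:
        raise ValueError("precision is undefined when x + y = 0")
    return correct / total


# Hypothetical counts, chosen only to illustrate the arithmetic.
x, y = 27, 7
print(f"precision = {precision(x, y):.2f}")
```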
Three of the six compounds were found to be active (<1 μM) or moderately active (1−2.5 μM) in in vitro growth assays with asexual blood-stage P. falciparum (3D7) parasites, representing a hit rate of 50% on a small sample size. Up to this point, a total of 398 compounds had been made and evaluated for in vitro activity in OSM Series 4, with the design of these compounds driven entirely by the intuition of medicinal chemists. By setting a potency cutoff of 2.5 μM (the upper limit of reasonable activity), the tally of active compounds discovered in this series stands at 165, representing a comparable human intuition-derived hit rate of 41% on a larger sample size. Most of the compounds were tested (blind) for their ability to disrupt cytosolic [Na+] in isolated asexual blood-stage parasites, which confirmed an unchanged mechanism of action: two of the compounds found to be active in in vitro growth assays disrupted Na+ regulation, whereas the three compounds inactive in growth assays did not, at the concentrations tested (Figure S9).
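The two hit rates quoted above (3 of 6 and 165 of 398) follow directly from the counts and the 2.5 μM cutoff. The sketch below reproduces that arithmetic and, as an added illustration that is not part of the original analysis, attaches an approximate 95% Wilson score interval to each rate to show how much wider the uncertainty is for the six-compound sample. The function names and class labels are hypothetical.

```python
import math
from typing import Tuple


def classify(potency_um: float) -> str:
    """Potency classes used in the text: <1 uM active, 1-2.5 uM moderately active."""
    if potency_um < 1.0:
        return "active"
    if potency_um <= 2.5:
        return "moderately active"
    return "inactive"


def hit_rate(hits: int, total: int) -> float:
    """Fraction of compounds at or below the potency cutoff."""
    return hits / total


def wilson_interval(hits: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Approximate 95% Wilson score interval for a proportion."""
    p = hits / total
    denom = 1.0 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


for label, hits, total in [("model-guided", 3, 6), ("chemist-designed", 165, 398)]:
    low, high = wilson_interval(hits, total)
    print(f"{label}: {hit_rate(hits, total):.0%} (approx. 95% CI {low:.0%} to {high:.0%})")
```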
It is interesting to compare these results with the intuition of the chemists who have deep experience of this series and who are familiar with the SAR. A recurring observation was the sensitivity of potency to the length of the ether linker between the triazolopyrazine core and the northwest phenyl group, with a spacer of two methylene units (between phenyl ring and oxygen) leading to far higher potencies than other lengths. The Davy Guan prediction involving the shorter linker, and the Molomics 1 prediction without the pendant phenyl ring, lie in the class of inactive compounds subject to human retrospective wisdom (i.e., the "Could Have Told You That" class). In contrast, Exscientia compounds were thought by the human team to be likely to be potent, but only one performed well (i.e., the "That's Odd" class). Finally, the Optibrium/Intellegens suggestion that included the tert-butyl pendant was thought by the human team to be a certain inactive, given what was known of variation in that part of the molecule (where related substituents such as -OMe have been observed to perform poorly, and much time had been spent in the production of inactive variants); yet, this compound displayed good potency and is a particularly useful outcome (i.e., the "Machine Overlords" class).
To gain more insight, and to improve these potential antimalarials, further iterations of these models are needed. The open nature of competitions and of the overarching consortium means that anyone may work on improvements since everyone has access to all of the data, making this a "living" research project. A potential explanation for the predicted hit rate not being higher is the relatively small data set (∼400 compounds) from which each model was developed, potentially compromising perfectly reasonable computational approaches yet representing a fairly typical situation for lead optimization. Two further points are of particular note: (1) It was possible to involve leading experts from the private sector in an open competition to solve a public health challenge without those participants needing to compromise their competitive business advantage; indeed, success in this endeavor has already been used as an unvarnished demonstration of capabilities. 52 (2) The private sector participants displayed high and sustained levels of collaborative working and commitment to a public good, counter to the public's perception of the secretive nature of the modern pharmaceutical industry; indeed, the "winning" and "losing" of the competition were less important than the extent to which entrants worked together openly to improve the underlying research. 41

■ CONCLUSIONS
With hit identification and lead optimization being key steps in the development of any new drug, the continued advancements in machine learning and artificial intelligence approaches possess significant promise to streamline this process, which would result in more efficient medicinal chemistry campaigns. In the absence of target structural information, a crowdsourced approach was used to develop predictive models for a promising antimalarial series. Importantly, the winning models of the most recent competition round were used to generate novel compounds, which were then synthesized and evaluated to validate each model experimentally, leading to a new, counterintuitive "active". The simple open science and crowdsourcing principles used throughout this campaign are applicable to many medicinal chemistry projects, whereby a community's combined efforts can be used to accelerate the early stages of drug discovery and involve participants from public and private sectors. The work conducted here has been designed to be "living", in that all methods and results are publicly available and contributions can continue to be made by anyone because everyone has access to all data and ideas.

■ EXPERIMENTAL SECTION
General Information. Reagents were purchased from either Sigma−Aldrich, Alfa Aesar, Acros, Merck, Fisher Scientific, Matrix Scientific, Ajax or Fluorochem. Unless otherwise specified, the reagents were used without further purification. Anhydrous solvents were obtained by drying over activated 3 Å molecular sieves. Argon gas was used as acquired. Reduced pressure means under rotary evaporation at 40 °C from 900 to 50 mbar. Flash chromatography was performed on a Biotage Selekt. Analytical thin-layer chromatography was performed on Merck Silica Gel 60 F254 precoated aluminum plates (0.2 mm) and visualized with UV irradiation (254 nm) and potassium permanganate. High-temperature reactions were carried out in silicone oil baths, controlled by a temperature probe in the oil bath.
The purity of all evaluated compounds was >95% as determined by NMR spectroscopy (provided for all compounds evaluated biologically). 4) were previously synthesized according to literature procedures. 51
General Procedure 1: Condensation of Hydrazinylpyrazine with Aldehyde. Compound 1 (1 equiv) was dissolved in EtOH (112 mM). Aldehyde (1 equiv) was added and the reaction stirred at rt overnight. The suspension was filtered and washed with cold EtOH to give the corresponding hydrazone that was used without further purification.
General Procedure 2: Cyclization of Hydrazone to Triazolopyrazine Core. The product from General Procedure 1 (1 equiv) was dissolved in CH2Cl2 (112 mM). PhI(OAc)2 (1 equiv) was added and the reaction stirred at rt overnight. The reaction was quenched with sat. NaHCO3 solution, diluted with CH2Cl2, and the organic layer was separated. The aqueous layer was extracted with CH2Cl2 (2×) and the combined organic layers were washed with sat. NaHCO3 solution and brine, dried (MgSO4), filtered, and concentrated under reduced pressure to give the crude product, which was purified by automated flash chromatography on silica to give the corresponding triazolopyrazine core.
General Procedure 3: Reduction of Esters to Alcohols. Ester (1 equiv) was dissolved in anhydrous tetrahydrofuran (THF) (566 mM) and cooled to 0 °C. LiAlH4 (1 M in THF, 2 equiv) was added dropwise, and the reaction mixture stirred for 10 min at 0 °C, then at rt. Upon completion, the reaction was diluted with THF and cooled to 0 °C. H2O (1 mL/1 g of LiAlH4) was added followed by 15% aq. NaOH (1 mL/1 g of LiAlH4) and H2O (3 mL/1 g of LiAlH4). The mixture was allowed to warm to rt and stirred for 15 min. MgSO4 was added, and the reaction mixture was filtered through a pad of Celite and concentrated under reduced pressure to give the crude product, which was purified by automated flash chromatography on silica to give the corresponding alcohol.
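The quantities in General Procedure 3 scale linearly with the amount of ester, so they can be tabulated for any reaction scale. The helper below is a hypothetical convenience sketch, not part of the reported procedure: it assumes the stated concentration (566 mM), stoichiometry (2 equiv of a 1 M LiAlH4 solution), and quench ratios (1 : 1 : 3 mL per gram of LiAlH4), and uses a molar mass of 37.95 g/mol for LiAlH4.

```python
LIALH4_MOLAR_MASS = 37.95  # g/mol


def procedure3_quantities(ester_mmol: float) -> dict:
    """Volumes (mL) for General Procedure 3 at a given ester scale in mmol."""
    lialh4_mmol = 2.0 * ester_mmol                       # 2 equiv of hydride reagent
    lialh4_g = lialh4_mmol * LIALH4_MOLAR_MASS / 1000.0  # mass of LiAlH4 delivered
    return {
        "THF_mL": ester_mmol / 566.0 * 1000.0,           # ester dissolved at 566 mM
        "LiAlH4_1M_solution_mL": lialh4_mmol,            # 1 M means 1 mmol per mL
        "quench_H2O_first_mL": 1.0 * lialh4_g,           # 1 mL per g of LiAlH4
        "quench_NaOH_15pct_mL": 1.0 * lialh4_g,          # 1 mL per g of LiAlH4
        "quench_H2O_second_mL": 3.0 * lialh4_g,          # 3 mL per g of LiAlH4
    }


# Example: a 1.0 mmol reduction.
for name, volume in procedure3_quantities(1.0).items():
    print(f"{name}: {volume:.3f}")
```

At small scale the quench volumes are well below 1 mL, which is worth knowing before starting because they then need to be measured by syringe or micropipette.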
General Procedure 4: Nucleophilic Displacement of Triazolopyrazine Core Chlorine with Alcohol. Alcohol (1.0 equiv) was added to PhMe (168 mM) along with triazolopyrazine core (1.0 equiv), KOH (3.0 equiv), and 18-crown-6 (0.1 equiv). The reaction was stirred at rt until completion as indicated by thin-layer chromatography (TLC) (100% EtOAc). The reaction was diluted with H2O and then extracted with EtOAc (3×). The combined organic layers were washed with H2O until the aqueous layer became neutral, followed by brine, dried (MgSO4), filtered, and concentrated under reduced pressure to give the crude product, which was purified by automated flash chromatography on silica to give the corresponding ether-linked product.

Figure 2. Model creation workflow. (A) Four-feature pharmacophore model chosen for further development with MMV006429 mapped. (B) All 28 active compounds used in Round 0 superimposed onto the four-feature model. (C) Shape feature added based on poses in (B). (D) Inactive molecules from the data set mapped. (E) Exclusion spheres added.

Figure 4. Examples of the suggested compounds predicted by the winning entrants from Round 2.

Figure 5. Six chosen suggested compounds for experimental validation. The predictions were synthesized (see the SI) and their potencies and MoAs (Figure S9) were experimentally validated. Three compounds were found to be active. *PfATP4 activity was not obtained for this compound.

All IC50 curve fitting was undertaken using XLFit version 4.2 using Model 205 with the following four-parameter equation: $y = A + \dfrac{B - A}{1 + (C/x)^{D}}$, where A = % inhibition at bottom, B = % inhibition at top, C = IC50, D = slope, x = inhibitor concentration, and y = % inhibition. If the curve did not reach 100% of inhibition, B was fixed to 100 only when at least 50% of inhibition was reached.
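The parameter definitions given for the IC50 fit (A = bottom, B = top, C = IC50, D = slope) correspond to a four-parameter logistic curve. The sketch below evaluates such a curve and inverts it, assuming the common form y = A + (B − A)/(1 + (C/x)^D); that form is an assumption here, and the code does not use or reproduce XLFit itself. The function names and parameter values are hypothetical.

```python
import math


def four_pl(x: float, a: float, b: float, c: float, d: float) -> float:
    """% inhibition at concentration x for bottom a, top b, IC50 c, slope d."""
    return a + (b - a) / (1.0 + (c / x) ** d)


def concentration_for(y: float, a: float, b: float, c: float, d: float) -> float:
    """Invert the curve: concentration giving y % inhibition (requires a < y < b)."""
    if not a < y < b:
        raise ValueError("y must lie strictly between bottom and top")
    return c / ((b - a) / (y - a) - 1.0) ** (1.0 / d)


# Hypothetical parameters: bottom 0%, top 100%, IC50 0.5 uM, slope 1.2.
params = dict(a=0.0, b=100.0, c=0.5, d=1.2)

# At x = IC50 the curve sits halfway between bottom and top.
assert math.isclose(four_pl(0.5, **params), 50.0)

for conc in (0.05, 0.5, 5.0):
    print(f"{conc} uM -> {four_pl(conc, **params):.1f}% inhibition")
```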

Table 1. Summary of the Results from Round 1 of the Predictive Modeling Competition a
(Table body not recovered; surviving cell text: "... library of "common" transformations as seen in CHEMBL"; "B runner-up".)
a Compounds A−K shown to be active from the MMV Pathogen Box screen against PfATP4. 22

Table 2. Summary of the Results from Round 2 of the Predictive Modeling Competition a
a See the Supporting Information (SI) for full experimental details. b Based on regression prediction. c Based on classification prediction.