Drug Baron

Monte Carlo models of drug R&D focus attention on cutting costs – Part 1


There is a theme behind many of DrugBaron’s musings over the last three years: pharma R&D is just too expensive to make economic sense.  Given high failure rates throughout the process, including in particular a significant rate of late stage failures when the capital at risk is very high, either attrition must fall or costs must come down.

Almost everyone in the industry recognizes this equation.  But for most, particularly those who are guardians of large (and expensive) R&D infrastructure, it has been more palatable to talk of improving success rates than decreasing costs.

What cost cutting there has been has been quantitatively and qualitatively wrong.  Pruning a few percentage points off R&D budgets that have tripled in just a little over a decade has no discernible impact on the overall economics of drug discovery and development.  And cutting costs by reducing the number of projects, rather than reducing the cost per project, is not only ineffective but counter-productive, as DrugBaron has already noted on more than one occasion.

But there is a fundamental tension in the equation: success rates are assumed to be heavily tied to expenditure.  If you spend less per project, attrition rates will go up (assuming at least a proportion of the money is being wisely spent) and you will not improve the overall economics.  You might even make it worse.

So what makes DrugBaron so confident that dramatically cutting the cost per project makes sense? That even if decision quality declines slightly, it will be offset by a greater gain in productivity?

The “evidence” comes from sophisticated computer simulations of early stage drug development that underpin the ‘asset-centric’ investment model at Index Ventures.  Models that have remained unpublished – until now.

Drug development is a stochastic process.  That much is indisputable, given the level of failure.  Processes that we understand and control fail rarely, if ever.  But such is the complexity of biology that even the parts we think we understand relatively well still conceal secrets that can de-rail a drug development program at the last hurdle.

The fundamental premise of drug discovery and development is therefore one of sequential de-risking.  Each activity costs money and removes risk, so that the final step (usually substantial pivotal clinical trials that test whether a drug safely treats a particular disease) is positive as often as possible.

Exactly how often this last step IS positive is open to some debate.   A figure often cited for the phase 3 success rate is 50%.  But this headline figure masks considerable heterogeneity.  For example, once a drug has been shown to be effective, it is usually entered into huge programs of Phase 3 trials to provide information to support its competitive position and to expand its label, both around the initial indication and into other related indications.  Such trials are fundamentally more likely to be positive than a first phase 3 trial of any agent.  Since agents that score failures in their first couple of phase 3 trials are likely to be scratched, there is a substantial bias in favour of positive trials in the dataset as a whole.

Equally importantly, a considerable fraction of all active drug development programs can be characterized as “me too” or “me better” – in other words, modulating a target that has already been validated in earlier successful phase 3 trials (albeit with a different agent).  This eliminates most of the risk arising from the sheer complexity of biology, which remains the hardest risk to discharge in drug development.  Once again, therefore, such trials are fundamentally more likely to be positive than a first phase 3 trial of a truly “first in class” agent.

DrugBaron’s own count-back over the last five years suggests the success rate for these first phase 3 studies with agents targeting a previously unproven mechanism of action is somewhat less than 25% (unfortunately, such analysis still requires a degree of subjectivity in assigning trials into each category).

Irrespective of the precise number, the point is clear: despite best efforts to de-risk late stage trials, the majority of the risk is still there until the very end.  The drug discovery and development process, therefore, more closely resembles weather forecasting than engineering.  The contribution of stochastic processes (things which are either random or simply too complex to be properly understood at the present time) is significant – and ignored at your peril.

The take-home message is black and white: reducing cost per project is the most effective way to increase drug R&D productivity – even if it slightly damages the quality of decision-making

The time-honoured modeling algorithm for such stochastic processes is the Monte Carlo simulation.  This is a method that relies on repeated random sampling to obtain numerical estimates (in other words, running the simulation many times over and calculating the probabilities empirically, just as if you were playing and recording your results in a real casino: hence the name).
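
As a toy illustration of the method (nothing to do with the drug model yet), here is how a Monte Carlo estimate of a simple dice probability might look in Python; the exact answer is easy to compute, which makes it a useful sanity check:

```python
import random

# Estimate the probability that two fair dice total more than 9,
# by sampling many random throws rather than enumerating the outcomes.
trials = 100_000
hits = sum(random.randint(1, 6) + random.randint(1, 6) > 9 for _ in range(trials))
print(f"Estimated P(total > 9): {hits / trials:.3f} (exact: {6 / 36:.3f})")
```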

Our model examines a set of ‘projects’ with defined characteristics (explained below) taken from concept through to phase 2a clinical proof-of-concept.   The output assigns a cost to each action taken in each project, and attributes a value to each positive clinical proof of concept study.  While our intention was to model an early stage venture fund (with a portfolio of such projects), the lessons apply equally to any institution running multiple drug discovery and early development projects, such as pharma companies.  There is nothing in the model that intrinsically relates to a venture fund.

[DrugBaron MCM Fig 1]

The key to the model is the concept of the ‘action-decision chain’ – the principle that even something as complex as drug discovery and development can be reduced to a sequence of actions (experiments that generate data) followed by a decision as to whether to kill or continue the project to the next stage.  It doesn’t much matter what sort of data is envisaged (ADME, toxicology, efficacy, CMC, clinical and so forth), the process is entirely generic – you collect the data, examine it and make your decision.

Each action, in the model, has only one parameter: cost.  At each stage, you can spend a little or a lot (which presumably affects the amount and quality of the data that you obtain).  The model doesn’t attempt to characterize what the individual steps are, but merely assumes that each successive step is more costly than the last one (which, if you are doing your drug development right it should be – you should discharge the cheapest risks first).

The way each decision point is modeled is critical to the output.  The model has an internal flag (set at the beginning of each run) as to whether that agent ‘works’ or not.  This is the real-world situation: the day you decide to develop a particular molecule for a particular indication, the die has been rolled: either it works in that indication or it doesn’t – it’s just that you won’t know for many years, and many millions of dollars!

At each decision point, then, a filter is applied that has a false positive rate (that is, the data looks fine, so the management continue even though actually the project is doomed to eventually fail) and a false negative rate (where the management kill a project that would, had they continued, actually been successful).  If you set all the decision filters to have perfect quality (no false negatives and no false positives) then only successful projects are progressed and productivity is maximized.  But to model the real world, you can introduce imperfect decision-making at different levels (and with different emphasis on false positives versus false negatives – a parameter we call ‘stringency’ of decision making: complete stringency would mean that all projects are killed; 100% false negative rate but 0% false positive rate).
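
To make the filter concrete, here is a minimal sketch of how such a decision point might be coded.  The function name and structure are our own illustration, reconstructed from the description above, not taken from the actual model:

```python
import random

def decide(works: bool, false_pos: float, false_neg: float) -> bool:
    """One kill/continue decision, judged against the hidden 'works' flag.

    Returns True to continue the project, False to kill it.
    false_pos: chance a doomed project slips through the filter anyway.
    false_neg: chance a genuinely good project is wrongly killed.
    Raising 'stringency' trades a lower false_pos for a higher false_neg;
    complete stringency (false_neg=1.0, false_pos=0.0) kills everything.
    """
    if works:
        return random.random() > false_neg  # survives unless wrongly killed
    return random.random() < false_pos      # doomed; survives only by error

# The base-model filter: 33% false positives, 10% false negatives
random.seed(1)
print(sum(decide(True, 0.33, 0.10) for _ in range(1000)))   # ~900 continue
print(sum(decide(False, 0.33, 0.10) for _ in range(1000)))  # ~330 continue
```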

The computer then runs all the projects, and calculates the total amount spent on all the actions across all the projects (and, of course, as soon as a project is ‘killed’ it ceases to progress to the next, more expensive, action).  At the end, when every live program has read out its phase 2a proof of clinical concept, the model sums up the value that has been created.  The output is a return on capital invested.

Of course, because of the random element modeled in the decision-making (the non-zero false negative and false-positive rates), every time the simulation is run on a collection of projects, with all the parameters set the same, there will be a different global outcome (the return on capital will vary, depending by chance how often the decision filters ‘got it right’ in that run).

As a result, to get an estimate of how “good” that set of parameters really is, the simulation is run a hundred times on a hundred different sets of projects.  This allows the mean return on capital, when operating under those conditions of cost and decision quality, to be estimated (as well, interestingly, as the standard deviation of the returns between runs).

And so to the results!

The base conditions for the model assumed 10% of the projects were inherently ‘working’ when setting the hidden flag.  Of these, 50% of the failures were eliminated as obvious (simulating the process by which a venture investor or pharma committee screens candidate projects and decides whether to initiate them or not).  Thereafter, the remainder began a series of action-decision steps, with €1m spent prior to initiating formal development, €5m spent on formal preclinical and phase 1, and €10m spent on Phase 2 proof of concept.  The actual sums spent don’t really matter, because the output depends on the arbitrarily assigned value of a successful proof-of-concept (it’s only the ratio of the value created to the amount spent which is interesting).  If you are a pharma company, rather than a cash-conscious asset-centric investment fund, you may want to add a zero to each of the above.

In the base model, the filter was set with a 33% false-positive rate (so that ‘only’ two thirds of the real failures are stopped at each stage), and a 10% false-negative rate (wrongly stopping 10% of the real successes).

At the end, each of the real successes (based on the hidden flag) that are still alive is attributed a 50% chance of being sold for €50m (simulating an exit for the venture fund – if you were modeling the same processes in a pharma company, for example, you may select a different estimate of output value, or more likely, choose to carry the simulation beyond Phase 2 proof of clinical concept).
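
Putting the pieces together, the whole base model can be reconstructed in a few dozen lines.  To be clear, the sketch below is our own rendering of the description in this post, not the actual code behind the figures; the pool size of 80 projects is taken from the discussion in the comments, and the names and structure are assumptions made for illustration:

```python
import random
import statistics

STAGE_COSTS = [1, 5, 10]  # EUR millions per action: seed, preclinical/Ph1, Ph2a
P_WORKS = 0.10            # hidden flag: fraction of projects that truly work
PREFILTER = 0.50          # fraction of doomed projects screened out for free
FALSE_POS, FALSE_NEG = 0.33, 0.10
EXIT_PROB, EXIT_VALUE = 0.50, 50  # 50% chance of a EUR 50m exit per true success
N_PROJECTS = 80

def simulate_fund(rng, costs=STAGE_COSTS, fp=FALSE_POS, fn=FALSE_NEG):
    """One fund: run every project down the action-decision chain; return ROI."""
    spent, value = 0.0, 0.0
    for _ in range(N_PROJECTS):
        works = rng.random() < P_WORKS        # the die is cast at the outset
        if not works and rng.random() < PREFILTER:
            continue                          # obvious failure, never initiated
        for i, cost in enumerate(costs):
            spent += cost                     # pay for the action...
            if i == len(costs) - 1:           # ...Phase 2a reads out the truth
                if works and rng.random() < EXIT_PROB:
                    value += EXIT_VALUE
            else:                             # ...imperfect kill/continue decision
                alive = rng.random() > fn if works else rng.random() < fp
                if not alive:
                    break
    return value / spent                      # return on capital invested

rng = random.Random(42)
rois = [simulate_fund(rng) for _ in range(100)]  # 100 funds, same parameters
print(f"median ROI: {statistics.median(rois):.2f}, "
      f"funds losing money: {sum(r < 1 for r in rois)}/100")
```

The absolute numbers such a sketch prints will differ somewhat from the figures below, since the reconstruction is approximate; as the post stresses, it is the relative effect of changing the parameters that matters.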

[DrugBaron MCM Fig 2]

With these parameters, the median fund returns 0.77 of the capital invested, with fully 63% of such funds losing money.  Only 3% of funds returned more than 1.5x invested capital.  That may be weaker than real-world performance – it is certainly much worse than the Index Ventures track record – but the arbitrary selection of exit values means that the absolute returns are not what is interesting.  The real interest comes from examining the relative impact of changing the model parameters.  How does altering the cost of R&D, or the quality and stringency of decision-making affect returns?

Obviously, if you run the simulation with a slightly higher quality of asset in the initial pool, returns are increased (doubling asset quality, so that 20% of the assets have their hidden flag marked as successful, increases median returns but only to 1.1 fold).  Returns, then, are not principally determined by how many pearls there are in the swamp.

[DrugBaron MCM Fig 3]

By contrast, cutting costs translates linearly into improved returns.  Halving the amount paid for each action-decision pair yields a median return of 1.58-fold, with fully one third of funds returning over 2x (compared with only 3% passing this threshold in the base model).

[DrugBaron MCM Fig 4]

Plotting return against the relative cost for each action-decision pair reveals two interesting phenomena: first, the gain in returns is slightly better than linear across the whole cost range, with particularly spectacular benefits when the cost becomes much smaller than the average value of the successful exit.  Second, and less intuitively, the standard deviation of returns across a hundred iterations of the same fund parameters declines as cost goes down.  In other words, not only are median returns increasing, but the chance of getting a return close to the median is also increased (which should comfort real-world investors who have only one, or a small number, of funds to worry about at any given moment).

[DrugBaron MCM Fig 5]
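
That cost sweep is easy to replicate with the simulate_fund sketch given earlier (the particular multipliers below are our choices, for illustration):

```python
import random
import statistics

# Scale the per-stage costs and watch both the median ROI and its spread.
for scale in (1.0, 0.5, 0.25):
    rng = random.Random(7)
    rois = [simulate_fund(rng, costs=[scale * c for c in (1, 5, 10)])
            for _ in range(100)]
    print(f"cost x{scale}: median ROI {statistics.median(rois):.2f}, "
          f"sd {statistics.stdev(rois):.2f}")
```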

More subtle models, varying not the total expenditure but when it is spent (in other words the gradient by which costs increase with each sequential action-decision pair) show that keeping early spending low is the critical parameter for improving returns.  This makes sense: at the beginning, the team is operating in the least data-rich environment, so making decisions virtually in the dark has a large element of random chance.  As data accumulates, the ability to make decisions improves (and, assuming sequential de-risking has been taking place, the average quality of the asset pool still alive is also increasing).  As DrugBaron has noted before, the important thing is that capital at risk is graded as steeply as possible from low to high as a project progresses.
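
The gradient experiment can be mimicked the same way, holding the total spend per fully progressed project fixed at €16m and varying only how it is distributed across the three stages (the particular splits are our own, chosen to make the point):

```python
import random
import statistics

# Same total per fully progressed project, different cost gradients.
for label, costs in [("steep (1, 5, 10)   ", [1, 5, 10]),
                     ("flat (5.33 x 3)    ", [16 / 3] * 3),
                     ("inverted (10, 5, 1)", [10, 5, 1])]:
    rng = random.Random(7)
    rois = [simulate_fund(rng, costs=costs) for _ in range(100)]
    print(f"{label}: median ROI {statistics.median(rois):.2f}")
```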

So much for costs.  What about decision quality?

The parameters that control the decision filter (false-negative rate and false-positive rate) can be tweaked in two different ways: they can be altered so that the decision quality improves (that is, so that more decisions are objectively correct compared to the hidden flag), or so that the stringency increases (for the same quality of decision making, the decision is more likely to be a kill).

DrugBaron MCM Fig 6

Strikingly, a four-fold improvement in filter quality (achieved by halving both the false-negative and false-positive rates) had only a marginal benefit on returns (median 0.86-fold versus 0.77 in the base model).  In fact, once the decision quality reaches a point where at least 2 out of 3 decisions are objectively correct, returns hardly increase at all beyond that point.

[DrugBaron MCM Fig 7]

In the part of the curve where we operate in the real world (with something of a majority of correct decisions), the model tells us that returns are very insensitive to further improvements in the decision-making quality.  The reason for this is simple: because early decisions have to be made on the basis of very little data (the model, like real-world early stage drug developers, is operating in what Daniel Kahneman called a ‘low validity environment’) random chance is as important as the ability to make decisions based on the data that has been revealed up until that point.
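
The corresponding quality sweep, again reusing the simulate_fund sketch with costs held fixed (the exact values printed will differ from the figures here, but the diminishing returns to better decision-making is the pattern to look for):

```python
import random
import statistics

# Improve the filter alone: halve both error rates, twice over.
for fp, fn in [(0.33, 0.10), (0.165, 0.05), (0.0825, 0.025)]:
    rng = random.Random(7)
    rois = [simulate_fund(rng, fp=fp, fn=fn) for _ in range(100)]
    print(f"fp={fp:.3f} fn={fn:.3f}: median ROI {statistics.median(rois):.2f}")
```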

[DrugBaron MCM Fig 8]

Now for the critical ‘experiment’: tying together the quality of the filter with the amount paid for each action (simulating the widely-held view that the more we spend on drug discovery, the better the dataset we accumulate on which to make decisions).  And the answer?  Assuming the decision quality is at least 60% correct decisions, it NEVER pays to spend more in order to increase the quality of the decision filter.  In other words, the productivity gain from the improved decision filter is more than offset by the productivity loss from the increase in capital at risk.
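
This coupled experiment can be sketched in the same framework; the assumption below, that halving both error rates doubles the cost of every action, is purely illustrative (the post does not specify the form of the coupling):

```python
import random
import statistics

# Buy a better filter by paying more per action: does it ever pay?
for label, scale, fp, fn in [("base filter, base cost", 1.0, 0.33, 0.10),
                             ("better filter, 2x cost", 2.0, 0.165, 0.05)]:
    rng = random.Random(7)
    rois = [simulate_fund(rng, costs=[scale * c for c in (1, 5, 10)],
                          fp=fp, fn=fn) for _ in range(100)]
    print(f"{label}: median ROI {statistics.median(rois):.2f}")
```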

Of course, like any simulation, these mathematical Monte Carlo models are an imperfect surrogate for the complex world of drug discovery and development.  The simplifications made are manifold (not least that all types of data provide the same contribution to decision quality per dollar they cost to obtain, which is unlikely to be true).  Moreover, the very structure of the model enshrines the principle of “working in the dark” at the start of the process and gradually gaining more visibility on the ‘real’ outcome as the process continues – in other words, the idea that drug development as a process has a substantial stochastic component.

At every stage of the drug development process there is more that you don’t know than you do know – ignore that at your peril

But once you accept the stochastic component is material (which current failure rates, even in phase 3, surely support), then the lessons these models teach us are likely to have some value, even if they should not be literally translated into a rigid new framework for drug development.  And those lessons are simple: spend as little as possible on as many unrelated projects as possible; apply a highly stringent filter, but do not pay ‘extra’ to try and improve the quality of the decision filter; and above all focus on reducing costs per project.

If that sounds familiar, that is no accident.  The insights from these models underpin the “kill the losers” strategy at the heart of asset-centric investing (as well as the antipathy towards a “pick the winners” strategy).  As real-world evidence starts to accumulate that supports the predictions from the model – as the recent paper from The Boston Consulting Group in Nature Biotechnology certainly does – then our confidence in putting these lessons into practice will only grow.

 

Cutting costs in pharma R&D comes with a caveat, however: read it here in Part 2

  • Kelvin Stott

    Great article, David! I have been running my own, similar Monte Carlo simulations for a while
    now, and I have come to similar conclusions, which may be
    counter-intuitive and send shivers down the spines of many drug
    developers who fear losing perfectly good drug candidates that are
    already very rare. The fact remains, however, that it may be better to
    lose a few good drug candidates in order to bring down the spiralling
    costs of drug development, and thus improve overall R&D
    productivity.

    Keep up the good work, and please do check out (and join in!) the discussion on this article at our LinkedIn group:

    • davidgrainger

      Thanks Kelvin

      You are correct to point out the importance of stringency as well as cost-cutting. Because of the huge capital cost of late-stage failures and the apparently inexhaustible supply of exciting new biological understanding, the drug development model is much more tolerant of false negatives (killing things that would have worked) than false positives (keeping alive things that will never work).

      The essential point, when it comes to stringency, is to ignore the sunk cost fallacy. Don’t even ask how much capital and other resources have been invested in a project up to today. Ask only whether the risk left in the project today is compatible with the TOTAL costs required to reach the end (approval and market success). If in doubt, kill!

      • Kelvin Stott

        Absolutely, I couldn’t agree with you more: It takes real courage to kill a false negative, but stupidity to gamble with a false positive!

        To be honest, every one of your blogs has resonated with me so strongly, every pharma R&D exec (and CEO!) should follow it very closely!



  • strehan

    David – does this model take into account the fact that spending more money may actually lead to enhanced decision-making (better false negative/positive rates) by uncovering additional data (whether it’s target activity, anti-tumor activity, etc.)?

    • davidgrainger

      Spending and decision quality are independent parameters (since it is not certain that there is a link), but we ran models (described in the fourth-last para, next to the summary table) in which we “bought” better decision-making in exactly the way you describe. And the answer was clear: the decline in ROI from the extra spending always outweighed the gain from the improved decision filters. This is because ROI decreases at least linearly with cost across the whole range, while ROI hardly increases at all, even with large improvements in decision quality, once a majority of decisions are correct (reflecting the fact that, at early stages in development, outcome has as much to do with chance as good decision making).

  • John Alan Tucker

    David, I’ve been quite intrigued by this study and by the issue. Thanks for sharing it.

    I tried to work through the model and I could not do it. As I understood:

    1) In the base case, 2/3 of “bad” compounds and 10% of “good” compounds are eliminated in preclinical / phase 1, and the same thing happens in Phase 2. Starting with 50 compounds (of which 10 are good), at the end of Phase 1 you have 9 good and 13 bad compounds for a phase transition rate of 22/50 = 44%. You have spent 50 x 5 million Euros = 250 million Euros

    2) These 22 compounds go through Phase 2 POC at a total cost of 22 x 10 million = 220 million Euros. You have roughly 12 survivors at this point and have spent a total of 470 million Euros.

    3) Of these 12 survivors, 6 are written off and 6 are licensed for 50 million Euros each, for a gross of 300 million Euros. You are 170 million Euros in the hole.

    Apparently I’ve misunderstood something, but it seems that in the “average” case with these inputs, drug discovery is not a money making operation. Where did I misunderstand?

    I guess the main question I have aside from this is where the 25 million Euro probability-adjusted value of a success comes from. Publicly traded companies with a modestly differentiated post-phase 2 compound would almost always have an enterprise value north of 100 million Euros, and 150 million would not be all that uncommon. Obviously an increase in the value of a success will tilt the field in favor of greater expenditure in support of not losing winners.

    I’ve noted this before, but cannot resist the temptation to note that I don’t think the Boston Consulting Group paper was spot on. It is very difficult for database companies to track preclinical projects, and I think undercounting seriously biases their results.

    • davidgrainger

      There are three steps – €1m, €5m and €10m spent on seed-stage preclinical, on formal preclinical plus phase 1, and on Phase 2a respectively. So the filters are applied 3 times, not twice. This improves the outcome because more losers are killed at only €1m cost. There are 80 assets, not 50, in each pool, and 50% of the losers are eliminated before you begin spending money.

      In the average case then, after the ‘pre-filter’ there are 8 winners and 36 losers alive. These 44 have €1m spent on them. 2/3 of the losers are killed, leaving 12 losers alive, as well as 9/10 of the winners: so 7 winners alive.

      19 x €5m is now spent (€95m – €139m cumulative total), and the filters re-applied. So now 4 losers alive and 6 (ish) winners. Spend €10m on these 10 projects for a total cumulative spend of €239m.

      Of the six winners, 3 are licensed for €50m – so €150m in. The ratio of €150m to €239m (0.63) is not far from the average ROI in the simulation using these conditions (0.77). The difference is likely because the returns are so “lumpy” that even 100 iterations gives an imperfect estimate (and my calculations above killed more good projects than the model would on average, to keep to integer answers – 7 is less than 9/10ths of 8; 6 is less than 9/10ths of 7 – again making the explicit ratio lower than the simulation ROI).

      But all in all, it confirms the model is working as you would expect.

      To your other points:

      The asset valuation at the end probably is a bit low – both with too high a write off and too low a transaction value. The numbers you quote, though, probably have some selection bias in them too. There are plenty of lower value Ph2 transactions whose upfronts are not made public for various reasons.

      In the end, however, it makes little difference what value you assign if your interest is in looking at how changes to the parameters affect the outcome. That’s why the ROIs in the summary table in the post are normalized to the return in the base model. You can make them all better or worse in proportion by changing the terminal asset values, but no new lessons will be learnt.

      So you can’t say from these models whether drug discovery and early development is a money-making or loss-making enterprise “on the average” – that was never their purpose. It would be pointless trying to make them so, since other assumptions in them are so “coarse” as to render the output irrelevant in absolute terms (for example, in reality we don’t apply ‘kill/continue’ decisions only 3 times, with millions of Euros spent between them!). The power of the models is in relative comparisons, changing the model parameters.

      Lastly, I agree that the limitations of the BCG analysis (and any other database-driven analysis) are significant – just as the assumptions in our models are. It is like drawing conclusions about strategy in thick fog – but as we don’t have better tools, we have to refine strategy using the information that is available. And do so cautiously, aware of the low-validity environment we are working in (just like early-stage drug discovery itself!)

      • John Alan Tucker

        Thank you David, this was an interesting exercise.

        I guess the point I was trying to get at is that it seems that the validity of your central conclusion would be sensitive to both your assumed cost structure and your upside. Reducing to the absurd, if the cost of preclinical, phase 1, and phase 2

  • Kelvin

    Hi David, for such a simple model with only a few steps and decisions with discrete outcomes, you don’t even need a Monte Carlo simulation: You can get an accurate expected outcome (without the noise of random sampling) by modeling a probability decision tree directly. The only reason I have been using Monte Carlo simulations myself is because I have been modeling the impact of uncertainty in all assumptions (as continuous variables) simultaneously. This gives a more complete picture of the overall risk (uncertainty), but the expected (average) outcome (ROI) is the same.
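
    For what it is worth, here is a minimal sketch of Kelvin's point using the base parameters described in the post (the algebra is our own reconstruction): the expected ROI of the chain falls straight out of the probability tree, with no sampling at all.

    ```python
    # Exact expected ROI from the probability tree -- no Monte Carlo needed.
    # Base-model parameters as described in the post; the algebra is ours.
    P_WORKS, PREFILTER = 0.10, 0.50
    FP, FN = 0.33, 0.10
    COSTS = [1, 5, 10]                        # EUR millions per stage
    EXIT_EV = 0.5 * 50                        # EUR 25m expected per true success

    p_good = P_WORKS                          # P(alive and genuinely working)
    p_bad = (1 - P_WORKS) * (1 - PREFILTER)   # P(alive but doomed)
    spend = 0.0
    for i, cost in enumerate(COSTS):
        spend += (p_good + p_bad) * cost      # expected spend at this stage
        if i < len(COSTS) - 1:                # imperfect kill/continue decision
            p_good *= 1 - FN
            p_bad *= FP
    print(f"expected ROI = {p_good * EXIT_EV / spend:.2f}")   # ~0.67
    ```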

    • davidgrainger

      Agreed. For the implementations discussed in this post, you can explicitly calculate the average returns. But you can do more complex things with the model than shown here – the purpose of the article was really to highlight the conclusions.

  • http://vishrasayan.blogspot.in/ Murali Apparaju

    Thanks a lot for sharing this.

    You have mentioned that while this model has been developed primarily for application to 1) an early stage venture fund spread across a portfolio of (single asset?) organizations, as at Index Ventures, it is equally applicable to 2) the drug discovery program of a big/mid pharma company with multiple pipeline candidates.

    At the end of my first read-through, though, the model appeared a lot more suited to the drug-discovery program of a big pharma!

    My surmise above is based on how the ‘costs’ mentioned at each stage came across as the ‘total cost’ of advancing that stage, and not the ‘portion of the total cost’ that a typical venture firm would have funded, assuming there would be other venture partners participating in each funding round at that decision stage. My question hence is:

    - Is the expectation that the model will be run by the lead VC based on the ‘total cost’ at each stage of the asset in question? If yes, how easy or tough is it to get the venture partners aligned when making decisions on which programs to kill and which to keep?

    My other question is whether this model would make more sense if applied specifically to each therapeutic segment rather than a potpourri of indications? If yes, how easy or tough is it for a single VC or a group of venture partners to agree or disagree on the decisions?