On Gordian pt.2

Thoughts on in vivo perturb-seq

It’s not clear to me why there aren’t more in vivo perturb-seq companies. It’s difficult to do, sure, but there must be more to it. Here, I break down my thoughts about in vivo perturb-seq as a technique, as well as Gordian as a whole. Ultimately, I’ll try to justify why I think it’s still a massively underutilized tool.

FYI: Gordian was essentially an in vivo perturb-seq company. I say “essentially” just for the sticklers who say it’s only perturb-seq if it’s done with CRISPR.

In vivo perturb-seq is conceptually simple

All you really need to do is take 100+ unique DNA or guide RNA sequences, slap them into a plasmid with or without a Cas, package the library into a delivery vehicle like an AAV or LNP, put it all into a diseased animal while targeting a specific organ, take out that organ, slap that organ onto a sequencer, then analyse the data. If you do it enough times, you can then build a computational model and compose “features” of disease and healthy conditions from the transcriptional states you observe across the hundreds of mini experiments you run in vivo. It’s even simpler if you do it in vitro, which is what most people do, especially in the cancer field.
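To make the “analyse the data” step a bit more concrete, here is a minimal sketch of the simplest possible readout: group cells by the perturbation they received and compare each group’s mean expression to control cells. Everything here is an assumption for illustration: the function names, the "non-targeting" control label, and the simulated expression values are hypothetical, and real pipelines need guide calling, normalisation, and batch correction on top of this.

```python
import numpy as np


def perturbation_effects(expr, guide_labels, control_label="non-targeting"):
    """Mean expression shift of each perturbation relative to control cells.

    expr: (n_cells, n_genes) array of log-normalised expression.
    guide_labels: one label per cell naming the perturbation detected in it.
    Returns {perturbation: (n_genes,) array of mean differences vs control}.
    """
    expr = np.asarray(expr, dtype=float)
    guide_labels = np.asarray(guide_labels)
    control_mask = guide_labels == control_label
    if not control_mask.any():
        raise ValueError("no control cells found")
    control_mean = expr[control_mask].mean(axis=0)

    effects = {}
    for guide in np.unique(guide_labels):
        if guide == control_label:
            continue
        effects[str(guide)] = expr[guide_labels == guide].mean(axis=0) - control_mean
    return effects


def simulate_pool(n_cells_per_guide=200, n_genes=5, seed=0):
    """Simulated pool (NOT real data): two hypothetical guides plus a control."""
    rng = np.random.default_rng(seed)
    shifts = {
        "non-targeting": np.zeros(n_genes),
        "guide_A": np.zeros(n_genes),
        "guide_B": np.zeros(n_genes),
    }
    shifts["guide_A"][0] = 2.0   # arbitrary: guide_A pushes gene 0 up
    shifts["guide_B"][3] = -2.0  # arbitrary: guide_B pushes gene 3 down

    expr_blocks, labels = [], []
    for guide, shift in shifts.items():
        noise = rng.normal(0.0, 0.5, size=(n_cells_per_guide, n_genes))
        expr_blocks.append(noise + shift)
        labels.extend([guide] * n_cells_per_guide)
    return np.vstack(expr_blocks), np.array(labels)


expr, labels = simulate_pool()
effects = perturbation_effects(expr, labels)
# Index of the gene each guide moves the most, by absolute mean shift.
top_gene = {guide: int(np.argmax(np.abs(e))) for guide, e in effects.items()}
```

The point of the sketch is only that every guide in the pool is its own mini experiment sharing the same control population, which is what makes pooling so efficient.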

In vivo perturb-seq is most useful for complex, polygenic diseases with no known master genes that significantly drive disease progression. It’s often called an “unbiased” approach because you can pool tens to hundreds of relatively random drugs or “unique interventions” at once and put them all in a living system to see what happens.

In academia, perturb-seq is great because you can “perturb” or change one to three genes in a new cell line you fiddled with, see what happens, do some qPCR and N=3 in vivo confirmation, and you get a publication.

In industry, if you do perturb-seq enough times, you can sometimes find a real drug with a therapeutic effect by identifying targets that would otherwise never be found. This is the single largest appeal of the technique, and the justification for Aviv Regev’s team at Genentech.

But when you look around the landscape of biotechs, very few companies are doing in vivo perturb-seq. There are largely three reasons:

1. Talent

In vivo perturb-seq usually requires relatively bespoke pipelines with custom primers that are hard to engineer from the ground up. Even with AI today, there are still relatively few people who can go beyond a Seurat tutorial to analyse completely new, complex perturbation data and draw meaningful results from it. These experts are the “technologist” scientist types who are not experts in any particular disease but are very good at developing tools and running the pipelines. These people are generally found in computationally focused labs like the Rahul Satija, Sarah Teichmann, Jay Shendure, and Fabian Theis labs, just to name a few.

Although fields like cancer were among the earliest to be cross-pollinated by these technologists, this hasn’t yet happened in fields like heart or lung disease, for example. So at some level, the lack of in vivo perturb-seq is a talent matching problem. Not enough technologists are being brought in by forward-looking disease experts to apply the technique to complex, polygenic diseases.

2. Delivery

When you’re testing your drug in vitro, you can just douse your new drug straight onto the plate of cells that you care about. You can mostly ignore that most intravenously delivered drugs end up in the liver. You rarely need a delivery vehicle like an LNP or AAV. It’s very easy to titrate the right dose because you can run 6 conditions in one small plate and check the results a few hours later. You can’t iterate as quickly in vivo, where each experiment also requires several animals per condition to control for variability.

3. In vivo features

Science becomes much harder when you go in vivo. Organ cross-talk? A competent immune system?? Hormones??? Why complicate your experiments for that sweet publication when you can just test your new CRISPR guide in an immortalized HEK cell line instead?

In vivo creates risk that’s often difficult to predict. I call this “real biology” but it’s often called “noise”. For example, livers are often hard to perturb over a long period of time because the cells that are perturbed turn over very quickly. Hepatocytes, the main cell type in the liver, are a rapidly dying and regenerating cell type that doesn’t last very long in the body. So it’s quite difficult to perturb them with a gene therapy long term, especially if you’re targeting a disease like MASH that requires a genetic intervention to regenerate the organ or maintain health for longer than the lifespan of the cell.

There’s a reason why most drugs that work in vitro don’t actually work in vivo. This is especially the case with AAVs, for example, where AAV tropism is almost impossible to gauge accurately in vitro. Even if you slap a mouse chondrocyte onto a dish and see that AAV81 transduces the cell in vitro, I can almost guarantee that it won’t work in vivo. It’s almost an AAV community joke to ask what serotype was used after every in vivo experiment presentation. The meta-point here is that in vivo is a coupled system, so the perturbation you think you’re applying is not always the perturbation the tissue actually experiences.

Few diseases make economic sense for in vivo perturb-seq

From a target economics perspective, it often doesn’t make sense for biotechs to put that much upfront capital into target discovery when validation is really what takes a long time. Setting up the platform alone could take $10M+ and 3-5 years. And that’s just the point where you start looking for targets, not validating them.

This business model also essentially excludes all non-deeptech investors. But even for a true deeptech investor, this stretches out the timeline for a liquidity event to the 12-15 year mark, assuming no acquisition. So one of the best ways to make this worthwhile is to find a market (TAM) large enough to justify the investment.

The most recent one is obesity, which many scientists would’ve laughed at 15 years ago, but now we’re seeing lots of target discovery partnerships going around. But besides that, one of the best ways to find these markets is to look for diseases with the highest ratio of DALYs to NIH (US) funding. In addition, these diseases also need to be sufficiently biologically complex/polygenic and affect enough people that a drug, if developed, can address a total market of >$10B and alleviate around an order of magnitude more in societal burden. That leaves diseases like:

  • COPD
  • Osteoarthritis
  • Major depression
  • Menopause (and related conditions)
  • Chronic pain conditions (e.g. endometriosis)
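For anyone who wants to run this screen themselves, the filter described above can be sketched in a few lines. This is only a rough illustration under stated assumptions: the `Indication` fields, the `shortlist` function, and every number in the example are hypothetical placeholders (not real DALY, funding, or market figures), and the societal-burden criterion is left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Indication:
    name: str
    dalys: float            # annual disability-adjusted life years
    nih_funding_usd: float  # annual NIH funding in USD
    market_usd: float       # estimated addressable market in USD
    polygenic: bool         # judged complex/polygenic by a disease expert


def shortlist(indications, min_market_usd=10e9):
    """Keep polygenic indications above the market floor, ranked by DALYs per funding dollar."""
    eligible = [
        i for i in indications
        if i.polygenic and i.market_usd > min_market_usd and i.nih_funding_usd > 0
    ]
    return sorted(eligible, key=lambda i: i.dalys / i.nih_funding_usd, reverse=True)


# Placeholder entries with made-up numbers, purely to exercise the filter.
candidates = [
    Indication("placeholder_A", dalys=4e6, nih_funding_usd=1e8, market_usd=2e10, polygenic=True),
    Indication("placeholder_B", dalys=1e6, nih_funding_usd=5e8, market_usd=3e10, polygenic=True),
    Indication("placeholder_C", dalys=9e6, nih_funding_usd=1e8, market_usd=5e9, polygenic=True),
    Indication("placeholder_D", dalys=6e6, nih_funding_usd=2e8, market_usd=1.5e10, polygenic=False),
]

ranked = [i.name for i in shortlist(candidates)]
```

With these placeholders, C drops out on market size and D on the polygenic flag, leaving A ranked ahead of B. The ranking is only as good as the inputs, and the polygenic flag in particular is a judgment call rather than something you can look up.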

These are attractive indications not just because they’re big, but because they’re polygenic, mechanistically heterogeneous, and target discovery is plausibly the rate limiter. In theory, the use of this big, expensive hammer could be justified if we could find and hit an equally large, expensive nail. There are a few of them, but overall these diseases need additional tools to find new drugs effectively.

It’s unclear whether these companies need to develop the full tech stack themselves, whether a series of clever partnerships can enable this, or whether they need more FRC-style orgs baked in. But if the recent biotech downturn taught us anything, it’s that the companies that are profitable along the way are the most resistant to macro headwinds. So I think the winning model is modular: a platform company that owns the perturbation and readout stack and partners with other companies for contracts and cashflow, all the while moving towards its North Star. Building on top of the existing biotech-pharma model, there are a few clever routes that I think haven’t been explored yet.

Closing thoughts:

I generally believe in vivo perturb-seq is still an incredibly under-utilised method. It’s long been proven in principle, but has barely been industrialised outside a narrow set of contexts. The bespoke-ness still gives it a competitive edge, and keeps the intrinsic value of the technology quite high, even if people like me yap about it online. I could go into much more detail about how it works and the tacit knowledge behind the method, but that still wouldn’t dilute the value because it’s incredibly difficult to execute as well.

Perturb-seq does not replace validation. It gives you higher-quality starting points: targets that move coherent in vivo programs rather than just in vitro phenotypes, which should reduce false starts downstream. In vivo perturb-seq platforms are expensive to set up, but prices continue to fall across the spectrum to enable more and more bespoke pipelines. That shift makes partnership-driven models more viable, and it creates a realistic wedge: a small number of giant, heterogeneous diseases where target discovery is plausibly the rate limiter and a single real hit justifies the entire stack.

What’s left now is finding great models that recapitulate the complex biology of the disease.