Trends in Oncology

Reducing Clinical Attrition: Why Stronger Data Needs to Be the Starting Point for Oncology R&D
Clinical attrition is oncology's oldest problem and, in many ways, still its biggest. The pattern is painfully familiar: a promising therapy emerges with encouraging preclinical data, advances through IND-enabling studies, shows early signals of activity in Phase I, and then fails in Phase II or Phase III. The financial costs of these failures are staggering: billions of dollars are lost globally each year. But the greater cost is measured in time and opportunity: years of development work invested, only to leave patients still waiting for new therapies.

Despite decades of innovation, attrition rates in oncology haven't shifted as much as the industry hoped. Better trial design and precision medicine strategies have helped in some areas, but the fundamental problem remains: the data we use to make early decisions often doesn't capture the full reality of patient biology.

Why attrition remains so stubborn

To understand why attrition persists, it's worth looking at the foundation. Much of oncology R&D still relies on models and datasets that, while powerful, were never meant to carry the full burden of translational decision-making.

Genomics is a prime example. Sequencing technologies have revolutionized how we classify tumors and identify potential targets. But tumors are not defined by their mutations alone. Transcriptional programs, proteomic signaling networks, post-translational modifications, and dynamic adaptations under treatment all contribute to how a tumor grows, evades therapy, and eventually resists intervention. A therapeutic strategy built solely on genetic alterations may miss the downstream biology that ultimately determines clinical outcome.

Cell lines are another example. They are convenient, reproducible, and cost-effective, which is why they remain a staple of preclinical research. But they lack the heterogeneity and clinical context of patient tumors. They rarely reflect the complexity of pretreated, metastatic disease, which is exactly the patient population that new oncology drugs are tested in. When early models don't reflect the biology of the intended clinical population, it is not surprising that translation breaks down.

Even when multi-omic data is available, it is often sparse, fragmented, or drawn from public repositories that were never built for translational research. These datasets may be useful for generating hypotheses, but they are rarely robust enough to support critical go/no-go decisions. And yet, in the absence of better resources, they are often asked to do just that.

The gap between data and patients

The result of this reliance on incomplete models is a gap between what we believe about a therapy and what happens when it is tested in patients. That gap is where attrition lives. It's the difference between a drug that looks compelling in preclinical settings and one that can't demonstrate sufficient efficacy or durability in the clinic.

One concrete example comes from RNA and protein data. In acute myeloid leukemia (AML), large-scale analyses have shown that only about 17% of genes show a positive correlation between RNA expression and protein expression. If you rely on transcriptomics alone to predict biology, you are often looking at signals that don't translate to the level where drugs actually act. This divergence isn't unique to AML; it's a reminder that single-omic views can give an incomplete or even misleading picture of tumor biology.
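To make the concordance point concrete, here is a minimal sketch of how a per-gene RNA-protein comparison might be run, assuming paired abundance matrices (genes x samples) from the same cohort. The file names, layout, and significance cutoff are illustrative assumptions, not the published AML analysis.

```python
# Sketch: per-gene RNA-protein concordance across paired tumor samples.
# Assumes two CSVs (genes x samples) with matching row/column labels;
# file names and cutoffs are illustrative, not the published AML analysis.
import pandas as pd
from scipy.stats import spearmanr

rna = pd.read_csv("rna_expression.csv", index_col=0)         # genes x samples
protein = pd.read_csv("protein_abundance.csv", index_col=0)  # genes x samples

# Restrict to genes and samples measured on both platforms.
genes = rna.index.intersection(protein.index)
samples = rna.columns.intersection(protein.columns)
rna, protein = rna.loc[genes, samples], protein.loc[genes, samples]

def gene_concordance(gene):
    """Spearman correlation of RNA vs. protein for one gene across samples."""
    rho, pval = spearmanr(rna.loc[gene], protein.loc[gene])
    return pd.Series({"rho": rho, "pval": pval})

results = pd.DataFrame({g: gene_concordance(g) for g in genes}).T

# Fraction of genes whose RNA level meaningfully tracks protein level.
concordant = (results["rho"] > 0) & (results["pval"] < 0.05)
print(f"{concordant.mean():.0%} of genes show positive RNA-protein concordance")
```

On a cohort like the AML analyses cited above, a calculation of this shape is what yields the roughly 17% figure: most genes simply fail the filter, which is the quantitative case for measuring proteins directly rather than inferring them from transcripts.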
Another example is resistance biology. In pretreated patient-derived xenograft (PDX) models, resistance pathways are often "baked in" from the start, reflecting real-world clinical histories. These mechanisms are invisible in naïve cell lines, which have never experienced therapy. By working with tumors that already carry resistance features, researchers can anticipate escape mechanisms before they derail late-stage trials.

What better data could look like

If we accept that the root of the problem lies in the misalignment between early data and patient biology, then the question becomes: what would better data look like?

First, it would need to come from models that are closer to the clinic. Patient-derived tumors, especially those from pretreated and metastatic populations, preserve the genetic complexity, phenotypic heterogeneity, and resistance mechanisms that cell lines cannot replicate. Studying these tumors allows us to see not just what cancer looks like in theory, but how it behaves in practice.

Second, it would need to move beyond genomics into multi-omic depth. Genes matter, but so do the transcripts they produce, the proteins they encode, the phosphorylation states that regulate those proteins, and the cell surface markers that mediate interactions with the immune system or targeted therapies. Each of these layers adds context. And critically, each reveals discrepancies that can't be seen in isolation.

Take cell surface proteomics as an example. Traditional workflows for mapping the "surfaceome" are plagued by noise and misclassification, which can lead to wasted effort on false targets. By capturing both plasma membrane and intracellular fractions, newer approaches provide cleaner enrichment and reduce false positives. The result is surface protein datasets that can actually be used to prioritize antibody, ADC, or CAR-T targets with confidence. That is not a small improvement; it is the difference between pursuing targets that work in patients and chasing dead ends. (A minimal scoring sketch for this kind of filter appears at the end of this section.)

Third, it would need to incorporate functional context. Static descriptions of tumors, no matter how deep, tell us what is there, but not how the tumor behaves under pressure. Functional assays that perturb tumors directly, whether through gene knockdowns or compound exposure, provide causal insights that correlation alone cannot. They show us how pathways respond, how resistance emerges, and how biology adapts.

For example, siRNA knockdown studies in 3D PDX models can reveal dependencies that aren't obvious from genomics alone. When combined with high-resolution transcriptomic profiling (what we call FunctionalSeq), these experiments identify pathways that are not only present but functionally essential. That is the kind of information that can distinguish a biomarker from a true therapeutic target.
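The dependency-ranking idea just described can be sketched generically. FunctionalSeq itself is not specified in this article, so the code below is only an illustration of the underlying logic: normalize each siRNA knockdown to a non-targeting control and rank genes by how consistently their loss suppresses viability across PDX models. The file layout, column names, and scoring rule are all assumptions.

```python
# Sketch: ranking functional dependencies from an siRNA knockdown screen.
# Generic illustration only; column names and the scoring rule are
# assumptions, not the FunctionalSeq implementation.
import pandas as pd

# Long-format results: one row per (PDX model, knockdown) with a viability
# readout, plus one non-targeting control row per model.
screen = pd.read_csv("sirna_screen.csv")  # columns: model, gene, viability

controls = (screen[screen["gene"] == "NON_TARGETING"]
            .set_index("model")["viability"])

# Normalize each knockdown to its own model's control.
screen["relative_viability"] = screen["viability"] / screen["model"].map(controls)

# Average across models: genes whose knockdown consistently drops viability
# are candidate functional dependencies, not just expressed bystanders.
dependencies = (screen[screen["gene"] != "NON_TARGETING"]
                .groupby("gene")["relative_viability"]
                .agg(["mean", "count"])
                .sort_values("mean"))
print(dependencies.head(10))
```

A gene that is merely present scores poorly here; a gene the tumor actually depends on shows a reproducible viability drop across models. That is the operational difference between a biomarker and a target.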
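The surface proteomics filter flagged earlier can be sketched in the same spirit: score each protein by its enrichment in the plasma membrane fraction relative to the intracellular fraction, and keep only strongly enriched candidates. The intensity columns and threshold are illustrative assumptions, not a specific published workflow.

```python
# Sketch: filtering candidate surface targets by fractionation enrichment.
# Column names and the enrichment threshold are illustrative assumptions.
import numpy as np
import pandas as pd

fractions = pd.read_csv("fractionation_intensities.csv", index_col="protein")
# columns: membrane_intensity, intracellular_intensity

eps = 1.0  # pseudo-count to stabilize ratios for low-intensity proteins
fractions["log2_enrichment"] = np.log2(
    (fractions["membrane_intensity"] + eps)
    / (fractions["intracellular_intensity"] + eps)
)

# Proteins strongly enriched in the membrane fraction are more credible
# surface targets; weakly enriched ones are likely contamination.
surface_candidates = fractions[fractions["log2_enrichment"] >= 2.0]
print(surface_candidates.sort_values("log2_enrichment", ascending=False).head())
```

The exact cutoff matters less than the principle: measuring both fractions gives you a denominator, and the denominator is what turns a noisy protein list into a prioritized one.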
What this means for pharma decision-making

For pharma R&D leaders, the implications of this kind of data are significant. Instead of evaluating a candidate on a narrow slice of biology, you can assess it in the context of real patient tumors, profiled across multiple dimensions. You can compare across cohorts, understand potential resistance pathways earlier, and align therapeutic strategies with the biology most likely to be encountered in the clinic.

Consider the decision to advance an asset into IND-enabling studies. In many organizations, this call is based primarily on genomic alignment, preliminary efficacy signals, and a limited view of resistance. Adding multi-omic and functional data changes the conversation. It allows teams to say, "Yes, the target is present at the DNA level, but the protein expression isn't concordant," or, "The mechanism looks strong in cell lines, but resistance emerges rapidly in pretreated PDX." These insights don't just inform science; they directly affect which assets receive investment and how development strategies are shaped.

A future with fewer blind spots

Attrition will always be a risk in oncology. Biology is unpredictable, and even the most carefully designed program may fail in the clinic. But the scale of today's attrition, and the cost it imposes, is not inevitable. By aligning our early data more closely with patient reality, we can reduce blind spots, strengthen translational confidence, and make smarter decisions about which programs deserve to move forward.

For pharma leaders, the payoff is not just fewer late-stage failures. It's a more rational, efficient, and patient-centered pipeline. And for patients, it's a better chance that the therapies entering trials are the ones with the greatest likelihood of success. That is the promise of stronger data, and the reason it should be the starting point for oncology R&D.

This isn't just data. It's a foundation for discovery.