Diversity Filtering: Antibody Discovery Loses Diversity Long Before Affinity Maturation Begins - A Significant Cause of Downstream Pipeline Attrition

Sequence clustering, ELISA endpoints, and FACS sort gates compress thousands of candidate sequences to fewer than 200, before any rate constant is measured.

Every antibody discovery campaign starts with diversity. Whether that comes from an immune response after immunization or from screening a large synthetic or naive library, the starting pool contains hundreds to thousands of distinct sequences targeting multiple epitopes, carrying different frameworks, and spanning a range of affinities. Most of that diversity gets discarded before a single kinetic measurement is made.

This post covers how that diversity is lost across the stages of a traditional antibody discovery workflow, the assay and throughput limitations that drive those losses, and how the resulting attrition in sequence space connects to preclinical and clinical failures downstream.

Traditional antibody discovery workflows filter diversity at two stages before any kinetic measurement is made. First, sequence clustering, expression scoring, and ELISA thresholds reduce the initial pool to 20–100 clones. Second, yeast display wash steps deplete fast off-rate binders during affinity maturation. The SPOC platform measures 1,152–2,304 variants before any downselection occurs, returning KD, kon, koff, polyreactivity, titer, and Fab assembly per variant. Selection is based on complete kinetic and biophysical data across a diverse hit pool rather than enrichment scores from a compressed one.

The Initial Downselection

Regardless of the discovery route, the starting pool after an initial screen is large. Immunization campaigns sequence hundreds to several thousand B-cell receptors after bleed and B-cell isolation. Phage or yeast display campaigns against synthetic or naive libraries produce enriched sequence pools of comparable size after panning rounds. In both cases, distinct clonotypes, multiple epitope bins, varied frameworks, and a spread of predicted affinities are all present at this stage.

What moves forward is a fraction of that. Depending on the campaign, 20 to 100 clones are selected for expression, purification, and initial binding confirmation based on sequence clustering, germline proximity as a developability proxy, and expression scores from display or transient transfection.

Early downselection is largely driven by enrichment score-based binning. Only the highest-affinity bins move forward. Moderate affinity binders never get a second look. Fast off-rate binders that would pass a more sensitive assay get binned with true non-binders if their ELISA signal falls below threshold (see our earlier blog post for more information on this). Frameworks that express moderately but carry useful paratope geometry are cut in favor of high-expressors. The downselection reflects the limits of the assay at that throughput, not the actual diversity of the starting pool.

The 20 to 100 clones that survive carry a compressed version of the original sequence diversity. Some epitope bins are over-represented because those antibodies happened to express cleanly and sort cleanly, while others drop out of the pool entirely.

Affinity Maturation Narrows Further

The survivors enter affinity maturation, most commonly by yeast display, using error-prone PCR or site-directed mutagenesis on the CDRs. A typical yeast display library spans 10⁷ to 10⁹ variants. Sorting by antigen concentration across multiple rounds selects for higher apparent affinity.

This step narrows diversity in two ways. First, yeast display selection depletes fast off-rate binders. Wash steps between staining and FACS measurement allow antigen to dissociate. Variants with fast koff lose signal before the sort gate is applied and sort into negative fractions, even if their kon is high and their equilibrium KD is tight. The output of yeast display affinity maturation is enriched for slow koff variants, independent of whether slow koff is the desired property for the therapeutic indication.
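To make the wash-step effect concrete, here is a minimal sketch of how much bound antigen survives a wash. It assumes simple 1:1 first-order dissociation with no rebinding or avidity, which is a simplification for surface-displayed antibodies. The 10-minute wash time and the three koff values are illustrative assumptions, not figures from any specific protocol.

```python
import math

def fraction_bound_after_wash(koff_per_s: float, wash_seconds: float) -> float:
    """Fraction of initially bound antigen still bound after a wash,
    assuming 1:1 first-order dissociation with no rebinding."""
    return math.exp(-koff_per_s * wash_seconds)

# Hypothetical values chosen only for illustration.
WASH_SECONDS = 600  # assumed 10 minutes of total wash and handling time

for koff in (1e-4, 1e-3, 1e-2):
    remaining = fraction_bound_after_wash(koff, WASH_SECONDS)
    print(f"koff = {koff:.0e} 1/s -> {remaining:.1%} of signal remains")
```

Under these assumptions, a variant with koff of 1e-4 per second keeps roughly 94% of its signal, while a variant with koff of 1e-2 per second keeps well under 1%. The second variant falls below the sort gate regardless of its kon.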

Second, affinity maturation is applied to the clones that entered it. If a clonotype was dropped during the initial downselection, no amount of library construction recovers it. The sequence space explored during maturation is bounded by the input sequences. A framework absent from the 20 to 100 lead clones will not appear in the matured output.

After multiple sort rounds, the surviving pool has been filtered twice: once at the clonotype level, and once at the kinetic level within each clonotype. The variants that reach SPR confirmation are a small, non-representative sample of the original starting pool.

What the Attrition Costs

Biologics fail in the clinic at high rates. It is less widely appreciated that attrition is also high in the preclinical phase, where leads fail to meet preclinical endpoints before reaching IND filing. Off-target binding, poor pharmacokinetics, and suboptimal potency each contribute. Many of these preclinical and clinical failures trace back to properties never measured during discovery: koff-dependent tissue retention, polyreactivity against human proteins, or developability issues that only surface when a molecule is reformatted from scFv to Fab or IgG.

The sequence that would have addressed those problems may have been present in the original B-cell repertoire or the synthetic library. It was cut during the initial downselection or lost during affinity maturation, not because it was inferior, but because the assays at those stages could not distinguish it from the variants that were dropped. Early downselection based solely on enrichment scores and ELISA endpoints is not a biology problem. It is a measurement problem.

Testing 1,152–2,304 Variants Before the Downselection Determines the Outcome

The SPOC platform synthesizes antibodies directly from DNA sequences on SPR biosensor chips using cell-free in vitro transcription and translation. The platform requires no cloning, expression, or purification steps. A single chip runs full SPR kinetics on 1,152–2,304 variants in parallel, delivering kon, koff, KD, Rmax, and t₁/₂ per variant in three weeks.
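For readers less familiar with how these parameters relate, the sketch below uses the standard 1:1 binding relationships KD = koff / kon and t₁/₂ = ln(2) / koff. The two variants and their rate constants are hypothetical, chosen to show that an equilibrium KD alone does not distinguish fast from slow kinetics.

```python
import math

def equilibrium_kd(kon_per_M_s: float, koff_per_s: float) -> float:
    """KD in molar for a 1:1 interaction: KD = koff / kon."""
    return koff_per_s / kon_per_M_s

def dissociation_half_life(koff_per_s: float) -> float:
    """Half-life of the complex in seconds: t1/2 = ln(2) / koff."""
    return math.log(2) / koff_per_s

# Two hypothetical variants with the same KD but different kinetics:
# name -> (kon in 1/(M*s), koff in 1/s)
variants = {
    "fast_on_fast_off": (1e6, 1e-2),
    "slow_on_slow_off": (1e4, 1e-4),
}

for name, (kon, koff) in variants.items():
    kd_nM = equilibrium_kd(kon, koff) * 1e9
    half_life_s = dissociation_half_life(koff)
    print(f"{name}: KD = {kd_nM:.1f} nM, t1/2 = {half_life_s:.0f} s")
```

Both hypothetical variants have a KD of 10 nM, yet their complex half-lives differ a hundredfold. That difference is invisible to an endpoint readout, which is why per-variant rate constants matter.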

Whether the starting point is an immunization campaign or a phage or yeast display screen, a campaign can send the top 50 to 200 initial hit sequences to HyperKinetiKx for binding validation with full kinetics before any clone is discarded. Alternatively, 1,000 to 2,000 variants from a phage or yeast display campaign can route directly to SPOC without a prior downselection step. Each sequence receives a quantitative kinetic profile covering fast and slow off-rate binders alike. Moderate affinity hits that would have been binned out by a sort gate are retained in the dataset. Epitope binning can be performed on the initial pool on the same chip to select a diverse set of binders for the subsequent functional assays.

From that validated hit pool, rational design or AI-driven CDR mutagenesis libraries are constructed across several distinct clonotypes rather than a single lead series. A single HyperSynaptiKx chip runs 1,152–2,304 variants spanning multiple frameworks, epitope families, and CDR mutation combinations. Alternatively, the validated hits can be expanded into a fully degenerate yeast display library, which is then downselected to the few thousand best performers for SPOC on-chip analysis. Following SPOC analysis, each variant returns antigen binding kinetics alongside polyreactivity data, titer, and Fab assembly confirmation in the same assay. Early downselection can be based on epitope specificity, polyreactivity profiles, construct stability, and kinetics together, rather than enrichment scores alone.
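As an illustration of multi-parameter downselection, the sketch below filters a small pool on Fab assembly, KD, and polyreactivity, then keeps the tightest binders from each epitope bin rather than the tightest binders overall. The `Variant` fields, the thresholds, and the data are all hypothetical. This is not the SPOC data format or any specific selection scheme.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Variant:
    name: str
    epitope_bin: str
    kd_nM: float
    koff_per_s: float      # carried along so kinetics stay visible downstream
    polyreactivity: float  # hypothetical score, lower is better
    fab_assembled: bool

def select_diverse_panel(variants, max_kd_nM, max_polyreactivity, per_bin):
    """Drop variants that fail any filter, then keep the `per_bin`
    tightest binders from each epitope bin."""
    by_bin = defaultdict(list)
    for v in variants:
        if (v.fab_assembled
                and v.kd_nM <= max_kd_nM
                and v.polyreactivity <= max_polyreactivity):
            by_bin[v.epitope_bin].append(v)
    panel = []
    for bin_name in sorted(by_bin):
        ranked = sorted(by_bin[bin_name], key=lambda v: v.kd_nM)
        panel.extend(ranked[:per_bin])
    return panel

# Hypothetical example data.
pool = [
    Variant("A1", "bin1", 0.5, 1e-4, 0.1, True),
    Variant("A2", "bin1", 0.8, 2e-4, 0.9, True),   # fails polyreactivity
    Variant("A3", "bin1", 2.0, 5e-3, 0.2, True),
    Variant("B1", "bin2", 40.0, 3e-2, 0.1, True),  # moderate affinity, fast off-rate
    Variant("C1", "bin3", 1.0, 1e-4, 0.1, False),  # Fab did not assemble
]

panel = select_diverse_panel(pool, max_kd_nM=100.0, max_polyreactivity=0.5, per_bin=1)
print([v.name for v in panel])  # ['A1', 'B1']
```

A ranking by affinity alone would fill the panel from bin1 before considering B1. Selecting per bin keeps the moderate-affinity, fast off-rate binder against a different epitope in play.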

Retaining Diversity at the Hit Stage Changes Downstream Risk

A candidate portfolio selected with full kinetics across multiple clonotypes, frameworks, and epitope bins carries different properties than one selected by display enrichment and ELISA endpoint alone. The probability that at least one candidate carries a koff appropriate for the therapeutic target, a polyreactivity profile consistent with clinical advancement, and a framework that reformats cleanly goes up accordingly.
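The effect of portfolio diversity can be sketched with a simple independence model: if each clonotype has some probability p of yielding a developable candidate, the chance that at least one succeeds is 1 - (1 - p)^n. The value of p below is a placeholder, not a measured rate. The independence assumption is also the point of the example: variants from the same clonotype tend to share liabilities, so n is better read as the number of distinct clonotypes than the number of clones.

```python
def prob_at_least_one(p_success: float, n_independent: int) -> float:
    """P(at least one success) = 1 - (1 - p)^n for independent candidates."""
    return 1.0 - (1.0 - p_success) ** n_independent

# p = 0.05 is a hypothetical per-clonotype success probability.
P_SUCCESS = 0.05
for n in (1, 5, 20):
    print(f"{n} independent clonotypes -> {prob_at_least_one(P_SUCCESS, n):.0%}")
```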

Affinity maturation can still follow for the top candidates. But it starts from a better-characterized, more diverse input set, and the decisions about which clonotypes to mature are made with kinetic data rather than display enrichment scores.

Testing 1,152–2,304 variants before the downselection is finalized keeps more of that diversity in play long enough to make an informed choice, and reduces the probability that the sequence most suited for clinical development was dropped in round one.

For more information on the SPOC platform, get in touch:

Contact Us

1600 Adams Drive

Suite 236



Menlo Park, CA 94025

7201 E Henkel Way

Suite 285



Scottsdale, AZ 85255

480-219-9506

Privacy & Conditions

All rights reserved © 2024
