Engineering protein developability

Project: Research project

Project Details

Description

Project Summary

Engineered proteins drive biotechnology and biology as therapeutics, diagnostics, and reagents. While engineering the primary function (e.g. binding) has become relatively robust, identifying proteins that meet the rigors of clinical and practical use remains highly problematic. Many proteins suffer from poor developability (instability, insolubility, low expression, and non-specific binding) that ultimately limits utility. Protein sequence space is immense, and sequence-function relationships are complex. Thus, more efficient methods are needed to map the sequence-developability landscape and reduce the practical burden of identifying developable sequences.

Robust, quantitative knowledge of the landscape would [1] empower design of libraries constrained to developable space, [2] enable design of mutants to rescue lead molecules with compelling primary function but developability liabilities, and [3] deepen fundamental insight into the factors that dictate protein robustness. Efficient techniques could also [4] enable integrated, upstream library-scale selection for developability. Sequence models are moderately predictive of select metrics but do not robustly quantify the overall landscape. Current experimental approaches are inefficient. Thus, creation and implementation of a platform for library-scale evaluation of protein developability would be transformative, accelerating and streamlining the protein discovery and engineering pipeline. We will pursue this objective via three specific aims.

Aim 1: Engineer a platform for library-scale evaluation of protein developability. We will develop a set of cellular assays that couple [i] genotype-phenotype linkage, [ii] phenotypic stratification via flow cytometric sorting or growth competition, and [iii] deep sequencing to efficiently quantify metrics of developability for millions of protein variants, thereby scaling developability characterization by orders of magnitude relative to current methods.

Aim 2: Elucidate sequence-developability landscapes for binder scaffolds. We will quantitatively elucidate sequence-developability landscapes for three ligand scaffolds to [i] empower mutant design to rescue lead molecules with compelling primary function but developability liabilities and [ii] advance fundamental understanding of the physicochemical principles that dictate protein robustness.

Aim 3: Design constrained libraries that yield significantly more developable binders. We will use this insight to design and test constrained combinatorial libraries that yield significantly more developable binders than an unconstrained library. We will test three hypotheses: [i] nested sampling enables efficient traversal of the sequence-developability landscape to identify an effective constrained library design; [ii] developable space is more evolvable than naïve space (provided library-scale diversity is maintained); and [iii] the intersection of developability and evolvability can be effectively identified via these methods.
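
As an illustration of how the sort-and-sequence readout described in Aim 1 might be turned into a per-variant number, the sketch below computes a weighted-mean score from deep-sequencing read counts across sorted bins. This is a generic, simplified estimator and not the project's stated analysis method; the function name `sort_seq_scores`, its parameters, and the toy counts are all hypothetical and chosen only for illustration.

```python
def sort_seq_scores(bin_counts, bin_values, bin_cell_fractions=None, min_reads=10):
    """Estimate a per-variant score as a weighted mean of bin values.

    Hypothetical helper (assumption, not from the project):
      bin_counts         -- list of dicts, one per sorted bin, variant -> read count
      bin_values         -- representative phenotype value for each bin
      bin_cell_fractions -- fraction of sorted cells collected in each bin
                            (defaults to equal fractions)
      min_reads          -- variants with fewer total reads are skipped
    """
    n_bins = len(bin_counts)
    if len(bin_values) != n_bins:
        raise ValueError("bin_values must have one entry per bin")
    if bin_cell_fractions is None:
        bin_cell_fractions = [1.0 / n_bins] * n_bins
    if len(bin_cell_fractions) != n_bins:
        raise ValueError("bin_cell_fractions must have one entry per bin")

    # Total reads per bin, used to normalize for unequal sequencing depth.
    totals = [sum(counts.values()) for counts in bin_counts]
    variants = set().union(*bin_counts)

    scores = {}
    for variant in variants:
        raw_reads = sum(counts.get(variant, 0) for counts in bin_counts)
        if raw_reads < min_reads:
            continue  # too few reads for a stable estimate
        # Weight of each bin: the variant's read frequency in that bin,
        # scaled by the share of cells that were sorted into the bin.
        weights = [
            bin_cell_fractions[i] * bin_counts[i].get(variant, 0) / totals[i]
            if totals[i] else 0.0
            for i in range(n_bins)
        ]
        weight_sum = sum(weights)
        if weight_sum == 0:
            continue
        scores[variant] = sum(w * x for w, x in zip(weights, bin_values)) / weight_sum
    return scores


# Toy data: two bins (low and high phenotype) and two variants.
toy_bins = [{"A": 90, "B": 10}, {"A": 10, "B": 90}]
toy_scores = sort_seq_scores(toy_bins, bin_values=[1.0, 3.0])
```

In this toy case, variant A is enriched in the low bin and variant B in the high bin, so B receives the higher score. A real analysis would also need to handle replicate variability, sampling noise, and bin-boundary effects, none of which this sketch addresses.
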
Status: Active
Effective start/end date: 9/1/22 to 5/31/24

Funding

  • National Institute of General Medical Sciences: $338,220.00
  • National Institute of General Medical Sciences: $338,220.00
