Project Details
Description
Project Summary
Engineered proteins drive biotechnology and biology as therapeutics, diagnostics, and reagents. While
engineering the primary function – e.g., binding – has become relatively robust, identifying proteins that meet the
rigors of clinical and practical use remains highly problematic. Many proteins suffer from poor developability –
instability, insolubility, low expression, and non-specific binding – that ultimately limits utility. Protein sequence
space is immense, and sequence-function relationships are complex. Thus, more efficient methods are needed
to map the sequence-developability landscape and reduce the practical burden of identifying developable
sequences. Robust, quantitative knowledge of the landscape would [1] empower design of libraries constrained
to developable space, [2] enable design of mutants to rescue lead molecules with compelling primary function
but developability liabilities, and [3] enhance fundamental insight into the factors that dictate protein
robustness. Efficient techniques could also [4] enable integrated, upstream library-scale selection for
developability. Sequence models are moderately predictive of select metrics but do not robustly quantify the
overall landscape. Current experimental approaches are inefficient. Thus, creation and implementation of a
platform for library-scale evaluation of protein developability would be transformative to accelerate and
streamline the protein discovery and engineering pipeline. We will pursue this objective via three specific aims.
Aim 1: Engineer a platform for library-scale evaluation of protein developability. We will develop a set of
cellular assays that couple [i] genotype-phenotype linkage, [ii] phenotypic stratification via flow cytometric sorting
or growth competition, and [iii] deep sequencing to efficiently quantify metrics of developability for millions of
protein variants, thereby increasing the throughput of developability characterization by orders of magnitude relative to current
methods. Aim 2: Elucidate sequence/developability landscapes for binder scaffolds. We will quantitatively
elucidate sequence-developability landscapes for three ligand scaffolds to [i] empower mutant design to rescue
lead molecules with compelling primary function but developability liabilities and [ii] advance fundamental
understanding of the physicochemical principles that dictate protein robustness. Aim 3: Design constrained
libraries that yield significantly more developable binders. We will use this insight to design and test
constrained combinatorial libraries that yield significantly more developable binders than an unconstrained library.
We will test three hypotheses: [i] nested sampling enables the efficient traversal of the sequence/developability
landscape to identify an effective constrained library design; [ii] developable space is more evolvable than naïve
space (provided library-scale diversity is maintained); and [iii] the intersection of developability and evolvability
can be effectively identified via these methods.
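
The summary does not specify how sorted-and-sequenced data are converted into a per-variant developability metric. As a minimal sketch only, assuming each variant has deep-sequencing read counts in each sorted bin and each bin has a representative value (for example, a mean fluorescence or an ordinal rank), one common approach is a frequency-normalized, bin-weighted mean. The function name, the `min_reads` threshold, and the input layout below are hypothetical illustrations, not the project's actual analysis.

```python
from typing import Dict, List


def bin_weighted_scores(
    counts: Dict[str, List[int]],
    bin_values: List[float],
    min_reads: int = 10,
) -> Dict[str, float]:
    """Estimate a per-variant score from sort-and-sequence read counts.

    counts:     variant -> read counts, one entry per sorted bin (hypothetical layout)
    bin_values: representative value of each bin (assumed known from the sort gates)
    min_reads:  variants with fewer total reads are skipped as too noisy (assumed cutoff)
    """
    n_bins = len(bin_values)
    for variant, c in counts.items():
        if len(c) != n_bins:
            raise ValueError(f"{variant}: expected {n_bins} bins, got {len(c)}")

    # Total reads per bin, used to correct for unequal sequencing depth across bins.
    totals = [sum(c[b] for c in counts.values()) for b in range(n_bins)]

    scores: Dict[str, float] = {}
    for variant, c in counts.items():
        if sum(c) < min_reads:
            continue
        # Within-bin frequency of this variant.
        freqs = [c[b] / totals[b] if totals[b] > 0 else 0.0 for b in range(n_bins)]
        norm = sum(freqs)
        if norm == 0:
            continue
        # Mean bin value, weighted by the variant's depth-normalized frequency.
        scores[variant] = sum(f * v for f, v in zip(freqs, bin_values)) / norm
    return scores


if __name__ == "__main__":
    example = {"A": [90, 10, 0], "B": [0, 10, 90], "C": [1, 1, 1]}
    print(bin_weighted_scores(example, bin_values=[1.0, 2.0, 3.0]))
```

In this toy input, variant `A` is enriched in the lowest bin, `B` in the highest, and `C` is dropped for insufficient reads. A real pipeline would also need to account for the number of cells collected per bin and for replicate variability.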
| Status | Active |
|---|---|
| Effective start/end date | 9/1/22 → 5/31/24 |
Funding
- National Institute of General Medical Sciences: $338,220.00
- National Institute of General Medical Sciences: $338,220.00