Publication Details
SoluProtMutDB: A manually curated database of protein solubility changes upon mutations
Velecký Jan
Hamšíková Marie
Štourač Jan
Musil Miloš, Ing., Ph.D. (DIFS)
Damborský Jiří, prof. Mgr., Dr. (UMEL)
Bednář David
Mazurenko Stanislav, Ph.D.
Mutational database, Protein engineering, Protein yield, Machine learning, Protein aggregation
Protein solubility is an attractive engineering target primarily due to its
relation to yields in protein production and manufacturing. Moreover, better
knowledge of the mutational effects on protein solubility could connect several
serious human diseases with protein aggregation. However, we have limited
understanding of the protein structural determinants of solubility, and the
available data have mostly been scattered in the literature. Here, we present
SoluProtMutDB, the first database containing data on protein solubility changes
upon mutations. Our database accommodates 33 000 measurements of 17 000 protein
variants in 103 different proteins. The database can serve as an essential source
of information for researchers designing improved protein variants or those
developing machine learning tools to predict the effects of mutations on
solubility. The database comprises all the previously published solubility
datasets and thousands of new data points from recent publications, including
deep mutational scanning experiments. Moreover, it features many available
experimental conditions known to affect protein solubility. The datasets have
been manually curated with substantial corrections, improving suitability for
machine learning applications. The database is available at
loschmidt.chemi.muni.cz/soluprotmutdb.
@article{BUT180690,
author="Jan {Velecký} and Marie {Hamšíková} and Jan {Štourač} and Miloš {Musil} and Jiří {Damborský} and David {Bednář} and Stanislav {Mazurenko}",
title="SoluProtMutDB: A manually curated database of protein solubility changes upon mutations",
journal="Computational and Structural Biotechnology Journal",
year="2022",
volume="20",
number="1",
pages="6339--6347",
doi="10.1016/j.csbj.2022.11.009",
issn="2001-0370",
url="https://reader.elsevier.com/reader/sd/pii/S2001037022005025?token=47A572AECBA6EE5C334462194E5EFC9034D184A71EC41EC7B29311B976C644C5FCFBB5B1B449E84DD1A99A04BCFA8708&originRegion=eu-west-1&originCreation=20230110080940"
}