J 2021

SoluProt: prediction of soluble protein expression in Escherichia coli

HON, Jiří; Martin MARUSIAK; Tomas MARTINEK; Antonín KUNKA; Jaroslav ZENDULKA et al.

Basic information

Original name

SoluProt: prediction of soluble protein expression in Escherichia coli

Authors

HON, Jiří (203 Czech Republic, belonging to the institution); Martin MARUSIAK (203 Czech Republic); Tomas MARTINEK (203 Czech Republic); Antonín KUNKA (203 Czech Republic, belonging to the institution); Jaroslav ZENDULKA (203 Czech Republic); David BEDNÁŘ (203 Czech Republic, belonging to the institution) and Jiří DAMBORSKÝ (203 Czech Republic, guarantor, belonging to the institution)

Edition

Bioinformatics, Oxford (UK), Oxford University Press, 2021, 1367-4803

Other information

Language

English

Type of outcome

Article in a journal

Country of publisher

United Kingdom of Great Britain and Northern Ireland

Confidentiality degree

is not subject to a state or trade secret

References:

RIV identification code

RIV/00216224:14310/21:00119188

Organization

Faculty of Science – Repository

DOI

http://dx.doi.org/10.1093/bioinformatics/btaa1102

UT WoS

000649437800004

EID Scopus

2-s2.0-85100389869

Keywords in English

SOLUBILITY; WEBSERVER; TOPOLOGY; ACCURATE

Links

EF17_043/0009632, research and development project. GJ20-15915Y, research and development project. LM2018131, research and development project. LM2018140, research and development project. 814418, internal Repo code. 857560, internal Repo code.
Changed: 16/2/2024 04:08, RNDr. Daniel Jakubík

Abstract

In the original

Motivation: Poor protein solubility hinders the production of many therapeutic and industrially useful proteins. Experimental efforts to increase solubility are plagued by low success rates and often reduce biological activity. Computational prediction of protein expressibility and solubility in Escherichia coli using only sequence information could reduce the cost of experimental studies by enabling prioritization of highly soluble proteins.

Results: A new tool for sequence-based prediction of soluble protein expression in E. coli, SoluProt, was created using the gradient boosting machine technique with the TargetTrack database as a training set. When evaluated against a balanced independent test set derived from the NESG database, SoluProt's accuracy of 58.5% and AUC of 0.62 exceeded those of a suite of alternative solubility prediction tools. There is also evidence that it could significantly increase the success rate of experimental protein studies.
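
The two metrics quoted in the abstract, accuracy and AUC on a balanced test set, can be illustrated with a minimal Python sketch. This is not SoluProt's code or its evaluation pipeline: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and the numbers it produces have no relation to the published results.

```python
from typing import Sequence


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of proteins whose thresholded score matches the label.

    The 0.5 threshold is an assumption, not a value from the paper.
    """
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen soluble protein
    scores higher than a randomly chosen insoluble one (ties count 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data, not taken from the paper: 1 = soluble, 0 = insoluble.
# The set is balanced (three of each class), as in the abstract's test setup.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]

toy_accuracy = accuracy(toy_labels, toy_scores)
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"accuracy={toy_accuracy:.3f} AUC={toy_auc:.3f}")
```

On a balanced set a predictor that guesses at random is expected to reach about 50% accuracy and an AUC of about 0.5, which is the baseline against which the reported 58.5% and 0.62 should be read.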