Incorrect pKa values have slipped into chemical databases and could distort drug design


Confusion over nomenclature for zwitterionic compounds leads to misrepresented data in widely used repositories such as ChEMBL

Inconsistencies have been uncovered in how acid dissociation constants for zwitterionic compounds are recorded in chemical databases, and in how they are used in modelling.1 This could have a significant impact on areas such as drug design and environmental chemistry, where pKa values play a crucial role. ‘We found that the ChEMBL database, one of the largest data repositories for biochemicals – and frequently used as a data corpus for training pKa models – includes many incorrect pKa values due to this nomenclature issue,’ says Jonathan Zheng from the Massachusetts Institute of Technology, who participated in the study.