Abstract
In recent years, considerable effort has been invested in developing classification models for prospective hERG inhibitors, owing to the implications of hERG blockade for cardiotoxicity and the low throughput of functional hERG assays. We present novel approaches for binary classification which seek to separate strong inhibitors (IC50 < 1 µM) from ‘non‐blockers’ exhibiting moderate (1–10 µM) or weak (IC50 ≥ 10 µM) inhibition, as required by the pharmaceutical industry. Our approaches are based on (discretized) 2D descriptors, selected using Winnow, with additional models generated using Random Forest (RF) and Support Vector Machines (SVMs). We compare our models to those previously developed by Thai and Ecker and by Dubus et al. The purpose of this paper is twofold: (1) to propose that our approaches are competitive with those currently proposed in the literature, with Matthews Correlation Coefficients from 0.40 to 0.87 on truly external test sets when extrapolation beyond the applicability domain was not evident and sufficient quantities of data were available for training; and (2) to highlight key issues associated with building and assessing truly predictive models, in particular the considerable variation in model performance when training and testing on different datasets.
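The two quantities the abstract relies on, the 1 µM labelling threshold and the Matthews Correlation Coefficient (MCC), can be illustrated with a minimal sketch. This is not the authors' code: the function names, the IC50 values, and the predictions below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math

# Threshold taken from the abstract: strong inhibitors have IC50 < 1 µM.
STRONG_THRESHOLD_UM = 1.0


def is_strong_inhibitor(ic50_um):
    """Return True for a strong inhibitor, False for a 'non-blocker'."""
    return ic50_um < STRONG_THRESHOLD_UM


def matthews_corrcoef(y_true, y_pred):
    """Compute the MCC from two equal-length sequences of booleans."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the MCC is reported as 0 when the denominator vanishes.
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical IC50 values (µM) and hypothetical model predictions.
ic50_values = [0.05, 0.8, 3.0, 12.0, 0.3, 25.0]
y_true = [is_strong_inhibitor(v) for v in ic50_values]
y_pred = [True, False, False, False, True, True]

mcc = matthews_corrcoef(y_true, y_pred)
print(f"MCC = {mcc:.3f}")
```

The MCC ranges from -1 to 1 and uses all four cells of the confusion matrix, which makes it less sensitive than plain accuracy to an imbalance between strong inhibitors and non-blockers.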
References
57
Referenced
35
10.1038/nature04710
10.1093/eurheartj/ehi312
- ICH Anon. ICH S7B Guideline (Step 4 Version): “The Nonclinical Evaluation of the Potential for Delayed Ventricular Repolarization (QT Interval Prolongation) By Human Pharmaceuticals” to be found under http://www.ich.org/fileadmin/Public_Web_Site/ICH_Products/Guidelines/Safety/S7B/Step4/S7B_Guideline.pdf 2005.
10.1007/s10822-006-9095-6
10.1002/jat.1395
10.1016/j.vascn.2006.02.003
10.1002/cmdc.200700264
10.2174/157340908784533256
10.1007/s11030-009-9117-0
- H. H. Tie MD thesis University of New South Wales (Australia) 2002.
10.1038/bjp.2008.267
10.1016/S0008-6363(02)00846-5
10.1021/tx800063r
10.1002/1439-7633(20020503)3:5<455::AID-CBIC455>3.0.CO;2-L
10.1002/cmdc.200500099
10.1016/j.bmc.2008.01.017
10.1002/cmdc.201000024
10.1021/mp700124e
10.1016/j.bmcl.2005.03.080
10.1016/j.bmc.2007.06.028
10.1016/j.vascn.2004.06.003
10.1111/j.1365-2125.2008.03327.x
10.1016/j.vascn.2005.04.002
10.1021/jm060500o
- J. K. Martin D. S. Hirschberg “Small Sample Statistics for Classification Error Rates I: Error Rate Measurements” to be found under http://www.ics.uci.edu/∼dan/pub.html 1996.
- D. J. C. MacKay Information Theory, Inference and Learning Algorithms 2003 p. 479.
- G. C. Cawley in Proc. IJCNN‐2006, Vancouver, BC, Canada 2006 p. 1661.
10.1016/j.jspi.2007.06.003
10.1093/bioinformatics/16.5.412
10.1016/j.compbiolchem.2004.09.006
10.1021/ci700350n
10.1021/ci800079x
- C.‐W. Hsu C.‐C. Chang C.‐J. Lin “A Practical Guide to Support Vector Classification” to be found under http://www.csie.ntu.edu.tw/∼cjlin/papers/guide/guide.pdf 2010.
- K. Morik P. Brockhausen T. Joachims in Proc. ICML‐99 Bled Slovenia 1999 pp. 268–277 to be found under http://www.cs.cornell.edu/People/tj/publications/morik_etal_99a.pdf.
- T. Joachims in Advances in Kernel Methods – Support Vector Learning 1999 p. 169.
10.1023/A:1010933404324
- The R Project for Statistical Computing http://www.r‐project.org/.
10.1021/ci034160g
10.1021/ci100050t
- Python Programming Language (v. 2.5.2) http://www.python.org/.
- U. M. Fayyad K. B. Irani in Proc. IJCAI‐93 Chambery France 1993 pp. 1022–1027 available from http://sci2s.ugr.es/keel/pdf/algorithm/congreso/fayyad1993.pdf.
- J. Demsar B. Zupan “Orange: From Experimental Machine Learning to Interactive Data Mining” White Paper to be found under http://www.ailab.si/orange/wp/orange.pdf 2004 (10.1007/978-3-540-30116-5_58).
10.1016/S1093-3263(00)00068-1
- Chemaxon Budapest Hungary http://www.chemaxon.com.
- Pipeline Pilot Student Edition (v. 6.1.5) Accelrys San Diego CA USA http://accelrys.com/products/pipeline‐pilot/.
10.1186/1752-153X-2-5
- MOE (Molecular Operating Environment) v. 2008.10 Chemical Computing Group Montreal Canada http://www.chemcomp.com.
10.1021/ci00062a008
- M. Kubat S. Matwin in Proc. ICML‐97 1997 pp. 179–186 available from http://sci2s.ugr.es/keel/pdf/algorithm/congreso/kubat97adressing.pdf.
10.1186/1471-2105-10-213
10.1021/ci050308f
10.1021/jm060379l
10.1021/jm801236n
10.2174/1568016033477432
10.1038/sj.bjp.0705336
10.1002/cmdc.200900374
10.1021/jm900002x
Dates
Type | When |
---|---|
Created | May 6, 2011, 7:35 a.m. |
Deposited | Oct. 9, 2023, 3:18 p.m. |
Indexed | Aug. 31, 2025, 6:25 a.m. |
Issued | May 6, 2011 |
Published | May 6, 2011 |
Published Online | May 6, 2011 |
Published Print | May 16, 2011 |
@article{Marchese_Robinson_2011, title={Development and Comparison of hERG Blocker Classifiers: Assessment on Different Datasets Yields Markedly Different Results}, volume={30}, ISSN={1868-1751}, url={http://dx.doi.org/10.1002/minf.201000159}, DOI={10.1002/minf.201000159}, number={5}, journal={Molecular Informatics}, publisher={Wiley}, author={Marchese Robinson, Richard L. and Glen, Robert C. and Mitchell, John B. O.}, year={2011}, month=may, pages={443–458} }