Core Concepts
Treating drug SMILES as text sentences and applying basic NLP methods can lead to competitive scores in drug classification tasks.
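As a rough illustration of what "treating SMILES as text" can mean, the sketch below turns a SMILES string into a bag of character 3-grams, the kind of feature a basic NLP classifier could consume. Character-level windows are an assumption here (the summary does not say how the n-grams were tokenized), and the helper names `char_ngrams` and `bag_of_ngrams` are hypothetical, not taken from the paper.

```python
from collections import Counter


def char_ngrams(smiles: str, n: int = 3) -> list[str]:
    """Slide a window of width n over the SMILES string, one character at a time.

    Assumption: each character is a token. A real pipeline might instead
    split on atoms, so that 'Cl' or 'Br' stay together.
    """
    return [smiles[i:i + n] for i in range(len(smiles) - n + 1)]


def bag_of_ngrams(smiles: str, n: int = 3) -> Counter:
    """Count how often each n-gram occurs, ignoring where it occurs."""
    return Counter(char_ngrams(smiles, n))


# Aspirin, written as a SMILES string.
aspirin = "CC(=O)OC1=CC=CC=C1C(=O)O"
features = bag_of_ngrams(aspirin)

# Show the most frequent 3-grams and their counts.
for ngram, count in features.most_common(5):
    print(ngram, count)
```

Counts like these can be fed to any standard text classifier. The order of n-grams is discarded, which is the usual bag-of-words trade-off: simple features at the cost of losing long-range structural information.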
Stats
Complex chemical structures are represented as SMILES strings in machine learning-based research.
Experimental results show competitive scores when drug SMILES are treated as text sentences.
The dataset includes drug classes such as dermatologic, antiinfective, and antineoplastic.
The top-performing model was AtomPair+MLP, with an accuracy of 0.799.
Quotes
"We pose a single question: What if we treat drug SMILES as conventional sentences?"
"Our experiments affirm the possibility with very competitive scores."
"3-gram models achieve around 73.7% accuracy and 76.4% precision."