OSCAR and OSCAR-ChEBI

Principal Investigator: 
Dr Peter Murray-Rust

OSCAR is our system for analysing and interpreting chemical text. OSCAR can locate chemical entities in a wide range of documents (theses, patents, articles, etc.) and indicate the likelihood that any term (single- or multi-token) is chemical. Examples are "hydrochloric acid", "testosterone", "THF" and "He". OSCAR uses a variety of tagging and machine-learning techniques to analyse tokens and can, for example, assign different likelihoods to the two occurrences of "He" in "He used He". With its sister program OPSIN (which understands over 90% of systematic chemical names), we can interpret or look up almost all compounds likely to be found in the literature. In the example shown, OSCAR has recognised the chemical terms (green) and shows the structure of one of them.
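
The idea of giving the same token different likelihoods depending on its context can be illustrated with a toy sketch. This is not OSCAR's implementation or API: the lexicon, the scores, the position rule and the function name `chemical_likelihoods` are all hypothetical, and the sketch handles single tokens only, whereas OSCAR relies on tagging and machine-learning techniques and also handles multi-token terms.

```python
import re

# Hypothetical toy lexicon: token -> assumed likelihood of being chemical.
# The numbers are illustrative only; they are not produced by OSCAR.
LEXICON = {
    "testosterone": 0.99,
    "thf": 0.95,
    "he": 0.50,  # ambiguous: helium or the pronoun
}


def chemical_likelihoods(text):
    """Return (token, score) pairs for one sentence using toy rules."""
    tokens = re.findall(r"[A-Za-z]+", text)
    scored = []
    for position, token in enumerate(tokens):
        # Unknown tokens get a score of 0.0 in this sketch.
        score = LEXICON.get(token.lower(), 0.0)
        if token == "He":
            # Toy context rule (an assumption): a sentence-initial "He" is
            # probably the pronoun; a capitalised "He" elsewhere is
            # probably helium.
            score = 0.1 if position == 0 else 0.9
        scored.append((token, score))
    return scored


if __name__ == "__main__":
    for token, score in chemical_likelihoods("He used He"):
        print(token, score)
```

Under these toy rules, the first "He" in "He used He" scores 0.1 and the second scores 0.9, which mirrors the kind of distinction described above without claiming anything about how OSCAR computes its likelihoods.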

OSCAR is widely used in the biosciences, and we have recently been funded by EPSRC through the OMII software group to work with the European Bioinformatics Institute on analysing chemicals in abstracts and helping to maintain their ChEBI collection of chemical terms.

 

 

Summary
Date: 
Feb 2010 - Aug 2010
Research group: 
Murray-Rust group
Members: 
Jim Downing