Daniel Lowe is a PhD student, funded by Boehringer Ingelheim, who is developing new tools to analyse patents. We are particularly interested in pharmaceutical patents. The initial work has focussed on developing a name-to-structure program, OPSIN, a Java package that converts (English) IUPAC chemical names to chemical structures. It is open source and freely available from SourceForge for use as either a standalone application or a library.
• OPSIN combines good recall with exceptional precision and speed of execution (compared with the commercial offerings tested), and it is free and open source.
• Because it is open source, OPSIN can be extended by interested members of the community. OPSIN’s fragment dictionaries are stored as XML and can be easily edited.
• OPSIN is currently employed as the IUPAC name resolution software in OSCAR3, a tool for recognising chemical names in text. It is hoped that other text mining tools will in future employ OPSIN for their chemical name-to-structure needs. OPSIN is available to try out on our website at http://opsin.ch.cam.ac.uk/.
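To give a feel for why XML fragment dictionaries are easy to extend, here is a minimal Java sketch. It is not OPSIN's code and does not use OPSIN's actual dictionary schema: the `fragments` and `fragment` elements, the `smiles` attribute and the class name `FragmentDictionarySketch` are all hypothetical, chosen only for illustration. The sketch loads a small dictionary that maps alkane name stems to SMILES strings, using only the standard Java XML parser.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Illustrative only: the element and attribute names below are hypothetical
// and are NOT OPSIN's real dictionary format.
final class FragmentDictionarySketch {

    // A toy dictionary: each entry maps a name stem to a SMILES string.
    static final String SAMPLE_XML =
        "<fragments>"
      + "<fragment smiles=\"C\">meth</fragment>"
      + "<fragment smiles=\"CC\">eth</fragment>"
      + "<fragment smiles=\"CCC\">prop</fragment>"
      + "</fragments>";

    // Parses the XML and returns stem -> SMILES, preserving document order.
    static Map<String, String> load(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName("fragment");
            Map<String, String> stemToSmiles = new LinkedHashMap<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element entry = (Element) nodes.item(i);
                stemToSmiles.put(entry.getTextContent().trim(),
                                 entry.getAttribute("smiles"));
            }
            return stemToSmiles;
        } catch (Exception e) {
            // Wrap parser errors so callers need not handle checked exceptions.
            throw new IllegalArgumentException("Could not parse dictionary", e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> fragments = load(SAMPLE_XML);
        System.out.println(fragments.get("eth")); // prints "CC"
    }
}
```

In a design along these lines, supporting a new stem means adding one line to the XML rather than changing any Java code, which is the kind of edit a community contributor could plausibly make.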
Future work will concentrate on entity recognition and knowledge abstraction from patents.