by Dr. Alex M. Clark
The term sketching primitives refers to a collection of operations used to draw a molecular structure, using minimal input from the user, and having as much work as possible done by algorithmic inference. The sketching primitives were eventually designed, implemented and refined with the primary purpose of making chemical structure drawing effective on portable devices, but that's not where the train of thought started out.
Since this article was originally written (August 2010), the technical details have been published in the peer-reviewed
Alex M. Clark: "Basic primitives for molecular diagram sketching",
SketchEl is an open source project that I started in 2005, and worked on in bits and pieces of spare time. It is a conventional chemical structure sketcher, which really doesn't push many boundaries or user interface paradigms. The original motivation was mostly selfish, as is often the way with open source projects: I found myself needing to view or edit molecular connection tables (e.g. MDL MOL files) on a regular basis. All other things being equal, I prefer to do serious work on a Linux box whenever possible, and at the time, there were only a handful of chemistry sketcher applications that ran on Linux, and they were all too crude for my needs. Nowadays there are quite a few, but half a decade is a long time for software. Several no-fee packages were available for Windows back then (e.g. ISIS/Draw and ACD/Sketch), but using them involved getting up and logging onto a different computer in a different room.
When that got sufficiently inconvenient, I started writing my own, and posting updates on SourceForge. The project is still actively maintained, and can be found at: sketchel.sourceforge.net. The name "SketchEl" is short for "Sketching of Elements". Or something. All the good names are taken, and I never managed to think of a better one.
The SketchEl application has served well for my own needs, and has provided a vehicle for experimentation with a few ideas which made their way into the project. One of these is the idea of a "datasheet editor". One of the things that surprises me about the cheminformatics industry is that despite the ubiquity of the MDL SD-file format, which is the de facto standard for collecting together multiple molecular structures and associating them with various textual or numeric information, there are very few ways to edit these files. In fact, it would seem that for ad hoc use, the most popular method is to use Microsoft Excel, paste in structures as embedded graphics from a variety of sources, and maybe cobble together some macros and plugins to accomplish the task at hand.
It is not too hard to write a piece of software to view an MDL SD file, if one already has access to a rendering library for the molecular structures. But writing an editor is a task requiring far greater effort. The obvious way to go about building an editor for relatively small tables of molecules and data (small being on the order of thousands of entries, rather than millions) is to construct a spreadsheet-like interface, where rows and columns intersect to form cells. There are many software libraries available to take care of the basics of viewing and navigating such a construct. To this must be added a way to render the molecules within individual cells, and to edit the molecules.
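To give a sense of why the viewing half is the easy part, here is a minimal sketch in Python of reading an SD file into spreadsheet-like rows. It assumes the common layout of the format: each record is a MOL block ending with an "M  END" line, followed by data items introduced by a "> <FieldName>" header and closed by a blank line, with "$$$$" terminating the record. The function names and the sample record are made up for illustration, and the sketch does not attempt to cover every corner of the format.

```python
from typing import Dict, List, Tuple

Row = Tuple[str, Dict[str, str]]  # (MOL block text, data fields)


def _parse_record(lines: List[str]) -> Row:
    """Split one record into its MOL block and its named data fields."""
    # The MOL block runs up to and including the "M  END" line;
    # if that line is missing, treat the whole record as the MOL block.
    end = next((i for i, ln in enumerate(lines) if ln.startswith("M  END")),
               len(lines) - 1)
    molblock = "\n".join(lines[:end + 1])
    fields: Dict[str, str] = {}
    name = None
    for line in lines[end + 1:]:
        if line.startswith(">"):
            # Header line such as "> <Name>": the field name sits inside <...>.
            lo, hi = line.find("<"), line.rfind(">")
            name = line[lo + 1:hi] if 0 <= lo < hi else None
            if name is not None:
                fields[name] = ""
        elif name is not None:
            if line.strip() == "":
                name = None  # a blank line closes the current data item
            else:
                fields[name] = fields[name] + "\n" + line if fields[name] else line
    return molblock, fields


def parse_sdfile(text: str) -> List[Row]:
    """Return one (molblock, fields) row per record in the SD-file text."""
    rows: List[Row] = []
    record: List[str] = []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            if any(ln.strip() for ln in record):
                rows.append(_parse_record(record))
            record = []
        else:
            record.append(line)
    if any(ln.strip() for ln in record):  # tolerate a missing final "$$$$"
        rows.append(_parse_record(record))
    return rows


# Hypothetical one-record sample, used only to exercise the parser.
SAMPLE = """methane
  sketch

  1  0  0  0  0  0  0  0  0  0999 V2000
    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
M  END
> <Name>
methane

> <MW>
16.04

$$$$
"""

for molblock, fields in parse_sdfile(SAMPLE):
    print(len(molblock.splitlines()), "MOL block lines;", fields)
```

Each row maps naturally onto a spreadsheet line: the MOL block goes to a rendering library for the structure cell, and the fields fill the text and numeric columns.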
It turns out that the editing of molecules is the sticking point, because it just doesn't work very well with the current paradigms for sketching. Nonetheless, I implemented a molecular spreadsheet editor for SketchEl.
Editing character data works just like in Excel, i.e. if you start typing, your keystrokes will affect the content of the currently selected cell. But what to do with initiating the editing of a molecule? The only viable option is to open a separate editing window. But that is quite unwieldy. The window has to be closed when finished. What if the user clicks on the spreadsheet, and maybe selects a different cell? What if the window just gets lost behind something else? Because the window looks the same as when editing a molecule that isn't linked to a cell inside a datasheet, maybe the two states should be interconvertible? And so on. It works fine, but it would be much nicer if the editing was inline, carried out within the cell itself, in the same window as the spreadsheet view.
The idea of putting the sketcher inline is tempting, but it brings up a lot of issues. For a start, the popular user interface paradigms for sketchers are quite wasteful of screen real estate, and SketchEl is no exception. These ideas were worked out when it was assumed to be acceptable to take up most of the screen and claim plenty of room for menus and toolbars. This would mean that the cell being edited would have to be expanded until the molecular structure is displayed in enough detail to click and drag on any part of it. Furthermore, the sketcher has its own menu bar, and so does the spreadsheet view. Having a window with two menu bars is clearly not a good idea, so they would have to be merged. The same thing would have to be done for keyboard shortcuts. With enough careful design, no doubt this could be accomplished well enough.
But when one thinks of a molecule cell as being as similar as possible to a cell containing text or numbers, the reasons for various distinctions come to mind: most of the keyboard is used to edit the content, while the mouse, or special keys that are not involved in text entry, are used to activate commands. Hotkeys and mouse gesture paradigms have been designed so that they do not interfere with normal typing, an idea which goes back to the early days of graphical word processors.
So what if the molecule editing could all be done by using keys which are normally used for entering text and moving the text cursor around? What about having a cursor which traverses around atoms and bonds in a molecule, and binding the QWERTY keys to specific actions for modifying the molecule... could that make an effective editor?
If all of the necessary functionality for drawing and editing molecules could be expressed using this limited user input set, it would be possible to build an inline molecular structure editor. Editing a molecule might need a bit more screen space than merely viewing it, but not as much as a normal sketcher. And the editing process could be as orthogonal to the hotkeys and menu items of the spreadsheet view as are the non-molecular cells.
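As a rough illustration of what such an editor could look like in code, here is a minimal sketch in Python of a cursor that sits on an atom and a table that binds keys to editing actions. Everything in it is an assumption made for the example: the class names, the key bindings (arrow keys to move, h/j/k/l to grow a new carbon atom, capital letters to set the element) and the naive one-unit placement are hypothetical, and are not the sketching primitives themselves, which lean far more heavily on algorithmic inference.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Atom:
    element: str
    x: float
    y: float


@dataclass
class Molecule:
    atoms: List[Atom] = field(default_factory=list)
    bonds: List[Tuple[int, int, int]] = field(default_factory=list)  # (from, to, order)


@dataclass
class Editor:
    mol: Molecule
    cursor: int = 0  # index of the atom currently under the cursor

    def move(self, dx: float, dy: float) -> None:
        """Move the cursor to the best-aligned atom lying in direction (dx, dy)."""
        here = self.mol.atoms[self.cursor]
        best, best_score = None, math.inf
        for i, atom in enumerate(self.mol.atoms):
            vx, vy = atom.x - here.x, atom.y - here.y
            along = vx * dx + vy * dy          # progress in the requested direction
            if i == self.cursor or along <= 0:
                continue                       # skip self and atoms behind the cursor
            across = abs(vx * dy - vy * dx)    # sideways drift from that direction
            score = along + 2 * across         # prefer near atoms, penalise drift
            if score < best_score:
                best, best_score = i, score
        if best is not None:
            self.cursor = best

    def set_element(self, element: str) -> None:
        """Change the element of the atom under the cursor."""
        self.mol.atoms[self.cursor].element = element

    def grow(self, dx: float, dy: float) -> None:
        """Add a carbon offset by (dx, dy), single-bonded to the cursor atom."""
        here = self.mol.atoms[self.cursor]
        self.mol.atoms.append(Atom("C", here.x + dx, here.y + dy))
        new = len(self.mol.atoms) - 1
        self.mol.bonds.append((self.cursor, new, 1))
        self.cursor = new  # the cursor follows the newly added atom


# Hypothetical key bindings: ordinary typing and navigation keys only.
KEYMAP: Dict[str, Callable[[Editor], None]] = {
    "LEFT": lambda e: e.move(-1, 0),
    "RIGHT": lambda e: e.move(1, 0),
    "UP": lambda e: e.move(0, 1),
    "DOWN": lambda e: e.move(0, -1),
    "h": lambda e: e.grow(-1, 0),
    "l": lambda e: e.grow(1, 0),
    "k": lambda e: e.grow(0, 1),
    "j": lambda e: e.grow(0, -1),
    "C": lambda e: e.set_element("C"),
    "N": lambda e: e.set_element("N"),
    "O": lambda e: e.set_element("O"),
}


def press(editor: Editor, keys: List[str]) -> None:
    """Feed a sequence of key names to the editor, ignoring unbound keys."""
    for key in keys:
        action = KEYMAP.get(key)
        if action is not None:
            action(editor)


# Start from a single carbon, grow two atoms to the right, turn the last
# one into oxygen, then step the cursor back one atom.
editor = Editor(Molecule([Atom("C", 0.0, 0.0)]))
press(editor, ["l", "l", "O", "LEFT"])
print([a.element for a in editor.mol.atoms], editor.mol.bonds, editor.cursor)
```

Because every action is reached through a plain key name, the same table could in principle be driven by a desktop keyboard, a trackpad with a small keyboard, or buttons on a touch screen, without changing the editing logic.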
As intriguing as I find this idea, it might not be ready for prime time. Almost every chemist who has ever needed to present a document which involves a picture of a molecule is familiar with the standard sketcher paradigm, and probably already owns a copy of ChemDraw. Learning an arcane set of keyboard commands just to avoid a popup window probably seems like a step back to the 1980s, when software was often packaged with cue cards that listed the keyboard commands, to be placed beside the monitor.
So perhaps the idea of a well designed set of sketching primitives that can be used to draw a complex molecular diagram with a minimum number of keystrokes and cursor navigations may not have much market appeal for desktop software, but what about some other arena?
While not an avid fan of ultraportable devices myself, I read a fair amount of computer news, and an increasing proportion of headlines over the last couple of years have been talking about iPhones, BlackBerries, Android and a whole lot of other portable devices with their own operating systems and APIs.
When one looks at a classic BlackBerry with its trackpad/trackwheel and keyboard, after thinking about how to design an inline structure editor for a molecular spreadsheet, it looks like the entire device is built around this very same constraint. Putting aside the idea of the spreadsheet for just a moment, designing a structure editor using keyboard shortcuts and directional cursor navigation is the only way to make a structure editor on such a device. If such a system could be designed, and successfully implemented for the BlackBerry, then it would be the first and only chemical structure editor for the device.
Both a challenge and an opportunity. What more would anybody need to quit the day job and start a new company?
That is more or less the story of how Molecular Materials Informatics, Inc. was conceived. It took 3 months to work out which of the sketching primitives were necessary to construct a fully featured chemical structure diagram editor, to implement them using the BlackBerry API, and to test them thoroughly. The testing had to ensure that using a trackpad and a keyboard is a viable way to draw, view and edit datasheets full of molecular structures and associated data, and that the result is intuitive, with a reasonably shallow learning curve. After the BlackBerry product was finished, the same ideas were ported to the iPhone platform. While the sketching primitives use the same algorithms for both platforms, the iPhone has only a touch screen and no keyboard, so the user interface needed to be significantly reimagined in order to make the same functionality available to a very different style of device.
Both the BlackBerry and iPhone versions of the Mobile Molecular DataSheet are available on their respective company stores. You can decide for yourself how well these ideas work, now that they have found themselves a home.