Applescript Tutorial 9: Embedding chemical information in image files

Rich Apodaca has been discussing embedding molecular information in images of molecules, such as a PNG file depicting a 2D structure. As we move to a more web-centric view of the world, it is apparent that much research information will be available only via the web. While images of chemical structures are usually adequate for a human viewer, the chemical structure in an image cannot be indexed and subsequently searched. In the previous tutorial (Applescript Tutorial 8) I showed how to use AppleScript to extract the information from the PNG file and then display the structure in a couple of chemical display packages in an editable form.

A couple of people have asked for a tool to embed metadata into images, and this script shows how to add a SMILES string to a PNG file using ChemBioDraw.

This script again relies on the excellent ExifTool by Phil Harvey (http://www.sno.phy.queensu.ca/~phil/exiftool/). ExifTool is a platform-independent Perl library plus a command-line application for reading, writing and editing meta information in images. To add custom tags you need to edit and install the ExifTool configuration file.
The file and full instructions can be downloaded from the ExifTool website, but you need to download the source distribution, not the Mac OS X binary. To save you time you can just download the file here. The part of the file that adds the tags is shown below. This new configuration file will allow you to add metadata for SMILES, molfile and sdf to PNG files.

# The %Image::ExifTool::UserDefined hash defines new tags to be added
# to existing tables.
%Image::ExifTool::UserDefined = (
    # new PNG tags are added to the PNG::TextualData table:
    'Image::ExifTool::PNG::TextualData' => {
        SMILES => { },
        molfile => { },
        sdf => { },
    },
);

The downloaded file needs to be installed in the same folder as exiftool, which will be /usr/bin if you installed it using the binary installer. You will need to rename the file to .ExifTool_config (note the leading period!). This is easiest to do using the Terminal. You will need the admin password.

sudo mv /Users/Chris/Desktop/ExifTool_config /usr/bin/.ExifTool_config

If you have installed ExifTool anywhere else you will need to install the configuration file in the appropriate place. You also need OpenBabel, and the easiest way to install it is to install the ChemSpotlight plugin from here.
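For an ExifTool installed somewhere other than /usr/bin, the target location for the configuration file can be worked out from the path of the exiftool executable. The following is a minimal shell sketch, assuming the configuration file belongs in the same directory as the executable (as described above) and that the path you pass is the real executable rather than a symlink. The function name config_path_for is hypothetical and used only for illustration.

```shell
# Print where .ExifTool_config would go for a given exiftool executable.
# Assumption: the config file lives in the same directory as the executable.
# config_path_for is a hypothetical helper name, not part of ExifTool.
config_path_for() {
    printf '%s/.ExifTool_config\n' "$(dirname "$1")"
}

# Usage (not run here):
#   config_path_for "$(command -v exiftool)"
```

You would then use the printed path as the destination of the mv command shown earlier.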

First draw the structure in ChemDraw and save it as a PNG. Ignore the warning that chemical information will be lost; we are going to embed it in the metadata :-). Leave ChemBioDraw open with the structure in place.
We then tell ChemBioDraw to get the SMILES string, with a little checking to make sure a structure is selected. The next part uses OpenBabel to generate a canonical SMILES. The last part creates the shell script telling ExifTool to embed the metadata into the selected image file.

The first part of the script below simply asks the user to select the image file to add chemical metadata to. It then generates the POSIX path to the file, since ExifTool requires UNIX-style paths. ExifTool will add the metadata and also keep a copy of the original PNG file at the same location.
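The two shell steps the script drives can also be tried by hand in the Terminal. This is a rough sketch, not the script itself: the function names canonical_smiles and embed_smiles and the BABEL and EXIFTOOL variables are hypothetical, introduced only so the tool paths can be overridden. It assumes babel is at /usr/local/bin/babel, that exiftool is on your PATH with the custom configuration file installed, and that babel may append a tab-separated title after the SMILES, which cut removes.

```shell
# Tool locations; override these variables if your install differs (assumed paths).
BABEL="${BABEL:-/usr/local/bin/babel}"
EXIFTOOL="${EXIFTOOL:-exiftool}"

# Pass a SMILES string through OpenBabel (SMILES in, SMILES out).
# cut -f1 keeps only the text before the first tab, dropping any title field.
canonical_smiles() {
    printf '%s\n' "$1" | "$BABEL" -ismi -osmi | cut -f1
}

# Write the SMILES tag into a PNG. $1 = SMILES string, $2 = path to the PNG.
# Quoting each argument keeps SMILES punctuation and spaces in paths intact.
embed_smiles() {
    "$EXIFTOOL" "-SMILES=$1" "$2"
}

# Usage (not run here):
#   embed_smiles "$(canonical_smiles 'OCC')" "$HOME/Desktop/my structure.png"
```

After a successful write, ExifTool leaves the untouched original next to the edited file with _original appended to its name.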

You can check that the metadata has been embedded using the ExifTool command.

exiftool -v path to imagefile
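If you only want the SMILES tag rather than the full verbose dump, you can ask ExifTool for that one tag. A small sketch, assuming the custom configuration file is installed so that ExifTool recognises the SMILES tag name. The read_smiles function and the EXIFTOOL variable are hypothetical names used for illustration.

```shell
# Tool location; override if exiftool is not on your PATH (assumption).
EXIFTOOL="${EXIFTOOL:-exiftool}"

# Print only the value of the SMILES tag from a PNG. $1 = path to the PNG.
# -s3 asks ExifTool for the bare value, without the tag name.
read_smiles() {
    "$EXIFTOOL" -s3 -SMILES "$1"
}

# Usage (not run here):
#   read_smiles "$HOME/Desktop/my structure.png"
```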

If you install the AppleScript in the folder

Applications:CS ChemOffice 2008:CS ChemDraw:ChemDraw Items:

then the next time you start up ChemDraw there will be a "Scripts" menu in the top menu bar, and you will be able to run the script from within ChemDraw.

The AppleScript to embed chemical metadata

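-- Ask for the PNG file and convert it to a POSIX path for the shell
set theFile to (choose file with prompt "Select the PNG file to add metadata:") as alias
set the_posix_file to POSIX path of theFile

tell application "CS ChemBioDraw Ultra"
	-- If nothing is selected (Copy is disabled), select everything first
	if not (enabled of menu item "copy") then
		do menu item "Select All" of menu "Edit"
	end if
	set the_SMILES to SMILES of selection
end tell

--use openbabel to get canonical SMILES
--(quoted form protects the SMILES string from the shell)
set the_ob_script to "echo " & quoted form of the_SMILES & " | /usr/local/bin/babel -ismi -osmi"
try
	set ob_smiles to (do shell script the_ob_script)
on error errMsg
	-- Stop here: without ob_smiles the ExifTool step below cannot run
	display dialog ("OpenBabel failed: " & errMsg) buttons {"OK"} default button 1
	return
end try

--display dialog ob_smiles
--quoted form also keeps file paths containing spaces intact
set theScript to "exiftool -SMILES=" & quoted form of ob_smiles & " " & quoted form of the_posix_file

do shell script theScript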


We would obviously want to add other file formats to the metadata, but unfortunately there is a bug in ChemBioDraw such that you cannot script saving as other file formats. I'll show a way to get around this in the next tutorial.
