An example of implementing an MMDS webservice in PHP is provided, with source code.

Overview

Requirements

Framework

List

Specification

Invocation

Utilities

Summary

Overview


The webservice protocol used to communicate between the MMDS mobile client app and an HTTP-based service is quite straightforward to implement as a server application. A number of web server technologies can be used to implement custom services, such as PHP, Java Servlets or ASP.

An MMDS webservice typically provides chemical information (structures, data, annotations) in response to a query which usually also contains chemical information. Examples of services that fit well into the model include structure searching of databases, catalogs or registration systems; property prediction; structure-activity data submission; and any number of other utility services or transformative functions.

There are two main circumstances for implementing an MMDS-compatible webservice:

  1. The webservice acts as a pass-through to some other service, and so is essentially a repackaging mechanism for existing functionality.

  2. The chemical information service is performed by the server process itself.

The first case often does not require any special cheminformatics libraries, and so the choice of server environment has fewer restrictions. The second case requires that the service be implemented in an environment where all of the necessary functionality is immediately available, which is more restrictive. For example, a property prediction service built on Java-based cheminformatics libraries would likely make more sense as a Java Servlet than in a non-Java environment.

This example demonstrates a PHP implementation of a webservice which matches the first case, using the ChEBI structure search service. PHP was selected because it is popular, easy to learn, and concise; ChEBI was selected because wrapping its native webservice is also simple and concise, thanks to a clean and unobfuscated WSDL/SOAP interface.

Requirements


An MMDS webservice is driven by simple HTTP requests, each of which is either a GET or a POST. POST requests must transmit an XML document. All request types return an XML document. More details are provided in the webservice protocol description.

The example used in this article executes within the PHP 5 environment, using libraries which are typically available. With some work, the service could be ported to an earlier version, or to a different server-side scripting environment.

Framework


The main entry point can be summarised in the following block of code:

<?php

$APPNAME="ChEBI_Example";
$DESCRIPTION="Structure searching within the ChEBI online database.";

$cmd=getURLCommand();
if ($cmd=="list") generateList();
else if ($cmd=="spec") generateSpec();
else if ($cmd=="invoke") invokeApp(file_get_contents('php://input')); // raw POST body
else return; // error state

...

?>

Additional functions will be described in the following sections.

The variables $APPNAME and $DESCRIPTION are defined once at the top for clarity. The variable $cmd is assigned the query part of the request URL. For example, a request to http://acme.com/ChEBI.php?list gives $cmd the value "list". The getURLCommand utility function is described later.

There are just three valid possibilities for $cmd: "list", "spec" and "invoke", and each of them is delegated to the appropriate function. For practical purposes it is a good idea to test for when $cmd is an empty string, and print out some helpful HTML output. For debugging, it is also a good idea to add an additional option, such as "debug", which invokes the service with test parameters. These extensions are not described in this example.
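
The extensions mentioned above are left out of the example, but a minimal sketch of the empty-command check might look as follows. The helper names classifyCommand and helpPage are hypothetical and are not part of the MMDS protocol; the only idea shown is separating an empty command (a person opening the URL in a browser) from an unrecognised one.

```php
<?php
// Hypothetical helper: classify the command string before dispatching.
// Returns 'list', 'spec' or 'invoke' for valid commands, 'help' when no
// command was given, and 'error' for anything else.
function classifyCommand($cmd)
{
    $valid=array('list','spec','invoke');
    if ($cmd===null || $cmd==='') return 'help';
    if (in_array($cmd,$valid,true)) return $cmd;
    return 'error';
}

// Hypothetical helper: a short HTML page for people who open the URL directly.
function helpPage($appname,$description)
{
    return '<html><body><h1>'.htmlspecialchars($appname,ENT_QUOTES,'UTF-8').'</h1>'
          .'<p>'.htmlspecialchars($description,ENT_QUOTES,'UTF-8').'</p>'
          .'<p>Append ?list, ?spec or ?invoke to this URL.</p></body></html>';
}

echo classifyCommand(''),"\n";
echo classifyCommand('spec'),"\n";
echo classifyCommand('bogus'),"\n";
```

In the entry point, a result of 'help' would lead to echoing helpPage($APPNAME,$DESCRIPTION), while 'error' would fall through to the error state.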

List


In order to function autonomously, the webservice must respond to the list command. The response is a simple XML document which contains information about one or more services:

function generateList()
{
    global $APPNAME,$DESCRIPTION;

    header("Content-type: text/xml");
    $url=getBaseURL();

    echo
'<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<MMDS_WebApps>
<App name="'.$APPNAME.'">
<Description>'.$DESCRIPTION.'</Description>
<URL>'.$url.'</URL>
</App>
</MMDS_WebApps>';
}

Three pieces of information are necessary to describe a service: name, description and URL. The resulting XML document can contain any number of <App> nodes, so it is possible to have a separate service which federates a number of other services. In this example, however, the service only describes itself.

The getBaseURL utility function is described later. It is necessary to use absolute URLs to describe the location of each service in the list. In this case, the absolute URL for the current PHP script is the desired value.
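
For the federated case, where one list describes several services, the document could be assembled in a loop. The following is a sketch only: buildListXML is a hypothetical helper, the second entry and its URL are invented for illustration, and the values are escaped so that free text cannot break the markup.

```php
<?php
// Sketch: build the list document for several services.
// $apps is an array of array('name'=>..., 'description'=>..., 'url'=>...).
function buildListXML($apps)
{
    $xml='<?xml version="1.0" encoding="UTF-8" standalone="no"?>'."\n";
    $xml.="<MMDS_WebApps>\n";
    foreach ($apps as $app)
    {
        // escape &, <, > and quotes in every value
        $xml.='<App name="'.htmlspecialchars($app['name'],ENT_QUOTES,'UTF-8').'">'."\n";
        $xml.='<Description>'.htmlspecialchars($app['description'],ENT_QUOTES,'UTF-8').'</Description>'."\n";
        $xml.='<URL>'.htmlspecialchars($app['url'],ENT_QUOTES,'UTF-8').'</URL>'."\n";
        $xml.="</App>\n";
    }
    $xml.='</MMDS_WebApps>';
    return $xml;
}

// The second service is hypothetical and only illustrates a multi-entry list.
$apps=array(
    array('name'=>'ChEBI_Example',
          'description'=>'Structure searching within the ChEBI online database.',
          'url'=>'http://acme.com/ChEBI.php'),
    array('name'=>'Other_Example',
          'description'=>'A second, hypothetical service (R&D only).',
          'url'=>'http://acme.com/Other.php')
);
echo buildListXML($apps),"\n";
```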

Specification


The specification for a webservice contains three ingredients: name, description and fields. The name and description are redundant with the information provided by the list command. The fields part is an XML tree listing the data that the user is prompted to provide before executing the webservice. The webservice protocol description provides a detailed explanation of all the field types.

The generateSpec() function, shown below, demonstrates the information needed to execute a structure search via ChEBI: molecular structure, search type and the maximum number of results:

function generateSpec()
{
    global $APPNAME,$DESCRIPTION;

    $fields=
'
<structure type="molecule">
 <Title>Structure:</Title>
 <DefaultVal/>
 <Format>MDLMOL</Format>
</structure>

<searchtype type="option">
 <Title>Search Type:</Title>
 <DefaultVal>Exact</DefaultVal>
 <Options>
  <O>Exact</O>
  <O>Substructure</O>
  <O>Similar95%</O>
  <O>Similar90%</O>
  <O>Similar80%</O>
  <O>Similar50%</O>
  </Options>
</searchtype>

<resultlimit type="number">
 <Title>Maximum Results:</Title>
 <DefaultVal>20</DefaultVal>
 <MinVal>1</MinVal>
 <MaxVal>1000</MaxVal>
 <NumDecimals>0</NumDecimals>
</resultlimit>
';
	
    generateRawSpec($APPNAME,$DESCRIPTION,$fields);
}

function generateRawSpec($appname,$description,$fields)
{
    header("Content-type: text/xml");

    echo
'<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<MMDS_WebSpec>
<Name>'.$appname.'</Name>
<Description>'.$description.'</Description>
<Fields>'.$fields.'</Fields>
</MMDS_WebSpec>';
}

The result is essentially a hardcoded block of XML.
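
Because the specification is plain text, repetitive parts such as the option list can also be generated rather than typed out. The following is a sketch only: buildOptionField is a hypothetical helper, and the element names are simply those used in the block above.

```php
<?php
// Sketch: build an "option" field from a list of choices.
// $id becomes the element name, so it must be a valid XML name.
function buildOptionField($id,$title,$options,$default)
{
    $xml='<'.$id.' type="option">'."\n";
    $xml.=' <Title>'.htmlspecialchars($title,ENT_QUOTES,'UTF-8').'</Title>'."\n";
    $xml.=' <DefaultVal>'.htmlspecialchars($default,ENT_QUOTES,'UTF-8').'</DefaultVal>'."\n";
    $xml.=" <Options>\n";
    foreach ($options as $opt)
    {
        $xml.='  <O>'.htmlspecialchars($opt,ENT_QUOTES,'UTF-8').'</O>'."\n";
    }
    $xml.=" </Options>\n";
    $xml.='</'.$id.'>'."\n";
    return $xml;
}

// Rebuild the search type choices used in this article.
$labels=array('Exact','Substructure');
foreach (array(95,90,80,50) as $pct) $labels[]='Similar'.$pct.'%';
echo buildOptionField('searchtype','Search Type:',$labels,'Exact');
```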

Invocation


When the webservice is invoked, the user has filled in the fields and the client has transmitted the data as a POST request, the body of which must be an XML document.

The first task for the invokeApp function is to convert the input data into an XML-DOM object and extract the parameters, as shown in the following code block:

function invokeApp($input)
{
    $xml=simplexml_load_string($input);

    // simplexml_load_string returns false when the input is not well-formed XML
    if ($xml===false || $xml->getName()!="MMDS_WebQuery")
    {
        generateErrorResult("Invalid query.");
        return;
    }

    $paramList=$xml->xpath("Parameters");
    if (count($paramList)==0)
    {
        generateErrorResult("Query has no <Parameters> tag.");
        return;
    }

    $paramList=$paramList[0];
    // cast the SimpleXMLElement nodes to plain values
    $structure=(string)$paramList->structure;
    $searchtype=(string)$paramList->searchtype;
    $resultlimit=(int)$paramList->resultlimit;

    ...

If an error is detected, the generateErrorResult function (described later) is called, which generates a return value that gracefully passes an error message back to the MMDS client.
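
To make the parsing step concrete, the sketch below runs the same checks against a sample request body. The layout of the sample document is an assumption inferred from the parsing code in this article (a root element named MMDS_WebQuery with a Parameters child), not a quotation from the protocol description, and parseQuery is a hypothetical helper that returns null where invokeApp would report an error.

```php
<?php
// Assumed request body, inferred from the parsing code in this article.
$sample='<?xml version="1.0" encoding="UTF-8"?>
<MMDS_WebQuery>
<Parameters>
<structure>(molfile text would go here)</structure>
<searchtype>Substructure</searchtype>
<resultlimit>50</resultlimit>
</Parameters>
</MMDS_WebQuery>';

// Hypothetical helper: returns the three parameters as plain values,
// or null when the document is not a usable query.
function parseQuery($input)
{
    libxml_use_internal_errors(true); // keep parser warnings out of the response
    $xml=simplexml_load_string($input);
    if ($xml===false || $xml->getName()!="MMDS_WebQuery") return null;

    $paramList=$xml->xpath("Parameters");
    if (!$paramList) return null; // no <Parameters> element

    $p=$paramList[0];
    return array(
        'structure'=>(string)$p->structure,
        'searchtype'=>(string)$p->searchtype,
        'resultlimit'=>(int)$p->resultlimit
    );
}

$params=parseQuery($sample);
echo $params['searchtype'],"\n";
```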

The three input values are held in the variables $structure, $searchtype and $resultlimit. Some transformation is required to convert these into identifiers used by the ChEBI webservice:

    ...

    $chebitype='IDENTITY';
    $chebisim=0.5;
    if ($searchtype=='Substructure') $chebitype='SUBSTRUCTURE';
    else if ($searchtype=='Similar95%') {$chebitype='SIMILARITY'; $chebisim=0.95;}
    else if ($searchtype=='Similar90%') {$chebitype='SIMILARITY'; $chebisim=0.90;}
    else if ($searchtype=='Similar80%') {$chebitype='SIMILARITY'; $chebisim=0.80;}
    else if ($searchtype=='Similar50%') {$chebitype='SIMILARITY'; $chebisim=0.50;}

    if (!$resultlimit || $resultlimit<1) $resultlimit=5;
    if ($resultlimit>1000) $resultlimit=1000;

    ...
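
The same translation can be expressed as a lookup table, which may be easier to extend if more search types are added. This is a sketch under the assumption that unknown labels should fall back to an exact search, as the if/else chain above does; mapSearchType and clampResultLimit are hypothetical names.

```php
<?php
// Sketch: table-driven version of the search type translation.
// Each entry maps a label from the specification to
// array(ChEBI search category, Tanimoto cutoff).
function mapSearchType($searchtype)
{
    $table=array(
        'Exact'        => array('IDENTITY',0.5),
        'Substructure' => array('SUBSTRUCTURE',0.5),
        'Similar95%'   => array('SIMILARITY',0.95),
        'Similar90%'   => array('SIMILARITY',0.90),
        'Similar80%'   => array('SIMILARITY',0.80),
        'Similar50%'   => array('SIMILARITY',0.50)
    );
    // unknown labels fall back to an exact search
    return isset($table[$searchtype]) ? $table[$searchtype] : $table['Exact'];
}

// Sketch: keep the result limit within 1..$max, using $default when the
// value is missing or below 1.
function clampResultLimit($value,$default=5,$max=1000)
{
    $n=(int)$value;
    if ($n<1) return $default;
    if ($n>$max) return $max;
    return $n;
}

list($chebitype,$chebisim)=mapSearchType('Similar90%');
echo $chebitype,' ',$chebisim,"\n";
```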

Now that the parameters are in the right form, it is time to connect to the ChEBI WSDL/SOAP webservice, and call its getStructureSearch function.

    ...

    $req=array
    (
        'structure'=>$structure,
        'type'=>'MOLFILE',
        'structureSearchCategory'=>$chebitype,
        'totalResults'=>$resultlimit,
        'tanimotoCutoff'=>$chebisim
    );

    try
    {
        // the constructor can also throw a SoapFault, e.g. if the WSDL cannot be loaded
        $soap=new SoapClient("http://www.ebi.ac.uk/webservices/chebi/2.0/webservice?wsdl");
        // __soapCall replaces the deprecated __call
        $result=$soap->__soapCall("getStructureSearch",array($req));
        $result=$result->return->ListElement;
    }
    catch (SoapFault $fault)
    {
        generateErrorResult("Error contacting ChEBI: ".$fault->faultstring);
        return;
    }

    if (is_null($result))
    {
        generateErrorResult("The query did not match any structures.");
        return;
    }
    if (!is_array($result)) $result=array($result);

    ...

If all goes well, this function will return a list of objects, each of which contains a ChEBI ID code, which refers to a single structure entity within its database. It is necessary to iterate over this list and, for each entry, make a separate request to obtain further information about the structure.

Since the data returned to the MMDS client takes the form of an MDL SD file, each entry is appended to the variable $sdfile as the iteration proceeds:

    ...
    
    
    $sdfile="";

    $BASE_LINK="http://www.ebi.ac.uk/chebi/searchId.do?chebiId=";
    
    try 
    {
        for ($n=0;$n<count($result);$n++)
        {
            $resultID=$result[$n]->chebiId;
            $data=fetchCompoundData($soap,$resultID);
            
            $resultStruct=$data[0];
            $resultName=$data[1];
            
            if (strlen($resultStruct)==0) continue;
            
            // special case for first structure
            if (strlen($sdfile)==0)
            {
                $lines=preg_split('/\\r?\\n/',$resultStruct);
                $lines[2]="\$"."title=ChEBI Query: ".$searchtype;
                for ($i=0;$i<count($lines);$i++) 
                {
                    if ($i>=3 && strlen($lines[$i])==0) continue;
                    $sdfile.=$lines[$i]."\n";
                }
            }
            else $sdfile.=$resultStruct;
            
            if (strlen($resultName)==0) $resultName="?";
            
            // add the data fields
            $sdfile.="> <Name>\n".$resultName."\n\n";
            $sdfile.="> <Link>\n".$BASE_LINK.$resultID."\n\n";
            $sdfile.="$$$$\n";
        }
    }
    catch (SoapFault $fault)
    {
        generateErrorResult("Error obtaining result: ".$fault->faultstring);
        return;
    }
    
    generateDataResult($sdfile);
}

For each entry, the function fetchCompoundData is called (see below).

Three pieces of information are made available for each compound: structure, name and URL to access all the rest of the molecular data for the entry.

Note the condition in the middle of the above code block, marked with the comment "special case for first structure": this is a non-standard extension to the MDL SD file format to make up for the fact that SD files do not have any way to store a title for the overall document. The third line of an MDL MOL structure is not used by MMDS, and so it can be used to encode the title as metadata:

$title={title of datasheet...}

Other than this awkwardness, assembling an SD file from the available data is straightforward. The results are packaged with the generateDataResult utility function, which is described later.
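
The shape of a single SD file entry (structure, then named data fields, then a terminator line) can be isolated in a small function. This is a sketch only: buildSDRecord is a hypothetical helper that is not part of the example service, and it leaves out the title handling for the first structure.

```php
<?php
// Sketch: assemble one SD file record from a molfile and named data fields.
// $fields is an associative array of field name => value.
function buildSDRecord($molfile,$fields)
{
    // make sure the structure block ends with exactly one line break
    $record=rtrim($molfile,"\r\n")."\n";
    foreach ($fields as $name=>$value)
    {
        // each data item: header line, value, then a blank line
        $record.="> <".$name.">\n".$value."\n\n";
    }
    // single quotes keep the record terminator literal
    $record.='$$$$'."\n";
    return $record;
}

// A structure-less placeholder molfile, for illustration only.
$mol="\n\n\n  0  0  0  0  0  0  0  0  0  0999 V2000\nM  END\n";
$sdfile=buildSDRecord($mol,array('Name'=>'example','Link'=>'http://acme.com/entry/1'));
$sdfile.=buildSDRecord($mol,array('Name'=>'?'));
echo $sdfile;
```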

Acquiring specific information about an individual structure from ChEBI, based on the available ID code, is simple, and can be done by reusing the SoapClient object that was created to perform the original structure search:

function fetchCompoundData($soap,$id)
{
    $req=array('chebiId'=>$id); 
    
    // __soapCall replaces the deprecated __call
    $result=$soap->__soapCall("getCompleteEntity",array($req));
    $result=$result->return;

    $name=$result->chebiAsciiName;
    $structure=$result->ChemicalStructures;
    if (is_array($structure)) $structure=$structure[0];
    $structure=$structure->structure;
    
    if (substr($structure,strlen($structure)-1,1)!="\n") $structure.="\n";
    
    return array($structure,$name); 
}

Utilities


The getBaseURL function puts together an absolute URL base for the current location:

function getBaseURL()
{
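    // HTTPS may be unset, empty or "off" when the request is plain HTTP
    $https=!empty($_SERVER["HTTPS"]) && $_SERVER["HTTPS"]!="off";
    $root=$https ? "https" : "http";
    $root.="://".$_SERVER["SERVER_NAME"];
    // append the port only when it is not the default for the scheme;
    // note that PHP concatenates with "." and not "+"
    $port=$_SERVER["SERVER_PORT"];
    if ($port!=($https ? "443" : "80")) $root.=":".$port;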
    $uri=$_SERVER["REQUEST_URI"];
    $qmark=strpos($uri,"?");
    if ($qmark !== false) $uri=substr($uri,0,$qmark);
    return $root.$uri;
}

The getURLCommand function extracts the command part of the URL request:

function getURLCommand()
{
    $uri=$_SERVER["REQUEST_URI"];
    $qmark=strpos($uri,"?");
    if ($qmark===false) return ""; // strpos returns false, not -1, when there is no match
    return substr($uri,$qmark+1);
}
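
An alternative sketch uses PHP's built-in parse_url function, which separates the query part without any manual string searching. The function name commandFromURI is hypothetical; it takes the URI as a parameter so that it can be tried outside a web server.

```php
<?php
// Sketch: extract the command (the query part) from a URI.
function commandFromURI($uri)
{
    // with PHP_URL_QUERY, parse_url returns only the query component,
    // or a non-string value when there is none
    $query=parse_url($uri,PHP_URL_QUERY);
    return is_string($query) ? $query : "";
}

echo commandFromURI("http://acme.com/ChEBI.php?list"),"\n";
```

Within the service, it would be called as commandFromURI($_SERVER["REQUEST_URI"]).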

The generateErrorResult function generates XML output which the MMDS client will recognise as an error state. The message will be displayed to the user:

function generateErrorResult($errmsg)
{
    header("Content-type: text/xml");
    
    echo
'<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<MMDS_WebResults>
<Errors>
<E><![CDATA['.$errmsg.']]></E>
</Errors>
</MMDS_WebResults>';
}

The generateDataResult function is called when the result has been successfully obtained and packaged up as an MDL SD file:

function generateDataResult($sdfile)
{
    header("Content-type: text/xml");

    echo
'<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<MMDS_WebResults>
<Errors/>
<Results>
<MDLSDF><![CDATA['.$sdfile.']]></MDLSDF>
</Results>
</MMDS_WebResults>';
}
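
Both result functions place arbitrary text inside a CDATA section, which ends at the first occurrence of the character sequence ]]>. If the text might contain that sequence, a common workaround is to split it across two adjacent CDATA sections. The following is a minimal sketch; wrapCDATA is a hypothetical helper and is not part of the original service.

```php
<?php
// Sketch: wrap text in a CDATA section, splitting any embedded "]]>"
// so that it cannot terminate the section early.
function wrapCDATA($text)
{
    return '<![CDATA['.str_replace(']]>',']]]]><![CDATA[>',$text).']]>';
}

echo wrapCDATA('Query has no <Parameters> tag.'),"\n";
```

With this helper, the error element could be written as '<E>'.wrapCDATA($errmsg).'</E>', and the data element likewise.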

Summary


A working example of an MMDS-compatible webservice, written in PHP, has been described. The source code from this article can be put together to make a functioning service.

For more assistance with constructing a webservice, see the contact details. The services currently hosted on the http://molmatinf.com domain are available by request.

See Also


MolSync Remote Procedure Calls, Property Calculations via Open Notebook Science, Searching ChEBI (iPhone), Searching PubChem (BlackBerry), WebServices Protocol