DataNaut
     Company | Approach | Services | Careers | Contact | Sitemap | Home     
Services
Articles & Whitepapers
The best way to understand what we do is to learn what we’ve done for other businesses and how we did it.


Communicating on the Grid

As with any distributed system, a critical component is the underlying communications infrastructure used to shuttle data between the client and back-end services. A researcher can send hundreds of megabytes of data to the grid for processing, so this was a concern from the start. The answer was a barebones protocol consisting of a transport layer, job control semantics, and calculation data semantics. In addition to the protocol, a proxy was needed to serve as a gateway to the grid.

Microarray Gene Expression Markup Language (MAGE-ML) was adapted to govern the exchange of microarray data between TIGR MEV and the grid. MAGE is an emerging standard used to describe and communicate information about microarray-based experiments. It is based on XML and can describe microarray designs, microarray manufacturing information, microarray experiment setup and execution information, gene expression data, and data analysis results. MAGE does an excellent job of describing microarray data; however, the grid does not need most of the information MAGE defines. The TIGR MEV grid works with data in terms of math (floating point numbers), not biology, and treats data sets as vectors and matrices. Extracting and converting large amounts of data from a full MAGE data set required too much time and too many resources on the grid, so the TIGR MEV client application uses a subset of MAGE to reduce conversion time.
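To illustrate the idea of reducing an XML data set to plain vectors and matrices, here is a minimal Python sketch. The element names (`ExpressionData`, `Gene`, `Value`) are hypothetical and far simpler than real MAGE-ML; they are not taken from the MAGE specification or from the TIGR MEV code.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified layout; real MAGE-ML uses a much richer schema.
SAMPLE = """
<ExpressionData>
  <Gene id="g1"><Value>0.5</Value><Value>1.25</Value></Gene>
  <Gene id="g2"><Value>-2.0</Value><Value>3.0</Value></Gene>
</ExpressionData>
"""


def to_matrix(xml_text):
    """Return (gene_ids, matrix); each row holds one gene's values as floats."""
    root = ET.fromstring(xml_text.strip())
    gene_ids = []
    matrix = []
    for gene in root.findall("Gene"):
        gene_ids.append(gene.get("id"))
        # Only the numeric values are kept; all other annotation is ignored.
        matrix.append([float(v.text) for v in gene.findall("Value")])
    return gene_ids, matrix


ids, matrix = to_matrix(SAMPLE)
print(ids)
print(matrix)
```

The smaller the XML subset, the less parsing work stands between the incoming data and the numeric arrays the calculations operate on.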

The job control layer of the protocol provides the semantics needed to initiate a request, stop a request in progress, and provide notifications about execution progress. This layer was implemented using XML. MAGE packets are Base64 encoded and inserted into a job-control packet.
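The wrapping described above can be sketched in Python as follows. The `Job` and `Data` element names and the `id` and `command` attributes are assumptions made for illustration; the actual TIGR MEV packet format is not given in this article.

```python
import base64
import xml.etree.ElementTree as ET


def build_job_packet(job_id, command, payload):
    """Wrap raw payload bytes in a hypothetical XML job-control packet."""
    job = ET.Element("Job", {"id": job_id, "command": command})
    data = ET.SubElement(job, "Data", {"encoding": "base64"})
    # Base64 keeps arbitrary bytes safe inside XML character data.
    data.text = base64.b64encode(payload).decode("ascii")
    return ET.tostring(job, encoding="unicode")


def read_job_packet(packet):
    """Return (job_id, command, payload_bytes) from a packet built above."""
    job = ET.fromstring(packet)
    data = job.find("Data")
    has_payload = data is not None and data.text
    payload = base64.b64decode(data.text) if has_payload else b""
    return job.get("id"), job.get("command"), payload


packet = build_job_packet("42", "start", b"<ExpressionData/>")
print(packet)
print(read_job_packet(packet))
```

Base64 encoding enlarges the payload by roughly one third, which matters when packets approach the size limit of the transport.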

The transport layer of the protocol provides the mechanism for transferring job control packets of up to 100 MB between client and server. HTTP was selected as the underlying transport because of its flexibility and industry support.
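As a rough sketch of the transport step, the Python code below prepares an HTTP POST carrying a job control packet, using the standard `urllib.request` module. The gateway URL, the content type, and the reading of the size limit as about 100 megabytes are all assumptions; the request is only built here, not sent.

```python
import urllib.request

# Assumed reading of the packet size limit: about 100 megabytes.
MAX_PACKET_BYTES = 100 * 1024 * 1024


def make_request(gateway_url, packet, max_bytes=MAX_PACKET_BYTES):
    """Build (but do not send) an HTTP POST carrying one job control packet."""
    body = packet.encode("utf-8")
    if len(body) > max_bytes:
        raise ValueError("job control packet exceeds the size limit")
    return urllib.request.Request(
        gateway_url,
        data=body,
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


# Hypothetical gateway address, for illustration only.
req = make_request("http://gateway.example/mev", "<Job id='1' command='start'/>")
print(req.get_method(), req.full_url, len(req.data))
```

Passing the prepared request to `urllib.request.urlopen` would perform the actual transfer.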

In conjunction with the protocol, a communications gateway was developed to serve as a proxy between the TIGR MEV client and the grid. All clients send data through the gateway and therefore never communicate directly with the grid. The gateway speaks the TIGR MEV protocol and knows how to handle each job, preparing it for execution and shuttling the results and status back to the client. Isolating the grid in this manner has its advantages, but it can be argued that using a gateway creates a bottleneck. This was a concern, but testing indicates that the gateway will support the workload of the TIGR research team.

The TIGR MEV Communications Gateway is implemented as a servlet running on the Apache Tomcat 3.2 servlet engine. The gateway leverages the HTTP session mechanism provided by the servlet engine to implement asynchronous HTTP communications between the gateway and the client. The client sends a job to the gateway, then polls for a result via HTTP. After receiving a job, the gateway spawns a PVM process (the analysis algorithm) and redirects the job to it. The PVM process parses the request, performs the calculations, and sends the result back to the gateway. The client receives the result from the gateway during its next polling request.
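The submit-then-poll pattern described above can be sketched in Python with an in-memory stand-in for the gateway. This is not the servlet implementation: the `Gateway` class, its method names, and the use of a thread in place of a PVM process are all assumptions chosen to keep the example self-contained.

```python
import threading
import time
import uuid


class Gateway:
    """In-memory stand-in for the gateway; a thread stands in for a PVM process."""

    def __init__(self):
        self._jobs = {}  # job_id -> (done, result)
        self._lock = threading.Lock()

    def submit(self, job, worker):
        """Accept a job, start the worker in the background, return a job id."""
        job_id = uuid.uuid4().hex
        with self._lock:
            self._jobs[job_id] = (False, None)
        thread = threading.Thread(
            target=self._run, args=(job_id, job, worker), daemon=True
        )
        thread.start()
        return job_id

    def _run(self, job_id, job, worker):
        result = worker(job)
        with self._lock:
            self._jobs[job_id] = (True, result)

    def poll(self, job_id):
        """Return (done, result) without blocking."""
        with self._lock:
            return self._jobs[job_id]


def wait_for_result(gateway, job_id, interval=0.01, timeout=5.0):
    """Client side: poll until the job is done or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        done, result = gateway.poll(job_id)
        if done:
            return result
        time.sleep(interval)
    raise TimeoutError("job did not finish in time")


gateway = Gateway()
job_id = gateway.submit([1.0, 2.0, 3.0], worker=sum)
print(wait_for_result(gateway, job_id))
```

In the real system each `submit` and `poll` would be a separate HTTP request, with the servlet session tying the polling requests back to the submitted job.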

Page 15 of 19



TIGR MEV is an open source bioinformatics system used for computational microarray analysis. Portions of this software were developed by DataNaut Inc.; however, all rights and title in and to this software are owned and retained by The Institute for Genomic Research. If you are interested in obtaining the software, visit the TIGR web site.

DataNaut provides software development consulting services and has extensive expertise in microarray technologies. Organizations interested in using DataNaut consulting services or in having TIGR MEV customized for specific research applications can send email to info@datanaut.com.

© 2012 Datanaut, Inc. All Rights Reserved.