
Smart Tools for Bioinformatics

 
Biomashups

Bioinformatics is blessed with an abundance of software tools. In 2006, over 1000 molecular biology servers were identified. At its most basic, a scientific workflow can be a simple ‘cut-and-paste’ process: taking the result of a query from one website or database and pasting it into another site for further analysis. Many researchers also use bioinformatics tools as components in pipelines or workflows, relying on scripting solutions or scientific workflow engines to allow rapid investigation of larger and more complex problems.

Such approaches have a cost associated with them. Cut-and-paste approaches are inefficient, and more complex approaches require programming knowledge or the installation and configuration of complex software. Such issues act as a barrier to rapid, curiosity-driven exploration. Although scientific workflows can be shared through sites such as MyExperiment, re-use remains inconvenient, and many researchers settle for a local re-invention to address the specific bioinformatics problem of interest.


Biomashups and Mashup Frameworks

An alternative approach to bioinformatics analysis is to make use of web mashups. Mashups rapidly combine data and services from multiple sources within a single web-based application. Mashups therefore offer a means of integrating various bioinformatics tools, allowing a specific task to be performed efficiently. Although mashups can be ad hoc, limited in scope and eminently disposable, the model, if properly developed, can handle more substantial computational experiments and support a culture of sharing and re-use.

A number of mashup frameworks are available, including Google Mashups, Yahoo Pipes and Microsoft Popfly, and serious mashup-based bioinformatics is now possible in any of them. The Popfly framework is based around a basic service unit termed a block, which, once created, can be shared with other users. Blocks are combined into a mashup through a simple drag-and-drop mechanism, with connections formed between the relevant blocks as required. Popfly therefore appears particularly convenient as a vehicle for re-use because, once developed, blocks and mashups can be shared with the bioinformatics community at large. Regardless of the framework, uptake of serious biomashups requires an effort to support a community-based component library, and to make the broader scientific community aware of the capabilities of the approach.
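The block composition idea can be illustrated outside any particular framework. The following is a minimal Python sketch, not Popfly code: the `Block` class, the `connect` function and the two toy blocks are hypothetical names introduced here for illustration, and no web service is called.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class Block:
    """Hypothetical stand-in for a shareable service unit (not a Popfly API)."""
    name: str
    run: Callable[[Any], Any]


def connect(blocks: List[Block]) -> Callable[[Any], Any]:
    """Chain blocks so that each block's output feeds the next block's input."""
    def mashup(value: Any) -> Any:
        for block in blocks:
            value = block.run(value)
        return value
    return mashup


# Toy blocks that work on a nucleotide string; real blocks would wrap web services.
clean = Block("clean", lambda seq: seq.strip().upper())
gc_content = Block(
    "gc_content",
    lambda seq: (seq.count("G") + seq.count("C")) / len(seq),
)

gc_mashup = connect([clean, gc_content])
print(gc_mashup("  atgc  "))  # prints 0.5
```

Because each block only needs to agree with its neighbours on input and output, a block written for one mashup can be dropped into another, which is the property that makes a shared component library worthwhile.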

Examples of Biomashups

A number of biomashups, each related to some annotation task, have been created. The examples highlight the block composition model in Popfly, and are intended as an initial seed for a broader group of components. The mashups perform best when viewed in Internet Explorer.

  • Genbank to Pubmed - Retrieves articles from the PubMed database that reference a user-specified GenBank accession ID. This mashup is immediately useful to a researcher who has a specific gene of interest and wants to find all known information about it, and it can also serve as a component of a more elaborate annotation mashup.
  • SDLM1 - This demo is based on a fairly elaborate undergraduate teaching exercise from QUT, in which the student is given a protein coding nucleotide sequence, and is asked successively to determine the gene family and to perform an NCBI Entrez search to obtain more information about the gene. UPDATE: Due to a recent data reconfiguration of the main Bio2RDF server, the last phase of this mashup is currently unavailable.
  • Protein Characteristics Mashup 1 - This mashup takes a Uniprot Protein ID as input and retrieves information about the protein’s known characteristics including protein name, sequence, journal articles, and features of interest. It also retrieves information about the protein from the OMIM, GO, PROSITE, PRINTS and KEGG databases.
  • Other Mashups for Protein Analysis - 'Protein Characteristics Mashup 1' forms part of a suite of mashups that have been developed to perform protein analysis. These mashups retrieve known characteristics of a protein from various protein-related databases and can predict functional and structural characteristics for a given protein sequence. The linked page contains a complete list of mashups that have been created for protein analysis.
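As a rough illustration of the kind of lookup behind the Genbank to Pubmed example, the two steps (accession to record, record to linked articles) could be expressed against the NCBI E-utilities web service. This is a sketch under assumptions and not the mashup's actual implementation: the function names are hypothetical, the choice of the `nuccore` database and the `[Accession]` search field is an assumption about how the accession would be resolved, and the code only builds request URLs without contacting NCBI.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities service.
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(accession: str) -> str:
    """Build a URL that searches the nucleotide database for an accession."""
    query = {"db": "nuccore", "term": f"{accession}[Accession]"}
    return f"{EUTILS}/esearch.fcgi?{urlencode(query)}"


def elink_url(uid: str) -> str:
    """Build a URL that requests PubMed records linked to a nucleotide UID."""
    query = {"dbfrom": "nuccore", "db": "pubmed", "id": uid}
    return f"{EUTILS}/elink.fcgi?{urlencode(query)}"


# Step 1: resolve the accession to a record UID (read from the esearch response).
print(esearch_url("U49845"))
# Step 2: ask for linked PubMed articles; "12345" is a placeholder UID, not real data.
print(elink_url("12345"))
```

In a mashup, each of these two requests would correspond to one block, with the UID returned by the first passed as the input of the second.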

Additional Resources

Tech Report - this report describes the main frameworks and their facilities, and discusses the potential for sharing and uptake across the community.
MyExperiment BioMashups Page - a resource containing more information about the biomashups that have been developed. Popfly blocks for the mashups described above can be downloaded from here.
Popfly Protein Biomashups Summary Page - a webpage located at Microsoft Popfly that contains a list of all biomashups and all Popfly blocks that have been developed for protein analysis.