Bioinformatics is blessed with an abundance of software tools, most of which focus on performing a
well-defined task, and performing it well. Most researchers leverage these tools as components in
more complex pipelines or workflows, allowing rapid investigation of larger and more complex problems.
Yet this power comes at a cost: the component tools must be installed and then linked
together through scripting or by configuring complex workflow software, all of which act as a barrier
to rapid, curiosity-driven exploration. And while there are encouraging signs that people are starting
to share scientific workflows through sites such as MyExperiment, re-use remains inconvenient, and many
researchers settle for a local re-invention, however common the wheel required.
Web mashups rapidly combine data and services from multiple sources within a web-based application, and have long been seen as an enabler of a new kind of computational science. Among the technically savvy, mashups are routine, but they are frequently ad hoc, limited in scope and eminently disposable. The model has seemed to promise more: to handle more substantial computational experiments, and to support a culture of fine-grained sharing and re-use.
Mashup frameworks are rapidly attaining maturity, and appear ready to deliver on some of these promises. Serious, mashup-based bioinformatics is now possible in any of the major frameworks, from Google Mashups to Yahoo Pipes and the more recent Popfly offering from Microsoft. All have their advantages and disadvantages, but the block composition model in Popfly appears particularly convenient as a vehicle for re-use. Regardless of the framework, uptake of serious biomashups requires a concerted effort to support a community-based component library, and to make the broader scientific community aware of the capabilities of the approach. Some of the more active groups will probably need to seed these libraries and to demonstrate the power of the resulting mashups.
On this page, we show some simple biomashups created in Popfly, each addressing an annotation task. The examples highlight the block composition model in Popfly, and are intended as an initial seed for a broader library of components. The tasks are as follows:
- Uniprot - Genbank Id translator: Mapping UniProt protein accession IDs to their GenBank equivalents.
- Genbank to Pubmed: Retrieving articles from the PubMed database that reference a user-specified GenBank accession ID. This mashup is immediately useful to a researcher who has a specific gene of interest and wants to find the published literature on it, and it can also serve as a component of a more elaborate annotation mashup.
- SLDM1: The final demo is based on a fairly elaborate undergraduate teaching exercise from QUT, in which the student is given a protein-coding nucleotide sequence, and is asked first to determine the gene family and then to perform an NCBI Entrez search to obtain more information about the gene.
The accompanying tech report looks at the main frameworks and their facilities, and discusses the potential for sharing and uptake across the community.
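For readers who want a feel for what a block such as Genbank to Pubmed does outside a mashup framework, the following is a minimal Python sketch, not the Popfly implementation. It assumes the lookup is made through the public NCBI E-utilities service (an `esearch` call to resolve an accession to a record UID, then an `elink` call from the nucleotide database to PubMed). The function names and the example accession `U49845` are illustrative choices, and the sketch only builds the request URLs; fetching and parsing the responses is left out.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities service.
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(accession: str) -> str:
    """Build the URL that looks up a GenBank accession in the nucleotide database.

    Hypothetical helper: the name and the choice of JSON output are
    assumptions made for this sketch.
    """
    params = {"db": "nuccore", "term": f"{accession}[accn]", "retmode": "json"}
    return f"{EUTILS}/esearch.fcgi?{urlencode(params)}"


def elink_url(uid: str) -> str:
    """Build the URL that asks for PubMed records linked to a nucleotide UID.

    Hypothetical helper: `uid` is the record identifier returned by the
    search step, not the accession itself.
    """
    params = {"dbfrom": "nuccore", "db": "pubmed", "id": uid, "retmode": "json"}
    return f"{EUTILS}/elink.fcgi?{urlencode(params)}"


if __name__ == "__main__":
    # Step 1: resolve the accession to a UID (request not sent here).
    print(esearch_url("U49845"))
    # Step 2: follow links from that UID to PubMed; "<uid>" stands in for
    # the value that step 1 would return.
    print(elink_url("<uid>"))
```

Each function corresponds roughly to one block in the composition model: the output of the first step feeds the input of the second, which is what makes such blocks reusable in larger annotation mashups.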

