Thursday, 9 August 2012

Bioware servers behaving erratically

The SLiMSuite servers housed at bioware.ucd.ie are currently behaving a bit strangely. This is under investigation and we hope to have it sorted out soon. If you spot any odd or unexpected behaviour with the servers, please let us know.

Thursday, 26 July 2012

New software downloads now available

Updated versions of all packages are now available for download. Unfortunately, due to limited time, the manuals are getting a little out of date and no longer cover all the available functions, but the readme pages list all the latest options and defaults. Please contact me if you find any bugs and/or want specific documentation improved. Updating the manuals is on the (long) list of things to try and get done over the summer!

Note that, in a slight change from previous releases, compressed downloads now contain the creation date in the file name (e.g. rjesuite.2012-07-26.tar.gz) and will be archived.

Monday, 18 June 2012

"Iterative" SLiMMaker function added

The SLiMMaker website (and the download, once the new release is put up) now has an "iterate" function that will produce both a motif and a set of sequences, all of which match that motif. The input sequences that match the motif produced by SLiMMaker are put back through SLiMMaker with the same settings, and this is repeated until the motif produced matches all of its input (or no motif is produced). If the first SLiM produced already matches all of the input, this mode will behave just like the original.
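To make the loop concrete, here is a minimal Python sketch of the iteration. It is not the SLiMMaker code: toy_motif is a hypothetical stand-in that only makes fixed positions or wildcards, "." as the wildcard symbol is an assumption, and the function names are made up for illustration.

```python
import re
from collections import Counter

def toy_motif(seqs, minfreq=0.75):
    """Hypothetical stand-in for SLiMMaker: one fixed residue or a wildcard per column."""
    elements = []
    for column in zip(*seqs):
        aa, n = Counter(column).most_common(1)[0]
        # Keep the commonest residue only if enough sequences share it.
        elements.append(aa if n / len(seqs) >= minfreq else ".")
    # Leading and trailing wildcards are removed.
    return "".join(elements).strip(".")

def iterate(seqs, make_motif=toy_motif):
    """Re-run make_motif on the matching sequences until the motif matches all of them."""
    while seqs:
        motif = make_motif(seqs)
        if not motif:
            return "", []              # no motif produced
        matched = [s for s in seqs if re.search(motif, s)]
        if len(matched) == len(seqs):
            return motif, seqs         # motif matches all of its input
        seqs = matched                 # put the matching sequences back through
    return "", []

peptides = ["PKLPSY", "PKLPTY", "PKLPAY", "GALPSY", "PQLPSF"]
motif, kept = iterate(peptides)
print(motif, kept)
```

In this made-up example the first pass gives P.LP.Y, which only three of the five peptides match. Those three go back in and the second pass gives PKLP.Y, which matches all three, so the loop stops there.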

At some point, I will add some more documentation, including some examples.

Tuesday, 15 May 2012

Bioinformatics Postdoc Position available!

A two-year BBSRC-funded postdoc position is now available to work in the Edwards lab developing and applying QSLiMFinder. Informal enquiries are encouraged. You can apply or get further details here. The blurb:
You are invited to apply for the post of Research Fellow to work closely with Dr Richard Edwards on a BBSRC-funded project to develop and apply computational tools for the prediction of protein motifs that mediate protein-protein interactions.

Many protein-protein interactions are mediated by Short Linear Motifs (SLiMs): short stretches of proteins (5-15 amino acids long), of which only a few positions are critical to function. These motifs are vital for biological processes of fundamental importance, such as signalling pathways and targeting proteins to the correct part of a cell.

This position represents an exciting opportunity to join one of the early pioneers in the growing field of SLiM prediction. The primary objective of this project is to integrate a number of leading computational techniques to predict novel SLiMs and, in so doing, add crucial detail to protein-protein interaction networks. This will generate a valuable resource of potential SLiMs, including defined occurrences and interactions.

The project will use a number of computational and sequence analysis techniques. Basic programming skills are essential. Experience with database design, HPC and web programming are desirable. You will be required to develop a thorough knowledge of SLiM-mediated protein-protein interactions and should therefore be comfortable with biological literature, biochemistry, molecular evolution and structural biology.

A background in either computer science or biology, with a PhD in a relevant subject area, is essential. Previous research experience (PhD or Postdoctoral) in computational biology is highly desirable. Candidates with a computer science background must demonstrate an interest and aptitude for molecular biology. Similarly, candidates with a biology background must demonstrate an interest and aptitude for computer programming.

You should be an enthusiastic researcher, a good team-worker and an excellent communicator. Project management skills and independent research experience are desirable.

The position is full-time and available immediately for a period of up to two years.

The closing date for this position is 15 June 2012. Please apply online through www.jobs.soton.ac.uk or alternatively telephone 023 8059 2750 for an application form. Please quote reference number 119512BJ on all correspondence. In addition to submitting your CV, please enclose a personal statement highlighting your research interests and experience, as outlined in the accompanying Further Particulars. Please note that the project is 100% computational.

Wednesday, 9 May 2012

SLiMSuite servers and programs

An emerging field of biology is the role of intrinsically disordered regions in protein function and, specifically, protein-protein interactions (PPI) [1-2]. Of particular interest are Short Linear Motifs (SLiMs), which play a vital role in disorder-mediated PPI, acting as ligands for molecular signalling, post-translational modifications and subcellular targeting [3]. SLiMs have extremely compact protein interaction interfaces, generally encoded by fewer than 4 major affinity-/specificity-determining residues within a stretch of 2-10 residues [4]. Their small size enables high functional density and evolutionary plasticity, which is frequently exploited by rapidly evolving pathogens that use them to hijack cellular processes [5]. These same features also make experimental discovery a challenge, and considerable attention has therefore been given to computational methods for SLiM prediction and analysis [6].

A number of these tools have been developed by the Edwards and Shields labs [7-11] and made available as part of the SLiMSuite package and online as webservers (http://bioware.ucd.ie) [9-10,12-14], with two new tools, SLiMPrints and QSLiMFinder, currently in preparation for submission, and SLiMMaker to be added soon. The main tools that form the SLiMSuite package/servers are as follows:
  • SLiMFinder [8,13]: de novo SLiM prediction based on a statistical model of over-represented motifs in unrelated proteins.
  • SLiMDisc [7,12]: de novo SLiM prediction based on heuristic ranking of over-represented motifs in unrelated proteins.
  • SLiMPred [11]: de novo SLiM/MoRF prediction in single proteins based on machine learning of motif attributes.
  • SLiMSearch [10]: biological context (disorder & conservation) for searches of pre-defined motifs with under- and over-representation statistics, correcting for evolutionary relationships.
  • SLiMSearch 2.0 [14]: biological context (disorder & conservation) and ranking for proteome-wide searches of pre-defined motifs.
  • SLiMPrints (in prep.): de novo SLiM/MoRF prediction in single proteins from statistical clustering of conserved disordered residues.
  • QSLiMFinder (server coming soon): Query-based variant of SLiMFinder with increased sensitivity and specificity.
  • CompariMotif [9]: Motif-motif comparison tool.
  • SLiMMaker (coming soon): Simple tool for converting aligned peptides or SLiM occurrences into a regular expression motif.
  • GOPHER [12]: Automated orthologue prediction and alignment algorithm. Used for conservation-based masking (SLiMFinder/SLiMSearch) and prediction (SLiMPrints).
  • GABLAM [7] (server coming soon): BLAST-based protein similarity scoring and clustering. Used for SLiMFinder and SLiMSearch adjustments for evolutionary relationships.
Personnel (and funding applications) permitting, a number of improvements for these resources are planned, including updates to the underlying databases for proteome-wide predictions (SLiMSearch 1.0 & 2.0), conservation analyses (SLiMSearch 1.0 & 2.0, SLiMPrints, GOPHER) and SLiM comparisons (CompariMotif). We also intend to improve the integration of different tools, allowing seamless continuation of analyses. It will be possible to search motif predictions ((Q)SLiMFinder/SLiMPrints/SLiMPred) directly against known motifs (CompariMotif) or proteomes (SLiMSearch); GOPHER alignments will be accessible for SLiMPrints analyses and even as SLiMSearch/(Q)SLiMFinder input; and outputs of motif occurrences ((Q)SLiMFinder/SLiMSearch) will be usable to redefine motifs with SLiMMaker. If you have any other suggestions for improvements, please let us know.


References:
[1] Tompa P (2011) Unstructural biology coming of age. Curr Opin Struct Biol 21:419.
[2] Babu MM et al. (2011) Intrinsically disordered proteins: regulation and disease. Curr Opin Struct Biol 21:432.
[3] Diella F et al. (2008) Understanding eukaryotic linear motifs and their role in cell signaling and regulation. Front Biosci 13:6580.
[4] Davey NE et al. (2012) Attributes of short linear motifs. Mol Biosyst 8:268.
[5] Davey NE, Trave G & Gibson TJ (2011) How viruses hijack cell regulation. Trends Biochem Sci 36:159.
[6] Davey NE, Edwards RJ & Shields DC (2010) Computational identification and analysis of protein short linear motifs. Front Biosci 15:801.
[7] Davey NE, Shields DC & Edwards RJ (2006) SLiMDisc: short, linear motif discovery, correcting for common evolutionary descent. Nucleic Acids Res 34:3546.
[8] Edwards RJ, Davey NE & Shields DC (2007) SLiMFinder: A probabilistic method for identifying over-represented, convergently evolved, short linear motifs in proteins. PLoS ONE 2:e967.
[9] Edwards RJ, Davey NE & Shields DC (2008) CompariMotif: Quick and easy comparisons of sequence motifs. Bioinformatics 24:1307.
[10] Davey NE et al. (2010) SLiMSearch: a webserver for finding novel occurrences of short linear motifs in proteins, incorporating sequence context. Lecture Notes in Bioinformatics 6282:50.
[11] Mooney C et al. (2012) Prediction of short linear protein binding regions. J Mol Biol 415:193.
[12] Davey NE, Edwards RJ & Shields DC (2007) The SLiMDisc server: short, linear motif discovery in proteins. Nucleic Acids Res 35:W455.
[13] Davey NE et al. (2010) SLiMFinder: a web server to find novel, significantly over-represented, short protein motifs. Nucleic Acids Res 38:W534.
[14] Davey NE et al. (2011) SLiMSearch 2.0: biological context for short linear motifs in proteins. Nucleic Acids Res 39:W56.

Sunday, 29 April 2012

SLiMMaker: regular expressions from aligned peptide sequences

SLiMMaker has a fairly simple function of reading in a set of sequences and generating a regular expression motif from them. It is designed with protein sequences in mind but should work for DNA sequences too. Input sequences can be in fasta format or just plain text (with no sequence headers) and should be aligned already. Gapped positions will be ignored (treated as Xs) and variable length wildcards are not returned.

SLiMMaker considers each column of the input in turn and compresses it into a regular expression element according to some simple rules, screening out rare amino acids and converting particularly degenerate positions into wildcards. Each amino acid in the column that occurs at least X times (as defined by minseq=X) is considered for the regular expression definition for that position. The full set of amino acids meeting this criterion is then assessed for whether to keep it as a defined position, or convert into a wildcard.

First, if the number of different amino acids meeting this criterion is zero or above a second threshold (maxaa=X), the position is defined as a wildcard. Second, the proportion of input sequences matching the amino acid set is compared to a minimum frequency criterion (minfreq=X). Failing to meet this minimum frequency will again result in a wildcard. Otherwise, the amino acid set is added to the SLiM definition as either a fixed position (if only one amino acid met the minseq criterion) or as a degenerate position. Finally, leading and trailing wildcards are removed.

By default, each defined position in a motif will contain amino acids that (a) occur in at least three sequences each, (b) have a combined frequency of >=75%, and (c) have 5 or fewer different amino acids (that occur in 3+ sequences).
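The column rules above can be sketched in a few lines of Python. This is an illustrative reimplementation from the description, not the SLiMMaker source: the function name is made up, "." as the wildcard symbol and [..] for degenerate positions are assumptions, and the input is assumed to be equal-length aligned sequences. The minseq, minfreq and maxaa arguments mirror the options and defaults described above.

```python
from collections import Counter

def make_slim(seqs, minseq=3, minfreq=0.75, maxaa=5):
    """Compress aligned sequences into a regular expression motif, column by column."""
    elements = []
    for column in zip(*seqs):
        # Gapped positions are ignored (treated as Xs).
        counts = Counter(aa for aa in column if aa not in "-X")
        # Screen out rare amino acids: keep those seen in at least minseq sequences.
        kept = sorted(aa for aa, n in counts.items() if n >= minseq)
        # Proportion of input sequences matching the retained amino acid set.
        covered = sum(counts[aa] for aa in kept) / len(seqs)
        if not kept or len(kept) > maxaa or covered < minfreq:
            elements.append(".")                         # wildcard
        elif len(kept) == 1:
            elements.append(kept[0])                     # fixed position
        else:
            elements.append("[" + "".join(kept) + "]")   # degenerate position
    # Finally, leading and trailing wildcards are removed.
    while elements and elements[0] == ".":
        elements.pop(0)
    while elements and elements[-1] == ".":
        elements.pop()
    return "".join(elements)

peptides = ["PKLPSY", "PALPTY", "PQLPSF", "GKLPAY"]
print(make_slim(peptides))             # defaults: minseq=3
print(make_slim(peptides, minseq=1))   # every observed amino acid is considered
```

With the defaults, these four made-up peptides give P.LP.Y: the second and fifth columns have no amino acid in three or more sequences, so they become wildcards. Dropping minseq to 1 keeps every observed residue and gives [GP][AKQ]LP[AST][FY].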

Note. The final motif only contains defined positions that match a given frequency of the input (75% by default). Because positions are considered independently, however, the final motif might occur in fewer than 75% of the input sequences. Results will indicate the coverage of the input data but SLiMSearch can be used to check the occurrence stats more thoroughly.
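A quick way to see this effect is to count how many input sequences the whole motif actually matches. The sketch below is only an illustration using Python's re module: the coverage function, the peptides and the motif are all made up, and the peptides are assumed to be ungapped. Every defined position of the example motif is matched by at least 75% of the peptides when taken alone.

```python
import re

def coverage(motif, seqs):
    """Return the fraction of sequences that contain a match to the whole motif."""
    pattern = re.compile(motif)
    hits = sum(1 for seq in seqs if pattern.search(seq))
    return hits / len(seqs)

peptides = ["PKLPSY", "PALPTY", "PQLPSF", "GKLPAY"]
# P matches 3/4 peptides, L and P match 4/4, Y matches 3/4 ...
print(coverage("P.LP.Y", peptides))   # ... but the full motif matches only 2/4: 0.5
```

The peptide missing the P and the peptide missing the Y are different sequences, so the full motif covers only half of the input even though each position passes the 75% threshold.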

Citation: SLiMMaker is part of the ongoing benchmarking of QSLiMFinder, which should be submitted for publication soon. In the meantime, please cite the SLiMMaker URL: http://bioware.soton.ac.uk/slimmaker.html.

Availability: SLiMMaker is available on request and will shortly be part of the SLiMSuite package.


Monday, 16 April 2012

New software downloads now available

Updated software packages for SeqSuite, SLiMSuite and RJESuite are now available from the Edwards Lab software page. These downloads incorporate a variety of miscellaneous bug fixes and minor updates.