tag:blogger.com,1999:blog-44658031132493281542024-03-13T18:49:47.849+00:00SLiMSuite & SeqSuite: open-source bioinformatics in PythonA blog of all things pertaining to the SeqSuite & SLiMSuite open-source bioinformatics packages, their authors and applications.seqsuitehttp://www.blogger.com/profile/09978269566690653844noreply@blogger.comBlogger50125tag:blogger.com,1999:blog-4465803113249328154.post-45292176785941773022014-06-26T08:59:00.000+01:002015-05-26T09:02:49.096+01:00New SLiMSuite Blog<p>This Blog has now been retired. Please visit the <a href="http://slimsuite.blogspot.com.au/">new SLiMSuite blog</a> (and update any bookmarks).</p>Richard Edwardshttp://www.blogger.com/profile/16115218690707131186noreply@blogger.com1tag:blogger.com,1999:blog-4465803113249328154.post-14613265742126540652014-06-23T03:13:00.001+01:002014-06-23T03:13:17.107+01:00New SLiMSuite release now available<a href="http://slimsuite.blogspot.com/2014/06/a-new-download-of-slimsuite-release.html?spref=bl">SLiMSuite Short Linear Motif discovery and analysis: New SLiMSuite release now available</a>: A new download of SLiMSuite (release 2014-06-22 ) is now available. As well as fixing the minor GOPHER output bug , a new Taxonomy proces...Richard Edwardshttp://www.blogger.com/profile/16115218690707131186noreply@blogger.com0tag:blogger.com,1999:blog-4465803113249328154.post-75990575497345647312014-05-14T13:28:00.001+01:002014-05-14T13:28:59.497+01:00SLiMSuite Short Linear Motif discovery and analysis: Minor bug in GOPHER output with BLAST+<a href="http://slimsuite.blogspot.com/2014/05/minor-bug-in-gopher-output-with-blast.html?spref=bl">SLiMSuite Short Linear Motif discovery and analysis: Minor bug in GOPHER output with BLAST+</a>: A bug has been identified with the current SLiMSuite release when using BLAST+ to generate orthologue alignments with GOPHER. 
Sequences extr...Richard Edwardshttp://www.blogger.com/profile/16115218690707131186noreply@blogger.com0tag:blogger.com,1999:blog-4465803113249328154.post-26471936104064342542014-04-25T12:06:00.001+01:002014-04-25T12:07:45.813+01:00SLiMSuite Short Linear Motif discovery and analysis: Blog switchoverPosts and pages from this blog have now been imported into a new <a href="http://slimsuite.blogspot.com/2014/04/blog-switchover.html?spref=bl">SLiMSuite Short Linear Motif discovery and analysis</a> blog, which will take over as the main source of ongoing news, tips, documentation and updates. Posts will be cross-posted here for a while before eventually this blog is discontinued.Richard Edwardshttp://www.blogger.com/profile/16115218690707131186noreply@blogger.com0tag:blogger.com,1999:blog-4465803113249328154.post-40311741064678582162014-04-23T14:00:00.000+01:002014-04-23T14:00:33.442+01:00SLiMSuite 2014-04-22 now available<p>A <a href="http://www.southampton.ac.uk/~re1u06/software/packages/slimsuite/index.html">new download of SLiMSuite</a> (release 2014-04-22) is now available. As well as fixing the <code>gopher.py</code> error, the download page and <a href="http://www.southampton.ac.uk/~re1u06/software/packages/slimsuite/readme/readme.html">readme</a> have had a slight makeover, which should make them load quicker.</p>
<p>As part of ongoing consolidation and documentation, <code>SeqSuite</code> has now been incorporated into a single <code>SLiMSuite</code> download. (Previously, <code>SLiMSuite</code> was available as a reduced set of programs and <code>SeqSuite</code> had the full set.) The intention is to retire the <code>SeqSuite</code> moniker over the coming months, although the programs themselves will still be available.</p>
<p>The latest release also features a new program, <a href="http://www.southampton.ac.uk/~re1u06/software/packages/slimsuite/readme/tools/slimfarmer.html">SLiMFarmer</a>, for running (Q)SLiMFinder and SLiMProb batch jobs on parallel processors. <code>SLiMFarmer</code> is still under development; it should also work with other SLiMSuite programs, but this has not yet been tested.</p>
<p>Other miscellaneous updates are listed below. </p>
<H3>Updates since last release:</H3>
<p><b>• comparimotif_V3:</b> <i>Updated from Version 3.10.</i>
<br>→ Version 3.10: Added forking.
<br>→ Version 3.11: Added additional overlap/matchfix checks during basic comparison to try and speed up.
<br>→ Version 3.12: Replaced deprecated sets.Set() with set().
</p>
<p><b>• gablam:</b> <i>Updated from Version 2.11.</i>
<br>→ Version 2.12: Consolidated use of BLAST V2.
</p>
<p><b>• haqesac:</b> <i>Updated from Version 1.9.</i>
<br>→ Version 1.10: Added exceptions for BLAST failure.
</p>
<p><b>• picsi:</b> <i>Updated from Version 1.1.</i>
<br>→ Version 1.2: Updated to BUDAPEST 2.3 and rje_mascot.
</p>
<p><b>• pingu_V4:</b> <i>Created.</i>
<br>→ Version 4.0: Initial Compilation based on code from SLiMBench and PINGU 3.9 (inherited as pingu_V3).
<br>→ Version 4.1: Adding compilation of PPI databases using new rje_xref V1.1 and older objects from PINGU V3.
<br>→ Version 4.2: Bug fixes for use of PPISource to create PPI databases.
</p>
<p><b>• qslimfinder:</b> <i>Updated from Version 1.6.</i>
<br>→ Version 1.7: Fixed "MustHave=LIST" correction of motif space.
</p>
<p><b>• seqmapper:</b> <i>Updated from Version 2.0.</i>
<br>→ Version 2.1: Added catching of failure to read input sequences. Removed 'Run' from GABLAM table.
</p>
<p><b>• slimbench:</b> <i>Updated from Version 2.0.</i>
<br>→ Version 2.1: Fixed memsaver=T unless in development mode (dev=T). Removed old Assessment. Tested with simbench analysis.
<br>→ Version 2.2: Replaced searchini=LIST with searchini=FILE and moved to SimBench commands.
<br>→ Version 2.2: Modified the FN/TN and ResNum calculations. No longer rate TP in random data as OT.
</p>
<p><b>• slimfarmer:</b> <i>Created.</i>
<br>→ Version 0.0: Initial Compilation.
<br>→ Version 1.0: Functional version using rje_qsub and rje_iridis to fork out SLiMSuite runs.
<br>→ Version 1.1: Updated to use rje_hpc.JobFarmer and incorporate main SLiMSuite farming within SLiMFarmer class.
</p>
<p><b>• slimfinder:</b> <i>Updated from Version 4.5.</i>
<br>→ Version 4.6: Minor modification to seqocc=T function. !Experimental! Added main occurrence output and modified savespace.
</p>
<p><b>• slimmutant:</b> <i>Created.</i>
<br>→ Version 0.0: Initial Compilation.
<br>→ Version 1.0: Working version with standalone functionality.
</p>
<p><b>• slimprob:</b> <i>Updated from Version 1.0.</i>
<br>→ Version 1.1: Tidied import commands.
<br>→ Version 1.2: Increased extras=X levels. Adjusted maxsize=X assessment to be post-masking.
</p>
<p><b>• ned_rankbydistribution:</b> <i>Updated from Version 1.1.</i>
<br>→ Version 1.2: Replaced deprecated Set module.
</p>
<p><b>• rje:</b> <i>Updated from Version 4.8.</i>
<br>→ Version 4.9: Added rje.slimsuite, which determines the slimsuite home directory from rje.py file path.
<br>→ Version 4.10: Added osx=T/F option for Mac-specific running options.
</p>
<p><b>• rje_blast_V2:</b> <i>Updated from Version 2.4.</i>
<br>→ Version 2.5: Minor modifications for SLiMCore UPC generation.
<br>→ Version 2.6: Minor bug fixes.
</p>
<p><b>• rje_db:</b> <i>Updated from Version 1.2.</i>
<br>→ Version 1.3: Minor modifications for SLiMCore FUPC development.
<br>→ Version 1.4: Added list checking with addEmptyTable.
</p>
<p><b>• rje_dismatrix_V2:</b> <i>Updated from Version 2.9.</i>
<br>→ Version 2.10: Minor modifications for SLiMCore UPC.
</p>
<p><b>• rje_genemap:</b> <i>Updated from Version 1.4.</i>
<br>→ Version 1.5: Minor tweak of expected HGNC input following change to downloads.
</p>
<p><b>• rje_hpc:</b> <i>Created.</i>
<br>→ Version 1.0: Initial Compilation based on rje_iridis V1.10.
</p>
<p><b>• rje_iridis:</b> <i>Updated from Version 1.9.</i>
<br>→ Version 1.10: Modified freemem setting to run on Katana. Made rsh optional. Removed defunct IRIDIS3 option.
</p>
<p><b>• rje_obj:</b> <i>Updated from Version 1.3.</i>
<br>→ Version 1.4: Added sourceDataFile() method from SLiMBench for wider use.
<br>→ Version 1.5: Added 'basestr' and 'basefile' cmdlist types.
<br>→ Version 1.6: Added osx=T/F option for Mac-specific running options.
</p>
<p><b>• rje_qsub:</b> <i>Updated from Version 1.4.</i>
<br>→ Version 1.5: Added emailing of job stats after run. Added vmem limit.
</p>
<p><b>• rje_seq:</b> <i>Updated from Version 3.17.</i>
<br>→ Version 3.18: Minor BLAST+ bug fixes. Added exceptions to readBLAST failure.
</p>
<p><b>• rje_seqlist:</b> <i>Updated from Version 1.3.</i>
<br>→ Version 1.4: Added dna2prot reformat function.
</p>
<p><b>• rje_slimcore:</b> <i>Updated from Version 1.12.</i>
<br>→ Version 1.13: Modified the savespace settings to reduce numbers of files. targz file now uses RunID not Build Info.
<br>→ Version 1.14: Started adding code for Fragmented UPC (FUPC) clustering.
</p>
<p><b>• rje_slimlist:</b> <i>Updated from Version 1.2.</i>
<br>→ Version 1.3: Added auto-download of ELM data.
</p>
<p><b>• rje_uniprot:</b> <i>Updated from Version 3.14.</i>
<br>→ Version 3.14: Added dblist=LIST and dbsplit=T/F for additional DB link output control. Set unipath default to url.
<br>→ Version 3.15: Added extraction of taxonomic groups. Add UniFormat to improve pure downloads.
<br>→ Version 3.16: Added WBGene ID's from WormBase as one of the recognised DB XRef to parse.
<br>→ Version 3.17: Efficiency tweak to URL-based extraction of acclist.
<br>→ Version 3.18: Minor modification to database parsing.
</p>
<p><b>• rje_xref:</b> <i>Updated from Version 1.0.</i>
<br>→ Version 1.1: Added output of ID lists to text files. Major reworking. Tested with HPRD and HGNC.
</p>
Richard Edwardshttp://www.blogger.com/profile/16115218690707131186noreply@blogger.com0tag:blogger.com,1999:blog-4465803113249328154.post-66713038210216821202014-04-08T08:35:00.000+01:002014-04-08T08:35:17.305+01:00Missing gopher.py file<p>There is a bug with the current software download, with a file missing from the <code>libraries/</code> directory. The download will hopefully be updated soon but in the meantime please email <code>richard.edwards[at]unsw.edu.au</code> and I will send you the file.</p>Richard Edwardshttp://www.blogger.com/profile/16115218690707131186noreply@blogger.com0tag:blogger.com,1999:blog-4465803113249328154.post-86054461029811853532014-01-14T11:02:00.000+00:002014-01-14T11:02:34.883+00:00Using SLiMFinder on Phage Display Data (or other peptides)<p>Although SLiMFinder is designed with whole protein sequences in mind, it can also be used to identify statistically over-represented motifs in peptide data, including phage display results. Indeed, it is the third example application in <a href="http://www.southampton.ac.uk/~re1u06/downloads/edwards_et_al_2007.pdf">the original SLiMFinder paper</a>.</p>
<p>Unfortunately, the <a href="http://bioware.ucd.ie/~compass/biowareweb/Server_pages/slimfinder.php">SLiMFinder webserver</a> is currently not set up for phage display analysis, so if you are interested in this kind of work then you will need to <a href="http://www.southampton.ac.uk/~re1u06/software/packages/slimsuite/">download SLiMSuite</a>.</p>
<p>Suggested settings for phage display data are below. If anyone has a go and/or wants more advice, please get in touch. (If you try it, I’d be interested to hear how well it works!) Similarly, if you want some advice/ideas on how to combine the peptides with interaction data and full length protein sequences for a more sophisticated analysis, send me a bit more info and I’d be happy to make some suggestions.</p>
<h2>Custom settings for phage display data</h2>
<p>Here is an overview of the settings that should be tweaked for phage display analysis:</p>
<p><strong>Amino acid frequencies.</strong> One thing you will want to try is changing the way that the amino acid frequencies are used. By default, SLiMFinder will use the amino acid frequencies of the input dataset but for phage display peptides this is not really right as the peptides are clearly biased in their composition due to the motifs they contain. Instead, you probably want to set the amino acid frequencies for the background model to those of the human proteome (for human peptides) or even a uniform amino acid distribution. (Select frequencies that model the <em>pre-screening</em> amino acid frequencies.) This is done using the <code>aafreq=FILE</code> option, where <code>FILE</code> can be a fasta file of protein sequences or a delimited file of aa frequencies with the headings “AA” and “FREQ”. (See <a href="http://www.southampton.ac.uk/~re1u06/software/packages/slimsuite/docs/SLiMFinder%20Manual.pdf">the manual</a> for details.) If in doubt, try a few runs with different amino acid frequencies.</p>
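<p>As a rough illustration of the uniform-distribution option, the following standalone Python 3 sketch writes a frequency table with the “AA” and “FREQ” headings. The tab delimiter, the four-decimal formatting and the file name <code>uniform_aafreq.tdt</code> are assumptions made for the example, not documented requirements; check the manual for the exact format that <code>aafreq=FILE</code> expects.</p>

```python
import csv
import io

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def uniform_aafreq_rows():
    """Return (amino acid, frequency) pairs for a uniform background."""
    freq = 1.0 / len(AMINO_ACIDS)
    return [(aa, freq) for aa in AMINO_ACIDS]


def write_aafreq(handle, rows, delimiter="\t"):
    """Write rows under the 'AA' and 'FREQ' headings.

    The tab delimiter is an assumption; the post only says 'delimited'.
    """
    writer = csv.writer(handle, delimiter=delimiter, lineterminator="\n")
    writer.writerow(["AA", "FREQ"])
    for aa, freq in rows:
        writer.writerow([aa, "%.4f" % freq])


# Build the table in memory; to save it, pass an open file handle instead,
# e.g. open("uniform_aafreq.tdt", "w") (hypothetical file name).
buffer = io.StringIO()
write_aafreq(buffer, uniform_aafreq_rows())
print(buffer.getvalue())
```

<p>The resulting file could then be supplied as <code>aafreq=uniform_aafreq.tdt</code>. A proteome-derived background would replace <code>uniform_aafreq_rows()</code> with counts taken from the relevant sequences.</p>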
<p><strong>Evolutionary Filtering.</strong> Evolutionary filtering should be switched <strong>off</strong> (<code>efilter=F</code>) but you will also want to make sure that there is no redundancy in your peptides. (<code>rje_seq.py</code> can be used for this.)</p>
<p><strong>SLiMChance.</strong> If you are not so interested in the statistical significance and primarily want to use SLiMFinder to return a ranked list of interesting motifs in the data, set <code>sigcut=1.0</code> and choose the number of motifs to return with <code>topranks=X</code>.</p>
<p><strong>Ambiguity.</strong> Peptide data is usually pretty quick to run, and so it is probably worth exploring the full range of ambiguity with <code>combamb=T</code> (combined amino acid and variable-length wildcards). The basic <code>equiv=LIST</code> set for aa degeneracy should be OK for most jobs but you can easily tweak it to add or remove ambiguity combinations as appropriate.</p>
<p><strong>Masking.</strong> You will probably want to <strong>switch off all masking</strong> (<code>masking=F</code>). Low complexity masking <em>might</em> be useful but <code>metmask=F posmask=""</code> should be used as the N-termini are not true protein N-termini.</p>Richard Edwardshttp://www.blogger.com/profile/16115218690707131186noreply@blogger.com4tag:blogger.com,1999:blog-4465803113249328154.post-12147426755379763082013-12-03T06:37:00.000+00:002013-12-03T06:37:39.822+00:00File management for large SLiMSuite runs<p>The <a href="http://seqsuite.blogspot.com.au/2013/12/new-downloads-and-fixed-webpages.html">latest release</a> of <a href="http://www.southampton.ac.uk/~re1u06/software/packages/slimsuite/">SLiMSuite</a> features a slight modification to the way that files are generated and tidied, which can be beneficial for large runs.</p>
<p>Previously, a different results directory (<code>resdir=PATH</code>) was required for each different run to avoid dataset-specific results being over-written. The partial exception was the <code>*.pickle.gz</code> file, which included some <strong>SLiMBuild</strong> information in its name. (This is predominantly to speed up the ability of (Q)SLiMFinder to recognise when an intermediate pickle file can be used or not.) As of the latest release, the <strong>RunID</strong> (<code>runid=X</code>) is also now included in dataset-specific output, allowing results from several different runs (with different <code>RunID</code>s) to go into the same results directory.</p>
<p>The exception is the files that are created as part of the initial setup/SLiMBuild process: <code>*.slimdb</code>, <code>*.dis.tdt</code> and <code>*.upc</code>. From a given <code>Dataset</code> and <code>RunID</code>, the following files will therefore be generated in <code>ResDir/</code>:</p>
<pre><code>Dataset.RunID.cloud.txt
Dataset.RunID.mapping.fas
Dataset.RunID.maskaln.fas
Dataset.RunID.masked.fas
Dataset.RunID.motifaln.fas
Dataset.RunID.occ.csv
Dataset.dis.tdt
Dataset.#SLiMBuild-Text#.pickle.gz
Dataset.slimdb
Dataset.upc</code></pre>
<p>Note that the default <strong>ResDir</strong> is <code>SLiMFinder/</code>, <code>QSLiMFinder/</code> or <code>SLiMProb/</code> and the default <strong>RunID</strong> is the date and time of the run.</p>
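<p>The naming scheme above can be sketched in a few lines of standalone Python 3, which may be handy for checking whether a run has produced what you expect. The function name <code>expected_outputs</code> is hypothetical and not part of SLiMSuite. The pickle file is left out because its name depends on the SLiMBuild settings.</p>

```python
import os

# Suffixes taken from the file listing in the post.
RUN_SPECIFIC = ["cloud.txt", "mapping.fas", "maskaln.fas",
                "masked.fas", "motifaln.fas", "occ.csv"]
SETUP_FILES = ["dis.tdt", "slimdb", "upc"]


def expected_outputs(dataset, runid, resdir="SLiMFinder/"):
    """List the output paths expected for one Dataset/RunID combination.

    Hypothetical helper: it only mirrors the naming pattern described
    in the post and does not call any SLiMSuite code.
    """
    run_files = [os.path.join(resdir, "%s.%s.%s" % (dataset, runid, suffix))
                 for suffix in RUN_SPECIFIC]
    setup_files = [os.path.join(resdir, "%s.%s" % (dataset, suffix))
                   for suffix in SETUP_FILES]
    return run_files + setup_files


for path in expected_outputs("MyData", "run1"):
    print(path)
```

<p>Only the run-specific names change with the RunID, which is why several runs can share one results directory.</p>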
<h2 id="targzandsavespace">TarGZ and SaveSpace</h2>
<p>Obviously, the results directory can quickly fill up with files if there are multiple datasets and/or runs with different RunIDs. The way to get round this is to use the <code>targz=T</code> and <code>savespace=X</code> options.</p>
<p><code>targz=T</code> will package up all of the files associated with a specific run into a single <code>Dataset.RunID.tgz</code> file. This does not work on Windows. (Note that previous versions generated a <code>Dataset.tar.gz</code> file.) The <code>*.pickle.gz</code> file associated with the run will <em>not</em> be included in the tar file <em>unless</em> <code>savespace=2+</code> (see below).</p>
<p><strong>Note:</strong> the tar file is actually generated from the <em>run</em> directory, not the <em>results</em> directory and will include the relative path to <strong>ResDir</strong> in the tarred files. This means that if you enter <code>ResDir/</code> and then <code>tar -xzf Dataset.RunID.tgz</code>, an additional <code>ResDir/</code> will be created in which the files can be found. This is actually pretty useful as it allows the user to unpack individual runs and then delete the whole directory when finished. To return individual results to their “rightful” place, simply run the tar command from the same directory that the SLiMSuite program was run from (<em>e.g.</em> <code>tar -xzf ResDir/Dataset.RunID.tgz</code>).</p>
<p>The <code>savespace=X</code> option saves space by deleting excess files. <strong>It is strongly recommended that this is used in conjunction with <code>targz=T</code>.</strong> There are now four levels of <code>savespace=X</code>:</p>
<ul>
<li>0 = Delete no files</li>
<li>1 = Delete all bar *.upc and *.pickle (Pickle excluded from tar.gz with this setting)</li>
<li>2 = Delete all bar *.upc files (Pickle included in tar.gz with this setting)</li>
<li>3 = Delete all dataset-specific files including *.upc and *.pickle (not *.tar.gz)</li>
</ul>
<p>Another way to think of this is that <code>0</code> will delete nothing, <code>1</code> will leave enough files to rerun the <em>same</em> dataset/SLiMBuild combination, <code>2</code> will leave enough to run the same dataset with <em>additional</em> SLiMBuild settings, whilst <code>3</code> will cleanup absolutely everything.</p>
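<p>One way to picture the four levels is as a filter over the files in the results directory. The standalone Python 3 sketch below is only a reading of the list above, not the SLiMSuite implementation: the pattern table, the function name <code>files_kept</code> and the assumption that the tar archive always survives are all illustrative.</p>

```python
import fnmatch

# Patterns left on disk at each savespace level (an interpretation of the
# list in the post; level 0 keeps everything and is handled separately).
KEPT_PATTERNS = {
    1: ["*.upc", "*.pickle", "*.pickle.gz"],
    2: ["*.upc"],
    3: [],
}

# Assumption: the archive made by targz=T is never deleted by savespace.
ARCHIVES = ["*.tgz", "*.tar.gz"]


def files_kept(level, filenames):
    """Return the subset of filenames expected to remain after cleanup."""
    if level == 0:
        return list(filenames)
    patterns = KEPT_PATTERNS[level] + ARCHIVES
    return [name for name in filenames
            if any(fnmatch.fnmatchcase(name, pattern) for pattern in patterns)]


example = ["Dataset.RunID.occ.csv", "Dataset.upc", "Dataset.slimdb",
           "Dataset.Build.pickle.gz", "Dataset.RunID.tgz"]
for level in range(4):
    print(level, files_kept(level, example))
```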
<p>The recommended setting for running on a cluster or supercomputer is <code>targz=T savespace=1</code> unless file <em>numbers</em> are an issue, in which case <code>targz=T savespace=2</code> would be better. <code>targz=T savespace=3</code> is only really recommended when you are confident that all datasets will run to completion without issues. If there is a chance of nodes going down or walltimes being reached, it is better to keep the pickle files accessible for re-runs.</p>Richard Edwardshttp://www.blogger.com/profile/16115218690707131186noreply@blogger.com0tag:blogger.com,1999:blog-4465803113249328154.post-42805296713392862662013-12-03T05:56:00.000+00:002013-12-03T06:38:47.138+00:00New downloads and fixed webpages<p>New releases of <a href="http://www.southampton.ac.uk/~re1u06/software/packages/seqsuite/">SeqSuite</a> and <a href="http://www.southampton.ac.uk/~re1u06/software/packages/slimsuite/">SLiMSuite</a> are now available. The webpages have now hopefully been fixed too, including the broken Manual links. (A bit of trouble parsing some of the docstrings had messed up the HTML, in case you care!) Please report any more anomalies.</p>
<p>There are not many major updates since the last release. The biggest is that <strong>SLiMFinder</strong> (and <strong>QSLiMFinder</strong>) now produce a single <code>*.occ.csv</code> file containing motif instances for <em>all</em> datasets, in addition to the old dataset-specific files. This is to make the output more consistent with <strong>SLiMProb</strong>, although do note that some of the column headers are different. The new file contains the same data as the old dataset-specific <code>*.occ.csv</code> files plus two additional columns: <code>Dataset</code> and <code>RunID</code>. (These match the main <code>*.csv</code> output.)</p>
<p>Dataset-specific results files have also been cleaned up a little for <strong>(Q)SLiMFinder</strong> and <strong>SLiMProb</strong> (<em>i.e.</em> the <strong>SLiMCore</strong> Class in <code>libraries/rje_slimcore</code>) to make the <code>targz=T/F</code> and <code>savespace=X</code> options a little more useful and consistent. This will be the subject of <a href="http://seqsuite.blogspot.com.au/2013/12/file-management-for-large-slimsuite-runs.html">another post</a> shortly.</p>
<p>Other miscellaneous updates are listed below.</p>
<H3>Updates since last release:</H3>
<p><b>• comparimotif_V3:</b> <i>Updated from Version 3.10.</i>
<br>→ Version 3.10: Added forking.
<br>→ Version 3.11: Added additional overlap/matchfix checks during basic comparison to try and speed up.
</p>
<p><b>• qslimfinder:</b> <i>Updated from Version 1.6.</i>
<br>→ Version 1.7: Fixed "MustHave=LIST" correction of motif space.
</p>
<p><b>• slimfinder:</b> <i>Updated from Version 4.5.</i>
<br>→ Version 4.6: Minor modification to seqocc=T function. !Experimental! Added main occurrence output and modified savespace.
</p>
<p><b>• rje_pydocs:</b> <i>Updated from Version 2.8.</i>
<br>→ Version 2.8: Added docsource=PATH : Input path for Python Module documentation (manuals etc.) ['../docs/']
<br>→ Version 2.9: Attempts to fix some broken links and sort out manuals confusion
</p>
<p><b>• rje_slimcore:</b> <i>Updated from Version 1.12.</i>
<br>→ Version 1.13: Modified the savespace settings to reduce numbers of files. targz file now uses RunID not Build Info.
</p>
<p><b>• rje_uniprot:</b> <i>Updated from Version 3.14.</i>
<br>→ Version 3.14: Added dblist=LIST and dbsplit=T/F for additional DB link output control. Set unipath default to url.
<br>→ Version 3.15: Added extraction of taxonomic groups. Add UniFormat to improve pure downloads.
</p>
Richard Edwardshttp://www.blogger.com/profile/16115218690707131186noreply@blogger.com0tag:blogger.com,1999:blog-4465803113249328154.post-64323162196567480222013-11-29T00:55:00.002+00:002013-11-29T00:55:51.829+00:00Wonky webpagesIt has come to my attention that the formatting has got a bit messed up at the <a href="http://www.southampton.ac.uk/~re1u06/software/packages/slimsuite/">SLiMSuite download pages</a>. A new release of the downloads will be made soon and hopefully these kinks can get ironed out at the same time. (I'm not sure what's happened!)Richard Edwardshttp://www.blogger.com/profile/16115218690707131186noreply@blogger.com0tag:blogger.com,1999:blog-4465803113249328154.post-13755004385269271342013-11-15T09:51:00.000+00:002013-11-15T09:51:04.800+00:00SLiMSuite Down Under<p><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj3hz5tl-phh-KXEZkfjzOGi-Q79qN_qEVSP18Xij4rghxqAZpx6Ff_nWuKAvI_2PM3MWAweI-UobS-ehSHhXwnDbNZ_Z1gz54WaBm3x9hxId4jgi5JErBRn1ZDrnfW3teGPxoQbGO350QC/s1600/UNSW_LandscapeColourPos.tif" imageanchor="1" ><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj3hz5tl-phh-KXEZkfjzOGi-Q79qN_qEVSP18Xij4rghxqAZpx6Ff_nWuKAvI_2PM3MWAweI-UobS-ehSHhXwnDbNZ_Z1gz54WaBm3x9hxId4jgi5JErBRn1ZDrnfW3teGPxoQbGO350QC/s250/UNSW_LandscapeColourPos.tif" align="right" style="margin-left:10px" /></a>Rich has recently moved to Sydney, Australia to take up a position at the <a href="http://edwardslab.blogspot.co.uk/2013/11/now-at-university-of-new-south-wales.html">University of New South Wales (UNSW)</a>. As a result, things are a bit disrupted at present but a better-than-normal service should resume shortly, as should continuing to update the documentation. There are also plans to mirror the <a href="http://bioware.ucd.ie">Bioware servers</a> in UNSW, so watch this space. 
</p><p>If you are in Sydney and fancy a SLiM-related job, Rich also has a <a href="http://edwardslab.blogspot.co.uk/2013/11/postdoc-opportunity-in-short-linear.html">postdoc opportunity</a> at present.</p>Richard Edwardshttp://www.blogger.com/profile/16115218690707131186noreply@blogger.com0tag:blogger.com,1999:blog-4465803113249328154.post-48402843956475956382013-08-23T16:18:00.002+01:002013-08-30T09:20:14.871+01:00A note on using BLAST+ with SLiMSuite<p>One of the major changes in the <a href="http://seqsuite.blogspot.co.uk/2013/08/new-software-release.html">last release</a> was the incorporation of BLAST+ as a replacement for BLAST. It should be noted that BLAST+ has not been benchmarked with SLiMSuite and it is not clear how and when it will behave differently, particularly with regards to UPC generation (i.e. generating clusters of unrelated proteins). </p>
<p>Early indications are that BLAST+ has a greater tendency to return no hits for short sequences. This can cause issues with SLiMSuite programs if <code>oldblast=F</code>. This will be fixed in the next release but running with <code>dev=T</code> gets round this issue in the meantime.</p>
<p>Please note that UPC may be different with BLAST versus BLAST+. This will need to be the focus of further study.</p>
seqsuitehttp://www.blogger.com/profile/09978269566690653844noreply@blogger.com0tag:blogger.com,1999:blog-4465803113249328154.post-86877777143050537502013-08-22T14:30:00.000+01:002013-08-22T14:30:27.241+01:00Log Files<p>Every program generates a log file when it is run. By default, this file will be named after the calling program (<em>e.g.</em> <code>gasp.py</code> will produce a log called <code>gasp.log</code>) but this can be changed with the <code>log=FILE</code> option. The <code>basefile=X</code> option will also set the base name of the log file, as well as the main results files (for most programs). Logs will be appended unless the <code>newlog</code> (or <code>newlog=T</code>) option is used.</p>
<p>The log file records information that may help subsequent interpretation of results or identify problems. Each line is tab delimited in the form:</p>
<pre><code>#XXX HH:MM:SS Log Message.</code></pre>
<p>Where <code>#XXX</code> is an identifier that can be used to parse out specific types of information, <code>HH:MM:SS</code> is the runtime in hours, minutes and seconds, and <code>Log Message</code> will be something (hopefully) informative.</p>
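<p>Because each line is tab delimited, the three fields are easy to pull apart in a script. Here is a minimal standalone Python 3 sketch; the <code>LogEntry</code> tuple and <code>parse_log_line</code> function are hypothetical names for this example and are not part of SLiMSuite.</p>

```python
from collections import namedtuple

# Identifier without the leading '#', runtime in seconds, free-text message.
LogEntry = namedtuple("LogEntry", ["code", "runtime_seconds", "message"])


def parse_log_line(line):
    """Split one '#XXX<TAB>HH:MM:SS<TAB>Message' line into a LogEntry.

    Returns None for lines that do not fit the pattern, such as the
    '#~~#' separator written when appending to an existing log.
    """
    parts = line.rstrip("\n").split("\t", 2)
    if len(parts) != 3 or not parts[0].startswith("#"):
        return None
    code, clock, message = parts
    try:
        hours, minutes, seconds = (int(field) for field in clock.split(":"))
    except ValueError:
        # Not a valid HH:MM:SS runtime field.
        return None
    return LogEntry(code[1:], hours * 3600 + minutes * 60 + seconds, message)


print(parse_log_line("#DIR\t00:00:00\tRun from directory: RUNPATH"))
```

<p>Filtering on <code>code</code> then gives a quick way to pull out one type of information from a long log.</p>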
<p>All log files start with the same few lines: </p>
<pre><code>#~~# #~~# #~~#
#LOG 00:00:00 Activity Log for PROGRAM X.X: DATE TIME YEAR
#DIR 00:00:00 Run from directory: RUNPATH
#ARG 00:00:00 Commandline arguments: ARGLIST
#CMD 00:00:00 Full Command List: [FULL ARGLIST]</code></pre>
<p>This should contain all the information required to repeat the analysis:</p>
<ul>
<li><code>PROGRAM X.X: DATE TIME YEAR</code> will have the program name, version number and the date/time of the run.</li>
<li><code>RUNPATH</code> is the directory from which the program was run.</li>
<li><code>ARGLIST</code> is the list of command-line arguments given to the program.</li>
<li><code>FULL ARGLIST</code> is the full list of command-line arguments including any arguments read in from ini files.</li>
</ul>
<p>The last line can help identify the source of any unexpected behaviour due to default settings <em>etc.</em> </p>
<p>(The <code>#~~# #~~# #~~#</code> line is simply to act as a separator if appending an existing log file.)</p>
<p>If the program runs to completion successfully, it will end with another <code>#LOG</code> line:</p>
<pre><code>#LOG HH:MM:SS PROGRAM V:X.X End: DATE TIME YEAR</code></pre>
<p>If this line is not present then something went wrong during the run (see Error Messages, below) or it is still in progress. Other information is also recorded along with the runtime (<code>HH:MM:SS</code> since the program started). For help interpreting log files, please check the relevant software manual or contact me if the information is missing. (Hopefully, the log content is mostly self-explanatory but I shall add any explanations I have to send people to the relevant manual’s appendix.)</p>
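<p>For batch jobs it can be useful to check for that closing line automatically. The standalone Python 3 sketch below looks only at the most recent run in a log that may have been appended to. It assumes the separator line starts with <code>#~~#</code> and that the closing message contains <code>End:</code>, as in the examples above; the function names are hypothetical.</p>

```python
def last_run(lines):
    """Return the non-blank lines of the most recent run in a log."""
    start = 0
    for index, line in enumerate(lines):
        # Each run begins with a '#~~#' separator line.
        if line.startswith("#~~#"):
            start = index + 1
    return [line.rstrip("\n") for line in lines[start:] if line.strip()]


def run_completed(lines):
    """True if the latest run ends with a '#LOG ... End: ...' line."""
    run = last_run(lines)
    if not run:
        return False
    fields = run[-1].split("\t", 2)
    return len(fields) == 3 and fields[0] == "#LOG" and " End: " in fields[2]


example = [
    "#~~#\t#~~#\t#~~#",
    "#LOG\t00:00:00\tActivity Log for PROGRAM X.X: DATE TIME YEAR",
    "#LOG\t00:10:00\tPROGRAM V:X.X End: DATE TIME YEAR",
]
print(run_completed(example))
```

<p>A <code>False</code> result does not distinguish between a failed run and one that is still in progress, so it is best combined with a check of the job queue.</p>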
<h3 id="errormessages">Error Messages</h3>
<p>One of the most important aspects of the log file is to register any error messages. These are marked by an <code>#ERR</code> line header. Hopefully, there will not be any but if there was a problem with the run then these lines should contain the details. To catch these lines separately, <code>errorlog=FILE</code> will output error messages to an <em>additional</em> file.</p>seqsuitehttp://www.blogger.com/profile/09978269566690653844noreply@blogger.com0tag:blogger.com,1999:blog-4465803113249328154.post-15578326951046963752013-08-21T14:52:00.001+01:002013-08-21T22:42:07.723+01:00New Software Release<p>New releases of <a href="http://www.southampton.ac.uk/~re1u06/software/packages/slimsuite/">SLiMSuite</a> and <a href="http://www.southampton.ac.uk/~re1u06/software/packages/seqsuite/">SeqSuite</a> are now available. Please note that <b>RJESuite</b> has now been discontinued - for simplicity, all of the extra gubbins is now part of the SeqSuite release. <b>SLiMSuite</b> still represents a cut-down version that focuses on Short Linear Motif analysis tools.</p>
<p>There have been a number of updates since the last release, which will be the focus of future posts. The biggest change since the last release is the implementation of <a href="http://blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Web&PAGE_TYPE=BlastDocs&DOC_TYPE=Download">BLAST+</a> as the default in place of BLAST for most tools. The old BLAST can still be invoked using the <code>oldblast=T</code> switch. In addition to <code>blastpath=PATH</code>, a new <code>blast+path=PATH</code> parameter will need to be set.</p>
<p>Apart from some file organisation tweaks, the other major change is that <strong>CompariMotif</strong> now has a <code>memsaver=T</code> mode, which will process very large motif lists much quicker and avoid memory issues. The XGMML output is not (yet) available in this mode. For multi-processor CPUs and large <code>searchdb</code> motif lists, CompariMotif now also supports forking (<code>forks=X</code>).
</p>
<p>Documentation is in the process of having an overhaul and is still lagging behind as a result. Please ask if anything is unclear and that section of documentation will be prioritised.</p>
<H3>Updates since last release:</H3>
<p><b>• aphid:</b> <i>Updated from Version 2.0.</i>
<br>→ Version 2.1: Reduced import commands.
</p>
<p><b>• budapest:</b> <i>Updated from Version 2.1.</i>
<br>→ Version 2.2: Removed unrequired rje_dismatrix import.
<br>→ Version 2.3: Updated to use rje_blast_V2. Needs further updates for BLAST+. Deleted obsolete OLDreadMascot() method.
</p>
<p><b>• comparimotif_V3:</b> <i>Updated from Version 3.9.</i>
<br>→ Version 3.10: Added MemSaver option, which will read and process input motifs (not searchdb) one motif at a time.
<br>→ Version 3.10: Added forking.
</p>
<p><b>• fiesta:</b> <i>Updated from Version 1.5.</i>
<br>→ Version 1.6: Removed HAQESAC import (uses MultiHAQ).
<br>→ Version 1.7: Updated to use rje_blast_V2. Needs work to make function with BLAST+.
</p>
<p><b>• gablam:</b> <i>Updated from Version 2.10.</i>
<br>→ Version 2.11: Altered to use BLAST+ and rje_blast_V2.
</p>
<p><b>• gasp:</b> <i>Updated from Version 1.3.</i>
<br>→ Version 1.4: Minor tweaks to imports.
</p>
<p><b>• gfessa:</b> <i>Updated from Version 1.2.</i>
<br>→ Version 1.3: Tidied module imports.
<br>→ Version 1.4: Switched to rje_blast_V2. More work needed for BLAST+.
</p>
<p><b>• haqesac:</b> <i>Updated from Version 1.8.</i>
<br>→ Version 1.9: Added rje_blast_V2 implementation and BLAST+. Use oldblast=T for old BLAST.
</p>
<p><b>• peptcluster:</b> <i>Updated from Version 1.3.</i>
<br>→ Version 1.4: Bug fixes for end of sequence characters and different length peptides.
</p>
<p><b>• picsi:</b> <i>Updated from Version 1.0.</i>
<br>→ Version 1.1: Updated to blast_V2 and BLAST+.
</p>
<p><b>• pingu:</b> <i>Updated from Version 3.8.</i>
<br>→ Version 3.9: Tidied imports.
</p>
<p><b>• qslimfinder:</b> <i>Updated from Version 1.5.</i>
<br>→ Version 1.6: Removed excess module imports.
</p>
<p><b>• slimbench:</b> <i>Updated from Version 1.8.</i>
<br>→ Version 1.9: Added memsaver option. Replaced SLiMSearch with SLiMProb. Altered default IO paths.
<br>→ Version 1.9: Removed 3DID again: new ELM interaction_domains file has position-specific PPI details.
<br>→ Version 2.0: Major overhaul of input options to standardise/clarify. Implemented auto-downloads and PPI datasets.
</p>
<p><b>• slimprob:</b> <i>Updated from Version 1.0.</i>
<br>→ Version 1.1: Tidied import commands.
</p>
<p><b>• slimsuite:</b> <i>Created.</i>
<br>→ Version 0.0: Initial Compilation with downloadelm function.
</p>
<p><b>• rje_pydocs:</b> <i>Updated from Version 2.6.</i>
<br>→ Version 2.7: Added rje_ppi output for module links.
<br>→ Version 2.8: Added parsing of commandline options from docstring and cmdRead calls.
<br>→ Version 2.8: Added docsource=PATH : Input path for Python Module documentation (manuals etc.) ['../docs/']
</p>
<p><b>• rje:</b> <i>Updated from Version 4.6.</i>
<br>→ Version 4.7: Added self.warn list and self.warnLog() functions to Log object. Modified i=-1 quitchoice to raise not quit.
<br>→ Version 4.8: Added perc cmdtype = float that is multiplied by 100.0 if &lt; 1.0. Removed server option from iniCmds().
</p>
<p><b>• rje_ancseq:</b> <i>Updated from Version 1.2.</i>
<br>→ Version 1.3: Changed "biproblem" error handling in gaspProbs()
</p>
<p><b>• rje_blast_V1:</b> <i>Updated from Version 1.14.</i>
<br>→ Version 1.15: Added OldBLAST/Legacy option to Object for compatibility with rje_blast_V2. (Always True!)
</p>
<p><b>• rje_blast_V2:</b> <i>Updated from Version 2.1.</i>
<br>→ Version 2.2: Added gablamData() to return old-style GABLAM dictionary from table.
<br>→ Version 2.3: Added blastCluster() method to return UPC clustering and GABLAM distance matrix from a file.
<br>→ Version 2.4: Scrapped BLAST "Run" field to simplify code - keep a single run per BLASTRun object.
</p>
<p><b>• rje_db:</b> <i>Updated from Version 1.0.</i>
<br>→ Version 1.1: Added sortedEntries() function.
<br>→ Version 1.2: Added Table.hasField(field). Add openTable(), readEntry() and readSet() methods.
</p>
<p><b>• rje_forker:</b> <i>Created.</i>
<br>→ Version 0.0: Initial Compilation.
</p>
<p><b>• rje_iridis:</b> <i>Updated from Version 1.8.</i>
<br>→ Version 1.9: Added scanning of legacy folder - moving GOPHER_V2!
</p>
<p><b>• rje_obj:</b> <i>Updated from Version 1.0.</i>
<br>→ Version 1.1: Added rje_zen import and self.zen() to call rje_zen.Zen().wisdom().
<br>→ Version 1.2: Added warnLog functions.
<br>→ Version 1.3: Added perc cmdtype = float that is multiplied by 100.0 if &lt; 1.0. Also added cmdtype = date for YYYY-MM-DD.
</p>
<p><b>• rje_ppi:</b> <i>Updated from Version 2.7.</i>
<br>→ Version 2.8: Tweaked Spring Layout. Stores original Hub and Spoke Field.
</p>
<p><b>• rje_seq:</b> <i>Updated from Version 3.16.</i>
<br>→ Version 3.17: Updated to use BLAST+ and rje_blast_V2
</p>
<p><b>• rje_sequence:</b> <i>Updated from Version 2.2.</i>
<br>→ Version 2.3: Added alternative self.info keys for sequence (for UniProt splice variants). Added SpliceVar dict.
</p>
<p><b>• rje_slimcore:</b> <i>Updated from Version 1.10.</i>
<br>→ Version 1.11: Tidied some of the module imports.
<br>→ Version 1.12: Upgraded BLAST to BLAST+. Can use old BLAST with oldblast=T.
</p>
<p><b>• rje_slimlist:</b> <i>Updated from Version 1.1.</i>
<br>→ Version 1.2: Added some extra functions for CompariMotif Memsaver mode
</p>
<p><b>• rje_tree:</b> <i>Updated from Version 2.9.</i>
<br>→ Version 2.10: Added cleanup of *.r.csv file following R-based PNG generation.
</p>
<p><b>• rje_uniprot:</b> <i>Updated from Version 3.13.</i>
<br>→ Version 3.14: Added direct retrieval of UniProt entries from URL, including full proteomes. Updated output file naming.
<br>→ Version 3.14: Added dblist=LIST and dbsplit=T/F for additional DB link output control. Set unipath default to url.
</p>
<p><b>• rje_xml:</b> <i>Updated from Version 0.1.</i>
<br>→ Version 0.2: Added parsing from URL.
</p>
<p><b>• rje_xref:</b> <i>Updated from Version 0.0.</i>
<br>→ Version 1.0: Added xfrom and xto fields and xMap() function for mapping from one ID set to another.
</p>
seqsuitehttp://www.blogger.com/profile/09978269566690653844noreply@blogger.com0tag:blogger.com,1999:blog-4465803113249328154.post-2796107696360309372013-08-21T09:30:00.000+01:002013-08-21T17:21:54.584+01:00External Components of SeqSuite<p>In addition to the Python modules included in the <a href="http://seqsuite.blogspot.com/2013/07/availability-installation-and-setup.html">main downloads</a>, some of the programs make use of additional published programs. Wherever possible, these are freely available for downloading and installing. It is recommended that the user downloads and installs these programs according to the instructions given on the appropriate website. </p>
<h3>Common programs</h3>
<p>Some of the more common programs are listed below. The websites and instructions listed are subject to change, so it is advisable to Google for updated information if in doubt.
</p>
<p><strong>ALIGN:</strong> This is part of the Fasta package (Pearson, 1994; Pearson, 2000) and can be downloaded from the <a href="http://fasta.bioch.virginia.edu/">University of Virginia</a>. Make sure that align is part of the download. For some reason it seems to have been dropped from later packages. You may need to install an earlier package first (e.g. 2.1) and then a later package. <em>ALIGN is <strong>not</strong> a core component of any SeqSuite program and need not be installed.</em></p>
<p><strong>BLAST(+):</strong> BLAST (Altschul, et al., 1990) and BLAST+ are freely available for <a href="http://blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Web&PAGE_TYPE=BlastDocs&DOC_TYPE=Download">download from NCBI</a>. BLAST has now largely been superseded by BLAST+ but some programs are still restricted to BLAST at the moment. Other tools can be made to use BLAST using <code>oldblast=T</code>.</p>
<p><strong>CLUSTALW:</strong> ClustalW (Higgins and Sharp, 1988; Thompson, et al., 1994) is an old stalwart for bioinformatics and is freely available from EMBL: <a href="ftp://ftp-igbmc.u-strasbg.fr/pub/ClustalW/">ftp://ftp-igbmc.u-strasbg.fr/pub/ClustalW/</a>. Note that CLUSTALW is used as a backup for ClustalO (below) and to draw trees. See <em>Replacing Components with Other Programs</em> (below) for details of how to incorporate other tree-drawing packages.</p>
<p><strong>CLUSTAL Omega:</strong> CLUSTALO is a newer multiple alignment program from the Clustal team, available from <a href="http://www.clustal.org/omega/">clustal.org</a>. (See below for more multiple alignment options.)</p>
<blockquote>
<p>"The last alignment program you'll ever need."</p>
</blockquote>
<p><strong>R:</strong> The statistical programming language, R, is used for PNG visualisation by some SeqSuite programs. R is freely available from: <a href="http://cran.r-project.org/">http://cran.r-project.org/</a>. Note that some installations of R can require a bit of tweaking of the R scripts provided (in <code>libraries/r/</code>). Please email <code>seqsuite@gmail.com</code> if you require some help with this and/or have problems with the R-coded PNG visualisations.</p>
<p>It is recommended that paths to these programs are placed into an INI file (see <a href="http://seqsuite.blogspot.co.uk/2013/08/command-line-options.html">Command-line Options</a>). These can usually be replaced with different programs if desired (see <em>Replacing Components with Other Programs</em>, below).</p>
<h3>Replacing Components with Other Programs</h3>
<p>The most important functions performed by the external programs are alignment and tree-drawing. This section lists some ways to incorporate alternative programs for these functions into RJE programs. I am always interested to add more functionality, so if there is a program you would like to use instead of those listed, then please contact me and I may be able to add it in a more controlled fashion than below.</p>
<h4>Alignment programs</h4>
<p>By default, Clustal Omega is used for alignments as I have found this to be both fast and accurate. There can be problems with memory allocation for larger datasets, and so ClustalW (Higgins and Sharp, 1988; Thompson, et al., 1994) is used for datasets above a certain total number of residues (as determined by the <code>cwcut=X</code> parameter). Either of these programs can be replaced, however, by another program that uses the same command-line format to call the program.</p>
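<p>The size-based fallback can be pictured with a minimal sketch. The function name <code>choose_aligner</code> and the threshold used in the example are hypothetical and are not taken from the SeqSuite code; only the <code>cwcut</code> option name comes from the text above.</p>

```python
def choose_aligner(sequences, cwcut):
    """Pick an alignment program from the total number of residues.

    sequences: list of sequence strings (gap characters are not counted).
    cwcut: residue total above which ClustalW is used (value is an assumption).
    """
    total_residues = sum(len(seq.replace('-', '')) for seq in sequences)
    if total_residues > cwcut:
        return 'clustalw'   # large dataset: fall back to ClustalW
    return 'clustalo'       # default: Clustal Omega


# Two short peptides stay well under an illustrative cut-off of 1000 residues.
print(choose_aligner(['MKVLA', 'MKILA'], cwcut=1000))
```

<p>Whether the real cut-off is inclusive or exclusive is not stated here, so treat the comparison as illustrative only.</p>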
<p>For ClustalW, the system call is:</p>
<pre><code>clustalw INFILE
</code></pre>
<p>where INFILE is in fasta format (<code>*.fas</code>) and the output file (<code>*.aln</code>) is in ClustalW align format. The path to ClustalW can be changed to redirect to another program using the <code>clustalw=COMMAND</code> option. (This may be written as <code>clustalw=PATH</code> in places but the full path including the clustalw program should be given.)</p>
<p>The following alignment program options can currently be used with SeqSuite programs:</p>
<pre><code>clustalw=COMMAND : Path to CLUSTALW program ['clustalw']
clustalo=COMMAND : Path to CLUSTAL Omega program ['clustalo']
mafft=COMMAND : Path to MAFFT alignment program ['mafft']
muscle=COMMAND : Path to MUSCLE ['muscle']
fsa=COMMAND : Path to FSA alignment program ['fsa']
pagan=COMMAND : Path to PAGAN alignment program ['pagan']
alnprog=X : Choice of alignment program to use (clustalw/clustalo/muscle/mafft/fsa/pagan) [clustalo]
</code></pre>
<p>Any of these could be replaced with another script or program with the same input/output. For example, <code>muscle=PATH</code> could be used to redirect to any program using the system: <code>program -in INFILE -out OUTFILE</code>, where INFILE and OUTFILE are both fasta format. (Remember to set <code>alnprog=muscle</code>.)</p>
<h4>Tree-drawing programs</h4>
<p>The default for SeqSuite programs is to use the Neighbour-joining method implemented in ClustalW for drawing trees. Although this is not the most accurate phylogeny construction algorithm around, it is fast and efficient and reasonable for trees of closely-related sequences with high bootstrap support,
such as those HAQESAC was designed to build and work with. </p>
<p>Again, this program can be replaced with another using the <code>maketree=PATH</code> option. The system call used is:</p>
<pre><code>clustalw -infile=INFILE -bootstrap=X -seed=X [-kimura]
</code></pre>
<p>for UNIX, or</p>
<pre><code>clustalw INFILE -bootstrap=X -seed=X [-kimura]
</code></pre>
<p>for Windows, where <code>INFILE</code> is in fasta format (<code>*.fas</code>) and the output file (<code>*.phb</code>) is in bootstrapped Phylip format (I think).</p>
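<p>As a rough illustration, the two forms of the call could be assembled as below. The function <code>clustalw_tree_command</code> is a hypothetical helper, not part of <code>rje_tree</code>; the argument layout is copied from the system calls shown above.</p>

```python
def clustalw_tree_command(infile, bootstraps, seed, kimura=False,
                          win32=False, clustalw='clustalw'):
    """Build the ClustalW tree-drawing call as a list of arguments."""
    # UNIX uses -infile=INFILE; the Windows form gives INFILE directly.
    first = infile if win32 else '-infile=%s' % infile
    command = [clustalw, first, '-bootstrap=%d' % bootstraps, '-seed=%d' % seed]
    if kimura:
        command.append('-kimura')   # optional switch, shown as [-kimura] above
    return command


print(' '.join(clustalw_tree_command('example.fas', 1000, 42)))
print(' '.join(clustalw_tree_command('example.fas', 1000, 42, win32=True)))
```

<p>A replacement program given with <code>maketree=PATH</code> would need to accept the same arguments, which is what the <code>clustalw</code> parameter stands in for here.</p>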
<p>It should work to have a program output a Newick Standard Format tree as <code>*.nsf</code> but I have not tested that.
Phylip tree-drawing is also implemented. See <code>rje_tree</code> module documentation for details. Other phylogenetics programs can be added on request - anything able to generate Phylip or Newick format trees should be easy to add.</p>
<h4>Wrapper scripts</h4>
<p>If the chosen program does not accept the same input/output commands/formats then a wrapper script should be written. It is suggested to use Perl or Python for this. Although I cannot promise help in every suggestion, you are welcome to e-mail me for help with this and I will see what I can do.</p>
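<p>A wrapper can be very small. The sketch below accepts the MUSCLE-style call (<code>-in INFILE -out OUTFILE</code>) and passes the files on to a different program. The program name <code>myaligner</code> and its <code>--input</code>/<code>--output</code> flags are invented placeholders; substitute the real command and options of the program being wrapped.</p>

```python
import subprocess
import sys


def translate_args(argv):
    """Convert '-in INFILE -out OUTFILE' into the wrapped program's call."""
    # Pair each flag with the value that follows it: {'-in': ..., '-out': ...}
    options = dict(zip(argv[0::2], argv[1::2]))
    # 'myaligner' and its flags are hypothetical placeholders.
    return ['myaligner', '--input', options['-in'], '--output', options['-out']]


def main(argv):
    """Run the wrapped program and return its exit code."""
    return subprocess.call(translate_args(argv))

# To use this as a script, finish the file with:
#     sys.exit(main(sys.argv[1:]))
```

<p>The wrapper would then be supplied in place of MUSCLE, e.g. <code>muscle=/path/to/wrapper.py alnprog=muscle</code>, assuming the script is executable on your system.</p>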
<h4>Incorporating Other Programs into the Python Code</h4>
<p>If you are feeling brave, you can actually edit the Python modules themselves. The key methods for this are <code>rje_seq.muscleAln()</code>, <code>rje_seq.clustalAln()</code> and <code>rje_tree.makeTree()</code>. Obviously, I cannot promise to give technical support for any changes that are made but, if you know what you are doing, you should be OK and I will help where I can.</p>
<h3>References</h3>
<p><em>This reference list needs completing but references for the older software listed include:</em></p>
<ul>
<li>Altschul SF, Gish W, Miller W, Myers EW and Lipman DJ (1990). Basic local alignment search tool. J Mol Biol, 215: 403-410.</li>
<li>Edgar RC (2004). MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics, 5: 113.</li>
<li>Higgins DG and Sharp PM (1988). CLUSTAL: a package for performing multiple sequence alignment on a microcomputer. Gene, 73: 237-244.</li>
<li>Pearson WR (1994). Using the FASTA program to search protein and DNA sequence databases. Methods Mol Biol, 24: 307-331.</li>
<li>Pearson WR (2000). Flexible sequence similarity searching with the FASTA3 program package. Methods Mol Biol, 132: 185-219.</li>
<li>Thompson JD, Higgins DG and Gibson TJ (1994). CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res, 22: 4673-4680.</li>
</ul>
seqsuitehttp://www.blogger.com/profile/09978269566690653844noreply@blogger.com0tag:blogger.com,1999:blog-4465803113249328154.post-50263334255434466072013-08-20T10:10:00.000+01:002013-08-21T17:08:28.974+01:00Command-line Options<p>The behaviour of all of the programs is subject to modification via the setting of command-line options. Some of these are generic and apply to most/all SLiMSuite programs - see the <code>rje.py</code> documentation for these, or the section below - whereas others are program specific.</p>
<h3>Setting commandline options</h3>
<p>Commandline options have two parts: the <em>argument</em> and the <em>value</em>. These can be fed to programs in one of two formats:</p>
<pre><code>argument=value
-argument value
</code></pre>
<p>These two lines have equivalent functions. The two styles can be mixed within a program call, <em>e.g.</em></p>
<pre><code>python program.py arg1=val1 -arg2 val2
</code></pre>
<p>Options can also be supplied within <code>*.ini</code> files (see below).</p>
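<p>To make the equivalence of the two styles concrete, here is a minimal sketch of a parser that accepts both. It is not the parser used by <code>rje.py</code>; the function name <code>parse_commands</code> and the choice to record a bare switch as <code>T</code> are assumptions made for illustration.</p>

```python
def parse_commands(argv):
    """Return (argument, value) pairs from a mixed-style option list."""
    commands = []
    tokens = list(argv)
    while tokens:
        token = tokens.pop(0)
        if '=' in token:
            # argument=value style
            argument, value = token.split('=', 1)
        elif (token.startswith('-') and tokens
              and '=' not in tokens[0] and not tokens[0].startswith('-')):
            # -argument value style: the next token is the value
            argument, value = token[1:], tokens.pop(0)
        else:
            # bare switch with no value: treated as "on" in this sketch
            argument, value = token.lstrip('-'), 'T'
        commands.append((argument, value))
    return commands


print(parse_commands(['arg1=val1', '-arg2', 'val2']))
```

<p>Note that a negative value such as <code>i=-1</code> is unambiguous in the <code>argument=value</code> style, which is one reason to prefer it.</p>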
<h3>Option Types</h3>
<p>There are essentially three types of command-line option:</p>
<ol>
<li>Those that require a value (numerical or text), <code>option=X</code>. Those that require a filename as the value will be written: <code>option=FILE</code>. Those that require a directory path as the value will be written: <code>option=PATH</code>. Those that lead to an accessory application (rather than just its path) may also be listed as <code>option=COMMAND</code>. Paths and filenames should always use forward slash (<code>/</code>) separators, whatever the operating system.</li>
<li>
True/False (On/Off) options, <code>option=T/F</code>. For these options:
<ul>
<li><code>option=F</code> and <code>option=False</code> are the same and turn the option off.</li>
<li><code>option</code> (or <code>-option</code>), <code>option=T</code> and <code>option=True</code> are the same and turn the option on.</li>
</ul>
</li>
<li>List options. These are like the value options but have multiple values, separated by commas: <code>option=X,Y</code>. Where <code>..</code> is used, the number of elements is optional, e.g. <code>option=X,Y,..,Z</code> could take <code>option=X</code> or <code>option=A,B,C,D</code>. Where <code>option=LIST</code> is used, the number of elements is optional and <code>LIST</code> could actually be the name of a file containing the list of elements.</li>
</ol>
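<p>The True/False and list types could be interpreted along the following lines. This is a sketch under stated assumptions rather than the actual SLiMSuite code: the helper names are hypothetical, and the one-element-per-line file layout for <code>LIST</code> files is a guess.</p>

```python
import os


def as_bool(value):
    """Interpret a T/F option value: T or True is on, F or False is off."""
    return value[:1].upper() == 'T'


def as_list(value):
    """Split a list option on commas, or read it from a file if one exists.

    Assumption: a LIST file holds one element per line.
    """
    if os.path.isfile(value):
        with open(value) as handle:
            return [line.strip() for line in handle if line.strip()]
    return [item for item in value.split(',') if item]


print(as_bool('True'), as_bool('F'))
print(as_list('A,B,C,D'))
```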
<h3>Long option values, whitespace and special characters</h3>
<p>Some characters, such as whitespace, commas, pipes (“|”) and ampersands, will be interpreted by UNIX in particular ways from the commandline. If you have such characters within the option value, then either place the settings in an INI file (see below) or enclose the option value in quotes. If the value contains whitespace, double quotes will be needed even within an INI file, as whitespace is used to delimit commandline options, <em>e.g.</em></p>
<pre><code>python program.py option="Two words" limits="2,3"
</code></pre>
<p>NB. For PATH variables, directories should be separated by a forward slash (<code>/</code>). If paths contain spaces, they <strong>must</strong> be enclosed in double quotes: </p>
<pre><code>path="example path"
</code></pre>
<p>It is recommended that paths do not contain spaces as function cannot be guaranteed if they do.</p>
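<p>The effect of quoting can be checked with Python's standard <code>shlex</code> module, which splits a string using POSIX shell-like rules. The snippet only demonstrates how whitespace and double quotes divide a command into arguments; it says nothing about how SLiMSuite itself processes them afterwards.</p>

```python
import shlex

# With double quotes, the two-word value stays in one argument.
quoted = shlex.split('python program.py option="Two words" limits="2,3"')
print(quoted)

# Without quotes, the value is split at the space and "words" is orphaned.
unquoted = shlex.split('python program.py option=Two words')
print(unquoted)
```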
<a name="INI"><h3>INI Files</h3></a>
<p>As well as feeding commands in on the command-line, any options listed can also be saved in a plain text file and called using the option <code>ini=FILE</code>. The precedence of loading default run settings from <code>ini</code> files is slightly complex but (hopefully) makes sense once it is clear that there are two kinds of precedence being invoked:</p>
<ol>
<li>
For each <strong><code>ini</code> file</strong> there is a directory precedence determining where to look for that file. Once the file is found, commands from that file will be read in and the program will <em>stop looking for other versions of the file</em>. Each <code>ini</code> file is looked for:
<ul>
<li>in the current directory from which the run command is being executed</li>
<li>the directory containing the program being run. (Under usual circumstances, it is not recommended to put <code>ini</code> files in this directory; use the next location instead.)</li>
<li>the <code>settings/</code> directory of the distribution. This is the recommended location for default <code>ini</code> files and universal default values for all runs should be put here.</li>
</ul>
</li>
<li>
For each <code>ini</code> file that <em>is</em> read in, each <strong>command</strong> has a setting precedence as described below, such that later values will over-rule earlier values for the same argument. Default <code>ini</code> files (<em>if present</em>) are read in the following order:
<ul>
<li>Global defaults are read from a <code>defaults.ini</code> file. (This is recommended.)</li>
<li>System defaults are read from an <code>rje.ini</code> file. (This file is not recommended and is largely for development reasons.)</li>
<li>Program defaults are read from the file named after the program (<em>e.g.</em> <code>haqesac.ini</code> for HAQESAC). (This will be the same root filename as the default <code>*.log</code> file if you are not sure.)</li>
</ul>
</li>
</ol>
<p>For example, if you are running <code>haqesac.py</code> in a directory containing <code>haqesac.ini</code>, the full list of commandline arguments will be any in <code>PATH/settings/defaults.ini</code> (if it exists) plus any in <code>PATH/settings/rje.ini</code> (if it exists) plus the contents of <code>./haqesac.ini</code> plus the options given on the commandline. If, on the other hand, there is no <code>./haqesac.ini</code> file, options will instead be read from <code>PATH/settings/haqesac.ini</code> (if it exists). (The <code>PATH/</code> is determined using the path given to the <code>haqesac.py</code>.) If any of these files have been placed in <code>tools/</code> instead (<em>not</em> recommended), these will be used in place of those from <code>settings/</code>.
</p>
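<p>The two kinds of precedence can be sketched as two small functions: one that finds a single copy of each <code>ini</code> file, and one that lists the default files in reading order. The function names and the explicit directory arguments are hypothetical; the real programs work the directories out for themselves.</p>

```python
import os


def find_ini(name, run_dir, program_dir, settings_dir):
    """Return the first existing copy of an INI file, or None."""
    for directory in (run_dir, program_dir, settings_dir):
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate   # stop looking once one copy is found
    return None


def default_ini_files(program, run_dir, program_dir, settings_dir):
    """List the default INI files that exist, in the order they are read."""
    found = []
    for name in ('defaults.ini', 'rje.ini', program + '.ini'):
        path = find_ini(name, run_dir, program_dir, settings_dir)
        if path:
            found.append(path)
    return found
```

<p>Options from files later in the returned list, and then from the commandline itself, over-rule earlier values for the same argument.</p>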
<p>It is recommended that a <code>defaults.ini</code> file is made and placed in the <code>settings/</code> directory. This file should contain the paths to the External Programs used by RJE programs:</p>
<pre><code>blastpath=PATH
blast+path=PATH
fastapath=PATH
clustalw=COMMAND
muscle=COMMAND
</code></pre>
<p>Note that the first three are just paths to the programs, while for ClustalW and MUSCLE the actual program commands themselves must be included. This is to make it easier to replace these programs with alternatives. <!--[Help section link to be added.]--></p>
<p>If running in windows, it is also advisable to add the <code>win32=T</code> command to the <code>defaults.ini</code> file.</p>
<h4>INI File formatting</h4>
<p>INI files are simple plain text files. Several commands can be put on a single line, although it is generally clearer to stick to one command per line. Any text on a line following a hash (<code>#</code>) will be treated as a comment and ignored unless it is part of an option value in double quotes. This allows INI files to be documented.</p>
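<p>These formatting rules are simple enough to sketch in a few lines. The helper names are hypothetical and the code is not taken from <code>rje.py</code>; it only illustrates comment stripping that respects double quotes, followed by splitting a line into its commands.</p>

```python
import shlex


def strip_comment(line):
    """Remove text after a hash unless the hash is inside double quotes."""
    in_quotes = False
    for position, char in enumerate(line):
        if char == '"':
            in_quotes = not in_quotes
        elif char == '#' and not in_quotes:
            return line[:position]
    return line


def ini_commands(line):
    """Split one INI line into its commands, honouring double quotes."""
    return shlex.split(strip_comment(line))


print(ini_commands('blastpath=/usr/ncbi/bin/ win32=T  # BLAST location'))
print(ini_commands('title="Run #1" # the hash inside the quotes is kept'))
```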
<h3>Option Precedence</h3>
<p>Later options will supersede earlier ones if they are mutually exclusive. Options from an INI file will be inserted into the list at the point the <code>ini=FILE</code> command is called. (Default <code>*.ini</code> files are read in the order listed above, <em>i.e.</em> options from the <code>defaults.ini</code> file are read first, followed by the <code>program.ini</code> file.) This means that ini file options can be over-ruled, e.g. <code>program.py ini=eg.ini i=1</code> will supersede any interactivity setting in <code>eg.ini</code> with <code>i=1</code>, whereas <code>program.py i=1 ini=eg.ini</code> will use any interactivity setting in <code>eg.ini</code> and over-rule <code>i=1</code>.</p>
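<p>The two example calls can be traced with a short sketch. Here the contents of <code>eg.ini</code> are held in a dictionary rather than read from disk, and the function names are hypothetical; the point is only that INI options are spliced in where <code>ini=FILE</code> appears and that the last value for an argument wins.</p>

```python
def expand_commands(argv, ini_contents):
    """Insert INI options at the point where ini=FILE appears."""
    expanded = []
    for command in argv:
        if command.startswith('ini='):
            expanded.extend(ini_contents[command[4:]])
        else:
            expanded.append(command)
    return expanded


def final_settings(commands):
    """Apply commands in order so that later values replace earlier ones."""
    settings = {}
    for command in commands:
        argument, value = command.split('=', 1)
        settings[argument] = value
    return settings


# Stand-in for an eg.ini file that sets interactivity and verbosity.
eg = {'eg.ini': ['i=0', 'v=1']}
print(final_settings(expand_commands(['ini=eg.ini', 'i=1'], eg))['i'])  # 1
print(final_settings(expand_commands(['i=1', 'ini=eg.ini'], eg))['i'])  # 0
```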
<h3>Interactivity and Verbosity settings</h3>
<p>By default, the programs are generally set up to run through to completion without any user interaction if given all the options they need. For more interaction with the program as it runs, use the argument <code>i=1</code>.</p>
<pre><code>python xxx.py commandlist i=1
</code></pre>
<p>Both the level of interactivity and the amount printed to screen can be altered, using the interactivity <code>i=X</code> and verbosity <code>v=X</code> command-line options, respectively, where <code>X</code> is the level from none (-1) to lots (2+). Although in theory <code>i=-1</code> and <code>v=-1</code> will ask for nothing and show nothing, there is a chance that some print statements will have escaped in these early versions of the program. There is also the possibility that accessory programs may print things to the screen beyond the control of the calling program. Please report any that you spot!</p>
<p>Please report any irritations and suggestions for changes to what is printed at different verbosity levels.</p>
<h3>General Command-line Options</h3>
<p>Along with some of the options listed above, there are a number of core options that are used in many or all of the SLiMSuite programs. Defaults are given in square brackets. </p>
<blockquote>
<p><strong>NOTE:</strong> Default settings might vary between programs. To set global defaults, it is recommended to put these options in the <code>defaults.ini</code> file.</p>
</blockquote>
<h4>Help and Program Logs</h4>
<pre><code>help : Prints help documentation to screen.
v=X : Sets screen verbosity (-1 for silent) [0]
i=X : Sets interactivity (-1 for full auto) [0]
silent=T/F : If set to True will not write to screen or log. [False]
log=FILE : Redirect log to FILE [program.log]
newlog=T/F : Create new log file. [False]
errorlog=FILE : If given, will write errors to an additional error file. [None]
</code></pre>
<h4>General Input/Output Options</h4>
<pre><code>outfile=FILE : This will set the 'root' filename for (non-log) output files in most programs (FILE.*) [None]
basefile=FILE : Equivalent of log=FILE outfile=FILE. [None]
force=T/F : Force to regenerate data rather than keep old results. [False]
append=T/F : Append to results files rather than overwrite. [False]
backups=T/F : If True, option given to backup certain files if append=F. [True]
delimit=X : Sets standard delimiter for results output files. [varies]
mysql=T/F : “MySQL output” with lowercase headers that lack spacers. (Not all programs) [False]
</code></pre>
<h4>System settings</h4>
<pre><code>win32=T/F : Run in Win32 Mode for Windows operation. [False]
memsaver=T/F : Run in “Memory Saver” mode. Varies with program. [False]
runpath=PATH : Run program as if in given path (log files and some programs only) [PATH called from]
rpath=COMMAND : Path to installation of R. ['R']
maxbin=X : Maximum number of trials for using binomial (else use Poisson) [∞]
</code></pre>
<h4>Forking Options</h4>
<pre><code>forks=X : Number of forks. (Some programs only.) [0]
killforks=X : Number of seconds of inactivity before killing forks. [3600]
noforks=T/F : Over-ride and cancel forking if True. [False]
</code></pre>
<p>This information is also available by printing the <code>__doc__</code> attribute of the <code>rje.py</code> module at a Python prompt (<code>print rje.__doc__</code>), or using the help option: <code>python rje.py help</code>. Please contact me if you want any further details of a specific option and/or advice as to when (not) to use it.</p>
seqsuitehttp://www.blogger.com/profile/09978269566690653844noreply@blogger.com0tag:blogger.com,1999:blog-4465803113249328154.post-36482664578806576272013-08-06T10:02:00.000+01:002013-08-06T10:05:20.133+01:00Updated programs coming soon...SLiMSuite and Seqsuite have been undergoing some tidying and additional tweaks, such as implementing BLAST+ in most programs. The documentation is also undergoing a bit of an overhaul (see the <b>Documentation</b> links in the left sidebar) and so the distribution of the latest code is being held back for a while. If you want access to the latest versions, however, feel free to get in touch. (Particularly if you want to use BLAST+ with <a href="http://seqsuite.blogspot.co.uk/2012/05/slimsuite-servers-and-programs.html">SLiMSuite</a> or HAQESAC.)seqsuitehttp://www.blogger.com/profile/09978269566690653844noreply@blogger.com0tag:blogger.com,1999:blog-4465803113249328154.post-83782349435289638542013-08-01T21:58:00.000+01:002013-08-01T21:58:37.457+01:00New look Bioware<p>The <a href="http://bioware.ucd.ie">Bioware server</a> has a new(ish!) look! The function of the tools should be much the same (although various updates are in progress) but the feel of the site should hopefully be cleaner and more consistent on mobile devices. Feedback welcome!</p>
<p><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgqlhv6PulcmX3jFfRIjckd3GGY2lkbc0z4kaf249VGslW4lzsTNkObIsfUAMbNzLIDa__GRYwnQ_vEQlLL5E4KC81GtYWsnvdxFsvev7zLT8a3UjFbPXQAqM2NBAjXLt82R9ufJ85FX41a/s1600/Bioware+Screen+Shot.png" imageanchor="1" ><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgqlhv6PulcmX3jFfRIjckd3GGY2lkbc0z4kaf249VGslW4lzsTNkObIsfUAMbNzLIDa__GRYwnQ_vEQlLL5E4KC81GtYWsnvdxFsvev7zLT8a3UjFbPXQAqM2NBAjXLt82R9ufJ85FX41a/s400/Bioware+Screen+Shot.png" width="100%" /></a></p>seqsuitehttp://www.blogger.com/profile/09978269566690653844noreply@blogger.com0tag:blogger.com,1999:blog-4465803113249328154.post-86235284854058606812013-08-01T10:29:00.000+01:002013-08-21T15:20:47.133+01:00Availability, Installation and Setup<p>SLiMSuite and Seqsuite are currently available from <a href="http://bioware.soton.ac.uk">http://bioware.soton.ac.uk</a> as three packages: </p>
<ol>
<li><strong>SLiMSuite</strong> contains software for Short Linear Motif (SLiM) analysis.</li>
<li><strong>SeqSuite</strong> contains all of the SLiMSuite programs plus some additional sequence analysis programs.</li>
<li><strong>RJESuite</strong> contains SLiMSuite, SeqSuite and a bunch of other miscellaneous utilities and bits and bobs.</li>
</ol>
<p>In future, it is envisaged that a single Git repository will contain all the relevant code and documentation.</p>
<p>All three packages have the same basic installation, directory structure and setup requirements. For basic functionality, no other setup should be necessary beyond downloading and unzipping the package in the desired directory if Python is installed on your system. Some programs will need to use external components or accessory applications, which may need additional installation.</p>
<p>If you do not have Python, you can download it free from www.python.org at <a href="http://www.python.org/download/">http://www.python.org/download/</a>. The modules are written in Python 2.x and most have been tested with 2.7. The Python website has good information about how to download and install Python but if you have any problems, please get in touch and I will help if I can.</p>
<p>All the required files should have been provided in the download zip file. The Python Modules are open source and may be changed if desired, although please give me credit for any useful bits you pillage. I cannot accept any responsibility if you make changes and the program stops working, however! If you want some help understanding the way the modules and classes are set up so you can edit them, just contact me.</p>
<h3 id="directory-structure">Directory Structure</h3>
<p>Once unzipped, the download will unpack a top level <code>seqsuite/</code> or <code>slimsuite/</code> directory with the following subdirectories:</p>
<p><code>data/</code> contains example data for testing programs. (Currently under development.)</p>
<p><code>docs/</code> contains documentation.</p>
<p><code>extras/</code> contains accessory programs that are not part of the main program suite.</p>
<p><code>legacy/</code> contains superseded programs that are no longer supported. (Currently under development.)</p>
<p><code>libraries/</code> contains all the python libraries used by the main tools (and extras), some of which have standalone functionality.</p>
<p><code>settings/</code> contains INI files that set default options.</p>
<p><code>tools/</code> contains the main program suite.</p>
<blockquote>
<p><strong>NOTE:</strong> It is recommended that analyses are performed outside these directories for ease of reinstallation.</p>
</blockquote>
<h3 id="third-party-software">Third party software</h3>
<p>Many of the tools make use of third party software. Where possible, instructions will be provided for obtaining these programs but a quick Google is usually sufficient - wherever possible, third party software is free for academic use and (ideally) open source.</p>
<p>When third party software is used, SeqSuite will also need the path to the program, or suite of programs. This will be covered more in the <strong>Command-line Options</strong> section but <strong>BLAST</strong> and <strong>clustalw</strong> deserve a special mention as examples because many of the programs use these as default programs for certain functions.</p>
<p><strong>BLAST</strong> is actually a suite of programs and the path containing these executables should be provided using <code>blastpath=PATH/</code>, <em>e.g.</em>:</p>
<pre><code>blastpath=/usr/ncbi/bin/
</code></pre>
<p>For BLAST, do <em>not</em> give the full path to the program (<em>e.g.</em> <code>blastpath=/usr/ncbi/bin/blastp</code>). BLAST cannot be replaced easily by other programs. BLAST has now largely been superseded by BLAST+, which needs its own path parameter:</p>
<pre><code>blast+path=PATH
</code></pre>
<p>Some programs are still restricted to BLAST at the moment and other tools can be made to use BLAST with the <code>oldblast=T</code> switch.</p>
<p><strong>Clustalw</strong> is a useful standalone program that is used as a default for alignments and trees in the absence of newer (better) programs. For this, and other single executables, the full path to the program is given:</p>
<pre><code>clustalw=/usr/bioware/clustalw1.83/clustalw
</code></pre>
<p>In these situations, a different program with the same input and output can be substituted.</p>
<blockquote>
<p><strong>NOTE:</strong> Remember to set the relevant paths in an appropriate <code>*.ini</code> file in <code>settings/</code>. Where possible, error messages will identify issues with third party software but due to a lack of testing on a diversity of systems, this is not always possible. If a program crashes, please check the <code>*.log</code> file for signs that there may be a problem with the installation and/or path given for third party programs, such as BLAST.<br />
</p>
</blockquote>
<h3 id="upgrading">Upgrading</h3>
<p>At present, each upgrade is distributed as a separate package. You can check the current version by the date in the name of the distribution file (in <a href="http://xkcd.com/1179/">ISO 8601</a> standard, <code>YYYY-MM-DD</code> format). Plans are afoot to switch to a Git repository, which will make upgrades easier.</p>seqsuitehttp://www.blogger.com/profile/09978269566690653844noreply@blogger.com0tag:blogger.com,1999:blog-4465803113249328154.post-36224411271360829662013-07-29T16:59:00.000+01:002013-07-29T16:59:40.063+01:00Getting Help<p>Much of the information here is also contained in the documentation of the Python modules themselves. A full list of command-line parameters can be printed to screen using the <code>help</code> option, with short descriptions for each one:</p>
<pre><code>python program.py help
python program.py -help
python program.py -h</code></pre>
<p>Details of command-line options specific to each program can also be found in the distributed <code>readme.txt</code> and <code>readme.html</code> files.</p>
<p>If you are stuck, or something is unclear, please e-mail me (<code>seqsuite@gmail.com</code>) with your question. If the question concerns an error message, please send me the message and the log file too.</p>seqsuitehttp://www.blogger.com/profile/09978269566690653844noreply@blogger.com0tag:blogger.com,1999:blog-4465803113249328154.post-45417417581207587612013-07-17T17:23:00.001+01:002013-07-17T17:23:30.177+01:00SLiMScape: a protein short linear motif analysis plugin for Cytoscape. <p>New paper published!</p>
<p>O’Brien KT, Haslam NJ & Shields DC (2013). <a href="http://www.biomedcentral.com/1471-2105/14/224">SLiMScape: a protein short linear motif analysis plugin for Cytoscape</a>. <em>BMC Bioinformatics</em> <strong>14(1):</strong>224. [Epub ahead of print]</p>
<blockquote>
<p><strong>BACKGROUND:</strong> Computational protein short linear motif discovery can use protein interaction information to search for motifs among proteins which share a common interactor. Cytoscape provides a visual interface for protein networks but there is no streamlined way to rapidly visualize motifs in a network of proteins, or to integrate computational discovery with such visualizations.</p>
<p><strong>RESULTS:</strong> We present SLiMScape, a Cytoscape plugin, which enables both de novo motif discovery and searches for instances of known motifs. Data is presented using Cytoscape’s visualization features thus providing an intuitive interface for interpreting results. The distribution of discovered or user defined motifs may be selectively displayed and the distribution of protein domains may be viewed simultaneously. To facilitate this SLiMScape automatically retrieves domains for each protein.</p>
<p><strong>CONCLUSION:</strong> SLiMScape provides a platform for performing short linear motif analyses of protein interaction networks by integrating motif discovery and search tools in a network visualization environment. This significantly aids in the discovery of novel short linear motifs and in visualizing the distribution of known motifs.</p>
</blockquote>
<p>PMID: <a href="http://www.ncbi.nlm.nih.gov/pubmed/23855714?dopt=Abstract">23855714</a></p>seqsuitehttp://www.blogger.com/profile/09978269566690653844noreply@blogger.com0tag:blogger.com,1999:blog-4465803113249328154.post-13753524747428075632013-07-13T22:58:00.001+01:002013-07-13T22:58:54.737+01:00SLiMSuite at the OMICS Group 3rd International Conference on Proteomics & Bioinformatics<p>If anyone is attending the OMICS Group <a href="http://www.omicsgroup.com/conferences/proteomics-bioinformatics-2013/">3rd International Conference on Proteomics & Bioinformatics</a> this week then be sure to say hello. I am <a href="http://www.omicsgroup.com/conferences/proteomics-bioinformatics-2013/scientific-programme.php?day=3&sid=14&date=2013-07-17">speaking on the last day</a> in the “Computational Biology” track. (Never the best time to talk at a conference, as there is limited time for follow-up, but at least it is before lunch!)</p>
<h2 id="slimpickings:miningstructuralandsequencedataforthepredictionofshortlinearproteininteractionmotifs">SLiM Pickings: mining structural and sequence data for the prediction of short linear protein interaction motifs</h2>
<blockquote>
<p>Short Linear Motifs (SLiMs) are short functional protein sequences that act as ligands to mediate transient protein-protein interactions (PPI) in critical biological pathways and signaling networks. SLiMs are short (3-15aa), generally tolerate considerable sequence variation and typically have fewer than five residues critical for function. These features result in a degree of evolutionary plasticity not seen in domains and SLiMs often add new functions to proteins by convergent evolution. They also present a challenge for computational identification, making it difficult to differentiate biological signal from stochastic patterns. Despite this, discovering new SLiMs is of great interest due to their potential as therapeutic targets. </p>
<p>In recent years, we have made great progress in SLiM discovery, particularly through development of the SLiMSuite package of bioinformatics tools. SLiMs generally occur in structurally disordered regions of proteins and exhibit evolutionary conservation relative to other disordered residues. SLiMFinder uses this knowledge and exploits patterns of convergent evolution to predict novel, over-represented motifs within a statistical framework with high specificity. Applying this approach to a comprehensive set of human PPI data has highlighted interactome complexity and quality as the next challenges for SLiM prediction. Our latest development, QSLiMFinder (“Query” SLiMFinder) tackles some of these issues by incorporating specific interaction data to restrict the motif search space, which improves both the sensitivity and biological relevance of predictions. We are now using QSLiMFinder to combine structurally defined domain-motif interactions with large-scale PPI data to perform large-scale <i>de novo</i> SLiM prediction.</p>
</blockquote>seqsuitehttp://www.blogger.com/profile/09978269566690653844noreply@blogger.com0tag:blogger.com,1999:blog-4465803113249328154.post-60465592057785558202013-07-10T16:41:00.000+01:002013-08-01T18:42:41.259+01:00Documentation<p>SLiMSuite and SeqSuite have grown into rather unwieldy beasts since their origins as individual programs and the documentation has struggled to keep up. In particular, the original plan of a single PDF manual per program is getting creaky. Because of the shared reliance on common modules, multiple programs make use of the same sets of options for alignments and conservation scoring <i>etc.</i> and propagating tweaks and modifications through all the manuals can be a bit head-wrecking.</p>
<p>As a result of all of this, the documentation is currently undergoing a bit of a review and rethink. I am still keen to keep the PDF manuals (as I think they are useful) but will be working through an intermediate phase of online Markdown/HTML documentation of some kind. The current plan is to trickle out draft copies via the blog and then probably release a Git repository once it is sufficiently populated.</p>
<p>In the meantime, I would be interested to hear any thoughts regarding favoured documentation styles etc. (e.g. HTML vs PDF, large files vs small chunks) as well as bits that are particularly unclear or in need of attention.</p>seqsuitehttp://www.blogger.com/profile/09978269566690653844noreply@blogger.com0tag:blogger.com,1999:blog-4465803113249328154.post-56855530377103729912013-07-08T13:50:00.003+01:002013-07-10T10:29:06.197+01:00New Software Release<p>New releases of <a href="http://www.southampton.ac.uk/~re1u06/software/packages/seqsuite/">SeqSuite</a>, <a href="http://www.southampton.ac.uk/~re1u06/software/packages/slimsuite/">SLiMSuite </a>and <a href="http://www.southampton.ac.uk/~re1u06/software/packages/rjesuite/">RJESuite </a>are now available.</p>
<p>The biggest change since the last release is the renaming of <b>SLiMSearch</b> to <b>SLiMProb</b>. This is to avoid confusion between the old <b>SLiMSearch 1.x</b> (now <b>SLiMProb</b>) and the newer SLiMSearch 2.x webserver, which has a different range of functions.</p>
<br>
<h3>Updates since last release:</h3>
<p><b>• cpppred:</b> <i>Created.</i>
</p>
<p><b>• gopher:</b> <i>Updated from Version 3.1.</i>
<br>→ Version 3.2: Minor tweak to prevent unwanted directory generation for programs using existing GOPHER alignments.
<br>→ Version 3.3: Added rje_blast_V2 to use BLAST+. Run with legacy=T to stick with old NCBI BLAST. Started utilising rje_seqlist.
</p>
<p><b>• pepbindpred:</b> <i>Created.</i>
</p>
<p><b>• slimprob:</b> <i>Created.</i>
<br>→ Version 1.0: SLiMProb 1.0 based on SLiMSearch 1.7. Altered output files to be *.csv and *.occ.csv.
</p>
<p><b>• file_monster:</b> <i>Updated from Version 2.0.</i>
<br>→ Version 2.1: Added dirsum function.
</p>
<p><b>• rje:</b> <i>Updated from Version 4.5.</i>
<br>→ Version 4.6: Added dev and warn options.
</p>
<p><b>• rje_blast_V2:</b> <i>Created.</i>
<br>→ Version 2.0: Initial Compilation from rje_blast_V1 V1.14.
<br>→ Version 2.1: Tweaking code to work with GOPHER 3.x - removing self.info etc. Added blastObj() method.
</p>
<p><b>• rje_db:</b> <i>Updated from Version 0.4.</i>
<br>→ Version 0.5: Initial coding of index mode. (Not yet fully functional.)
<br>→ Version 1.0: Working, so upgraded to version 1.0!
</p>
<p><b>• rje_obj:</b> <i>Updated from Version 0.0.</i>
<br>→ Version 1.0: Fully working version, so upgraded to 1.0. Added dev and warn options.
</p>
<p><b>• rje_seq:</b> <i>Updated from Version 3.15.</i>
<br>→ Version 3.16: Added BLAST+ path and seqFromBlastDBCmd()
</p>
<p><b>• rje_slimcalc:</b> <i>Updated from Version 0.5.</i>
<br>→ Version 0.6: Minor tweak to avoid unwanted GOPHER directory generation.
<br>→ Version 0.7: Added RLC to "All" conscore running.
</p>
<p><b>• rje_slimcore:</b> <i>Updated from Version 1.9.</i>
<br>→ Version 1.10: Bypass UPC generation for single sequences.
</p>
<p>Documentation is still in the process of development. BLAST+ implementation is ongoing - please get in touch if this is something you need.</p>seqsuitehttp://www.blogger.com/profile/09978269566690653844noreply@blogger.com0tag:blogger.com,1999:blog-4465803113249328154.post-79002095645122038182013-04-29T16:19:00.001+01:002013-05-01T16:11:45.595+01:00Second QSLiMFinder poster now on F1000 Posters<a href="http://f1000.com/posters/browse/summary/1093024" ><img border="0" src="http://cdn.f1000.com/posters/thumbnails/253461652" align="right" /></a>The second <a href="http://seqsuite.blogspot.co.uk/2013/04/latest-qslimfinder-poster-now-on-f1000.html">QSLiMFinder poster</a> from the recent Cold Spring Harbor Laboratory "<a href="http://seqsuite.blogspot.co.uk/2013/03/qslimfinder-at-cold-spring-habor.html">Systems Biology: Networks</a>" meeting is now available at F1000 Posters:
<blockquote><li>Edwards RJ & Palopoli N. <a href="http://f1000.com/posters/browse/summary/1093024">Computational prediction of short linear motifs mediating host-pathogen protein-protein interactions</a>.</li></blockquote>(I'm not sure why the <a href="http://seqsuite.blogspot.co.uk/2013/04/latest-qslimfinder-poster-now-on-f1000.html">last post</a> about the other poster disappeared for a few days, but it's back now!)seqsuitehttp://www.blogger.com/profile/09978269566690653844noreply@blogger.com0