PIIM uses the site-frequency spectrum to estimate the population genetic parameters θ=2Nu (where u is the per-site mutation rate) and R=Nr (where r is the exponential growth rate). Version 2 adds the ability to look at the patterns of pairs of sites to estimate ρ=2Nc (where c is the per-site recombination rate). Note that the ρ estimation routine assumes a constant-size population (i.e. R=0). For details on the method, see:
Questions or bug reports may be e-mailed to me (plfjohnson) at my institution (@berkeley.edu; soon to be @emory.edu).
The PIIM distribution includes this documentation and source code for the following programs:
tar xvzf piim.tar.gz
cd piim make release
tar xvzf piim.tar.gz
cd piim ldd piim
Unfortunately, there are as many assembly output formats as assembly programs. As a result, I chose to implement a simple XML format that includes all the information required for PIIM: reads, assembly, information about read pairs and quality scores. The PERL script ace2xml.pl will convert from the output format used by the phrap assembler to the XML format used by PIIM. If you need the gory details, the XML specification is formalized in "piim_r.dtd", and an example can be found in "example.xml".
Please note that ace2xml.pl must find "FF_Index.pm"; the easiest way to make sure this happens is to keep the two files in the same directory. Running './ace2xml.pl' by itself will produce a short description of its usage. Typical usage looks like:
./ace2xml.pl -i my_assembly.ace -q read_qualities.qfaThe quality file should be a single fasta-style file with qualities instead of bases.
Running './piim_r' by itself will produce a short list of the command-line parameters. The typical usage would be something like:
./piim_r -i example.xml -l lk_n50_t0.01_allsmooth -joint-est
Here, we take our input file (example.xml) and search for the maximum likelihood recombination rate using the likelihood in the specified table. Not that the maximum read depth in your input data should be no more than the maximum n in the likelihood table (otherwise those positions will be ignored and a warning issued). The output is the maximum likelihood parameter estimates separated by tabs (θ, ρ, R [growth rate], λ [gene-conversion parameter]):
0.009375 0.0031875 0 500Note that the λ parameter is fixed at 500 (never estimated) and the growth parameter can only estimated if using the -freqspec version of piim (in which case ρ is NOT estimated).
By default, piim uses the Nelder-Mead simplex algorithm to efficiently search parameter space. However, my implementation requires the caching all pairs of SNPs and uses the hard disk extensively. An alternative is to calculate the entire likelihood surface at fixed grid points in a single pass (option "-surface") and then interpolate to find the MLE. The former (default) is generally faster if you are not running anything else on the same computer that uses the hard disk extensively. However, the latter ("-surface") will be faster if you have a multi-core computer and are trying to run multiple instances simultaneously.
Details about command line parameters:
Efficiency & heuristics
Version 1.0 options
Copyright 2009 The Regents of the University of California. All rights reserved. IN NO EVENT SHALL REGENTS BE LIABLE TO ANY PARTY FOR DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, INCLUDING LOST PROFITS, ARISING OUT OF THE USE OF THIS SOFTWARE AND ITS DOCUMENTATION, EVEN IF REGENTS HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. REGENTS SPECIFICALLY DISCLAIMS ANY WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE SOFTWARE AND ACCOMPANYING DOCUMENTATION, IF ANY, PROVIDED HEREUNDER IS PROVIDED "AS IS". REGENTS HAS NO OBLIGATION TO PROVIDE MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR MODIFICATIONS. This software is distributed under the terms of Version 2 of the GNU General Public License as published by the Free Software Foundation. If you have not received a copy of Version 2 of the GNU General Public License with this program, you may download the GPL from the following url: http://www.gnu.org/copyleft/gpl.html.