Disk-covering method

Disk-covering method

A disk-covering method is a divide-and-conquer meta-technique for large-scale phylogenetic analysis which has been shown to improve the performance of both heuristics for NP-hard optimization problems and polynomial-time distance-based methods. Disk-covering methods are a meta-technique in that they have flexibility in several areas, depending on the performance metrics that are being optimized for the base method. Such metrics can be efficiency, accuracy, or sequence length requirements for statistical performance. There have been several disk-covering methods developed, which have been applied to different "base methods". Disk-covering methods have been used with distance-based methods (like neighbor joining) to produce "fast-converging methods"[1] [2][3], which are methods that will reconstruct the true tree from sequences that have at most a polynomial number of sites.

A disk-covering method has four steps:

  1. Decomposition: Compute a decomposition of the dataset into overlapping subsets.
  2. Solution: Construct trees on the subsets using a base method.
  3. Merge: Use a supertree method to merge the trees on the subsets into a tree on the full dataset.
  4. Refinement: If the tree obtained in the merge is not fully resolved, then resolve it further into a binary tree so that it optimizes some desired objective criterion.


The major use of any disk-covering method is that of the "Rec-I-DCM3" disk-covering method [4], which has been used to speed-up maximum likelihood and maximum parsimony analyses, and are available through the NSF-funded CIPRES project (www.phylo.org). However, disk-covering methods have also been used for estimating evolutionary trees from gene order data [5]


References

  1. ^ D. Huson, S. Nettles and T. Warnow. (1999). Disk-covering, a fast-converging method for phylogenetic tree reconstruction. Journal of Computational Biology, 6:369-386.
  2. ^ L. Nakhleh, U. Roshan, K. St. John, J. Sun and T. Warnow. (2001). Designing fast converging phylogenetic methods. In Proc. 9th Int'l Conf. on Intelligent Systems for Molecular Bioology (ISMB '01), volume 17 of Bioinformatics, pp S190-S198. Oxford U. Press.
  3. ^ T. Warnow, B. Moret and K. St. John. (2001). Absolute convergence: True trees from short sequences. In Proc. 12th Ann. ACM-SIAM Symp. Discrete Algorithms (SODA '01), pp. 186-195. SIAM Press, 2001.
  4. ^ U. Roshan, B.M.E. Moret, T. Warnow and T.L. Williams. (2004). Rec-I-DCM3: a fast algorithmic technique for reconstructing large phylogenetic trees. In Proceedings of the IEEE Computational Systems Bioinformatics conference (CSB), Stanford, California, USA.
  5. ^ * J. Tang and B. Moret. (2003). Scaling up accurate phylogenetic reconstruction from gene-order data. In Proc. 11th Int'l Conf. on Intelligent Systems for Molecular Biology ISMB '03, volume 19 (Suppl. 1) of Bioinformatics, pp i305 - i312.
  • T. Warnow. 2005. Large-scale phylogenetic reconstruction. Book chapter, in S. Aluru (editor), Handbook of Computational Biology, Chapman & Hall, CRC Computer and Information Science Series, December 2005.