Applied Computational Mathematics Division


Up

A Linear Programming Approach to Multiple Sequence Alignment

Fern Hunt
NIST/MCSD/MMG

Tuesday, January 29, 2002 15:00-16:00,
Room 145, NIST North (820)
Gaithersburg
Tuesday, January 29, 2002 13:00-14:00,
Room 4550
Boulder

Abstract: The advent of large scale sequencing of DNA has triggered a massive accumulation of DNA sequence data and data about proteins- the products of the genes encoded by DNA. Methods for applying this information to fields such as law enforcement, biotechnology, medicine or to gain crucial understanding of the biological significance and functionality of genes and proteins depend on a technique known as sequence alignment. An alignment is an array that displays the degree of relatedness of a set of sequences (either DNA or protein) through matching and equivalent substitutions and the insertion of gap characters that are used to maximize the degree of similarity. The quality of the alignment is evaluated by assigning a total "score" or "cost" and the goal is to attain the "best" alignment. The most widely used algorithms for constructing alignments are based on dynamic programming. Unfortunately a direct implementation of these methods is not suitable for aligning a large number of long sequences. In our talk we will discuss an alternative formulation of the alignment problem based on Markov decision theory. Here one seeks to minimize an average or expected cost subject to data-derived constraints. In this setting the problem is equivalent to a linear programming problem which can be solved efficiently.
Contact: A. J. Kearsley

Note: Visitors from outside NIST must contact Robin Bickel; (301) 975-3668; at least 24 hours in advance.