Geometry in Action

Biological Sequence Alignment

Sequence alignment is a problem of taking multiple sequences of characters, representing DNA sequences or proteins, and finding subsequences which match well with each other (according to some measure involving the lengths of the subsequences, the number and lengths of the gaps in them, etc). A common method of performing this is by finding small fragments that match (e.g. by exact string matching techniques) and then stringing the fragments together to form a longer alignment. By considering the positions of the fragments in each string to be coordinates in some Euclidean space, the problem becomes one amenable to orthogonal range searching techniques from computational geometry (and the decomposition of space formed by, for each point, determining the best previous point to chain it to, has strong resemblances to a rectilinear Voronoi diagram). Similar ideas also apply to a problem of comparing sequences of gene markers (where now the coordinates are positions of markers on a chromasome).

Part of Geometry in Action, a collection of applications of computational geometry.
David Eppstein, Theory Group, ICS, UC Irvine.

Semi-automatically filtered from a common source file.