Bioinformatics. Group of authors

sequence using all possible query words, it is possible that more than one HSP may be found for any given sequence pair.

      As one might imagine, assessing the putative biological significance of any given BLAST hit based simply on raw scores is difficult, since the scores depend on the composition of the query and target sequences, the length of the sequences, the scoring matrix used to compute the raw scores, and numerous other factors. In one of the most important papers on the theory of local sequence alignment statistics, Karlin and Altschul (1990) presented a formula that directly addresses this problem. The formula, which has come to be known as the Karlin–Altschul equation, uses search-specific parameters to calculate an expectation value (E). This value represents the number of HSPs with a score of at least S that would be expected purely by chance. The equation and the parameters used to calculate E are as follows:

$$E = k\,m\,N\,e^{-\lambda S}$$

      where k is a minor constant; m is the number of letters in the query; N is the total number of letters in the target database; λ is a constant used to normalize the raw score of the high-scoring segment pair, whose value depends on the scoring matrix used; and S is the score of the high-scoring segment pair.
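      To make the behavior of the Karlin–Altschul equation concrete, the following minimal sketch evaluates E for a few raw scores. The function name `expect_value` and all numeric values (k, λ, the query length, and the database size) are illustrative assumptions chosen for this example, not figures taken from the text or from any particular BLAST run.

```python
import math


def expect_value(k, m, n, lam, score):
    """Karlin-Altschul expectation value: E = k * m * N * exp(-lambda * S)."""
    return k * m * n * math.exp(-lam * score)


# Illustrative placeholder parameters (assumptions, not values from the text).
K = 0.041             # minor constant k
LAMBDA = 0.267        # normalizing constant; depends on the scoring matrix
QUERY_LEN = 250       # m: number of letters in the query
DB_LEN = 50_000_000   # N: total number of letters in the target database

# E shrinks exponentially as the raw score S grows.
for s in (50, 100, 150):
    e = expect_value(K, QUERY_LEN, DB_LEN, LAMBDA, s)
    print(f"S = {s:3d}  E = {e:.3g}")
```

      Under these assumptions, E falls off exponentially as S increases, while doubling either the query length or the database size doubles E. This is why the same raw score can be less convincing when a larger database is searched.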

      Performing a BLAST Search

Figure: The National Center for Biotechnology Information (NCBI) BLAST landing page.

Figure: The upper portion of the BLASTP query page. The first section in the window is used to specify the sequence of interest, whether only a portion of that sequence should be used in performing the search, which database should be searched, and which protein-based BLAST algorithm should be used to execute the query.

Figure: The lower portion of the BLASTP query page, showing algorithm parameters that the user can adjust to fine-tune the search.