
99 and 0.12, making the higher score related to mutation and the lower to SNP. We used the single-protein SIFT Sequence tool with the default value for median sequence conservation (3.0). The PSI-BLAST search was run against the UniRef90 database, and sequences 90% or more similar to the query sequence were removed from the alignment. Binary classification was mostly done by annotating an AAS with SIFT score(org) <0.05 as a mutation and an AAS with SIFT score(org) >0.05 as an SNP.

PolyPhen-2 bases its predictions of the damaging effects of missense mutations on eight sequence-based and three structure-based features, which were selected using machine learning. The functional effect of an amino acid substitution is predicted from the calculated naïve Bayes probabilistic score.

A mutation is classified as probably damaging when the score is above 0.85, possibly damaging when the score is above 0.15, and benign otherwise. For the binary classification, we adopted a cutoff of 0.5 for the probabilistic score, so substitutions scoring above this cutoff were considered mutations and those below it SNPs. We used default values for the query options and the HumDiv-trained version of PolyPhen-2, as this is recommended for the evaluation of mutations involved in complex phenotypes.

2.3. ISM Algorithm

ISM uses FT as a mathematical tool to highlight the periodic structural patterns in protein sequences and assesses the effect of each AAS on the sequence and, consequently, on the corresponding biological function of the protein. The procedure, schematically presented in Figure 1, comprises two steps.

The first step transforms the amino acid sequence into a sequence of numbers by assigning each amino acid its EIIP value (Table 2). EIIP values approximate the energy of valence electrons and were calculated for each amino acid using the general model pseudopotential as follows [42]:

$$W = 0.25\,\frac{Z^{*}\sin\left(1.04\,\pi Z^{*}\right)}{2\pi}. \tag{1}$$

$Z^{*}$, which represents the average quasivalence number, is calculated as

$$Z^{*} = \frac{1}{N}\sum_{i=1}^{m} n_i Z_i, \tag{2}$$

where $Z_i$ is the valence number of the $i$th atomic component, $n_i$ is the number of atoms of the $i$th component, $m$ is the number of atomic components in the molecule, and $N$ is the total number of atoms. It was previously shown that the periodicity of the EIIP distribution along the protein sequence correlates with the biological activity of a protein, especially with its specific interactions with ligands and other proteins (reviewed in [16]).
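Equations (1) and (2) can be evaluated directly. The following is a minimal sketch, not the authors' implementation. The function names, the `VALENCE` table of valence numbers, and the example atomic composition are illustrative assumptions that do not come from the text.

```python
import math

# Assumed valence numbers for common elements (illustrative, not from the text).
VALENCE = {"H": 1, "C": 4, "N": 5, "O": 6, "S": 6}


def quasivalence(composition, valence=VALENCE):
    """Average quasivalence number Z* = (1/N) * sum_i n_i * Z_i, as in Eq. (2).

    `composition` maps an element symbol to its atom count n_i.
    """
    total_atoms = sum(composition.values())  # N
    weighted = sum(count * valence[element]
                   for element, count in composition.items())
    return weighted / total_atoms


def eiip(z_star):
    """W = 0.25 * Z* * sin(1.04 * pi * Z*) / (2 * pi), as in Eq. (1)."""
    return 0.25 * z_star * math.sin(1.04 * math.pi * z_star) / (2 * math.pi)


# Hypothetical composition: 4 carbon atoms and 9 hydrogen atoms.
example = {"C": 4, "H": 9}
z_star = quasivalence(example)
w = eiip(z_star)
print(f"Z* = {z_star:.4f}, W = {w:.4f}")
```

Which atoms are counted for a given amino acid (for example the whole molecule or only part of it) is not stated in the text, so the composition passed to `quasivalence` is left to the caller.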

Figure 1: Scheme for the ISM procedure.

Table 2: Abbreviations and EIIP values for amino acids.

The second step is the conversion of this sequence of numbers using FT, which is defined as

$$X(n) = \sum_{m=1}^{N} x(m)\, e^{-i 2\pi n (m-1)/N}, \qquad n = 1, 2, \ldots, \frac{N}{2}, \tag{3}$$

where $x(m)$ is the $m$th member of a given numerical series, $N$ is the total number of points in the series, and $X(n)$ are the discrete FT coefficients.
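The two steps can be sketched in a few lines of Python: encode the sequence numerically, then evaluate the sum in Equation (3) for n = 1, ..., N/2. This is a sketch under stated assumptions rather than the authors' code. The names `encode` and `dft_coefficients` are hypothetical, and the three-entry `PLACEHOLDER_EIIP` table is only a stand-in for Table 2, whose values are not reproduced here.

```python
import cmath
import math

# Placeholder mapping standing in for Table 2 (assumption: replace with the
# full 20-residue EIIP table before real use).
PLACEHOLDER_EIIP = {"L": 0.0000, "G": 0.0050, "D": 0.1263}


def encode(sequence, eiip_table):
    """Step 1: replace each residue letter with its EIIP value."""
    return [eiip_table[residue] for residue in sequence]


def dft_coefficients(x):
    """Step 2: X(n) = sum_{m=1..N} x(m) * exp(-i*2*pi*n*(m-1)/N), n = 1..N/2."""
    n_points = len(x)
    coefficients = []
    for n in range(1, n_points // 2 + 1):
        total = sum(
            x[m - 1] * cmath.exp(-2j * math.pi * n * (m - 1) / n_points)
            for m in range(1, n_points + 1)
        )
        coefficients.append(total)
    return coefficients


# Encoding a short hypothetical peptide with the placeholder table.
numeric = encode("LGDLGD", PLACEHOLDER_EIIP)

# Sanity check on a synthetic series: a cosine that completes 4 cycles over
# 16 points should produce its largest coefficient magnitude at n = 4.
signal = [math.cos(2 * math.pi * 4 * k / 16) for k in range(16)]
magnitudes = [abs(c) for c in dft_coefficients(signal)]
peak_n = magnitudes.index(max(magnitudes)) + 1  # list index 0 holds n = 1
print("peak at n =", peak_n)
```

The direct double loop costs O(N^2) operations, which is fine for single proteins. An FFT routine would return the same coefficients faster, provided its indexing is matched to the n = 1, ..., N/2 range used here.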
