.TH RNAFORESTER 1 "June 2004" "RNAforester 1.4"
.SH NAME
RNAforester \- compare RNA secondary structures via forest alignment
.SH SYNOPSIS
\fBRNAforester\fP [options]
.br
Options are:
.br
--help                    shows this help info
.br
--version                 shows version information
.br
-d                        calculate distance instead of similarity
.br
-r                        calculate relative score
.br
-l                        local similarity
.br
-so=int                   local suboptimal alignments within int%
.br
-s                        small-in-large similarity
.br
-m                        multiple alignment mode
.br
-mt=double                clustering threshold
.br
-mc=double                clustering cutoff
.br
-p                        predict structures from sequences
.br
-pmin=num                 minimum basepair frequency for prediction
.br
-pm=int                   basepair(bond) match score
.br
-pd=int                   basepair bond indel score
.br
-bm=int                   base match score
.br
-br=int                   base mismatch score
.br
-bd=int                   base indel score
.br
--RIBOSUM                 RIBOSUM85-60 scoring matrix
.br
-cmin=double              minimum basepair frequency for consensus structure
.br
-2d                       generate alignment 2D plots in postscript format
.br
--2d_hidebasenum          hide base numbers in 2D plot
.br
--2d_basenuminterval=n    show every n-th base number
.br
--2d_grey                 use only grey colors in 2D plots
.br
--2d_scale=double         scale factor for the 2d plots
.br
--score                   compute only scores, no alignment
.br
--fasta                   generate fasta output of alignments
.br
-f=file                   read input from file
.br
--noscale                 suppress output of scale
.SH DESCRIPTION
RNAforester calculates RNA secondary structure alignments, both pairwise and multiple.
The comparison is based on the tree alignment model [1,2].
.SS Model
The models for pairwise and multiple alignment differ slightly. The pairwise model
is based on the following edit operations on sequence and structure:
.br
\fIbasepair replacement/match:\fP A basepair, INCLUDING the paired bases, is substituted by another basepair. The scoring contribution is p_m.
.br
\fIbasepair bond deletion:\fP A basepair bond WITHOUT the paired bases is removed.
The scoring contribution is p_d.
.br
\fISequence edit operations:\fP Base match, base mismatch and base deletion give the scoring contributions b_m, b_r and b_d, respectively.
.br
In the multiple alignment mode (-m), parameter p_m is the score for matching a basepair bond WITHOUT the paired bases.
Thus, the score for a whole basepair replacement is p_m+2*b_m. For more information about multiple alignment refer to
the description of parameter -m.
.SS Input
RNAforester reads RNA secondary structures from stdin by default.
It accepts sequences and structures in Fasta format, where matching brackets symbolize basepairs and unpaired bases are represented by a dot. A line containing the primary sequence
can precede the RNA secondary structure(s). An example is given below:
.br
 > test
 accaguuacccauucgggaaccggu    primary structure
 .((..(((...)))..((..)))).    secondary structure
.br
All characters after a blank are ignored and all '-' characters are removed.
The program continues to read new
structures until a line consisting of the single character @ or an end of file
is encountered. Input lines starting with > can contain a structure name. Option -f=filename lets RNAforester read the input from a file. Result files are then written to files prefixed by filename.
.SS Output
Alignments in ASCII format are written to stdout. Option -2d generates postscript
drawings of structure alignments.
.SH OPTIONS
.TP
\fB-d\fP
Calculate distance instead of similarity. In contrast to similarity, scoring contributions are minimized.
The scoring parameters must not be negative, and equal structures achieve a distance of zero. This parameter cannot be used in conjunction with multiple alignment, where relative similarity is computed.
.TP
\fB-r\fP
Calculate relative score, defined by sr(a,b)=2*s(a,b)/(s(a,a)+s(b,b)).
Relative scores are bounded above by 1, which is the score for equal structures.
.TP
\fB-l\fP
Calculate locally similar structures. The term local refers to subwords of
the input sequences and structures.
If parameter \fI-so\fP is used, suboptimal solutions are calculated. These are not suboptimal solutions of the
same local structures, but different substructures which do not include each other.
.TP
\fB-so=int\fP
Calculates suboptimal local alignments within int% of the optimum. This option requires
option \fI-l\fP.
.TP
\fB-s\fP
Calculates small-in-large similarity, i.e. the best alignment of the first structure against all substructures of the second structure is computed.
.TP
\fB-m, -mc=double, -mt=double, -cmin=double\fP
Multiple alignment mode. Multiple alignments of structures are calculated in a progressive
fashion. First, an all-against-all comparison of structures is performed (relative scores), and afterwards
structural alignments are joined along a guide tree (the guide tree is constructed dynamically).
If the best score which a single structure or structure alignment can achieve by aligning to all others
is below the cutoff value \fI-mc\fP, it is not joined and is put into the result list. Thus, a multiple structure alignment can produce a list of alignments. The main purpose of parameter \fI-mc\fP is to
identify alternative and wrong structures produced by structure predictions. The default value for
\fI-mc\fP is zero, as this separates similar from dissimilar in a similarity scoring model.
In each step of the multiple alignment calculation, the best scoring pair is joined and then the guide tree is
adjusted. To speed up computation, parameter \fI-mt\fP defines a threshold; if it is exceeded, multiple pairs are joined before the guide tree is adjusted.
Besides the sequence and structure alignment, a consensus sequence and structure are computed.
The minimum frequency for a basepair in the consensus structure is controlled by parameter \fI-cmin\fP.
The console output could look like this (just a part):
.br
.nf
 [...]
 **************** ** ****************************
 **************** ** ****************************
 ggggcuauagcucagcugggggagcuauagcucagcugggagcgggga
 .((((....))))....((.(.(((((..((((........))))...
 ************************************************
 **************** ** ****************************
 **************** ** ** *************************
 [...]
.fi
.br
The number of * above the primary sequence shows the frequency of the base.
Each * stands for 10% frequency. Accordingly, the number of * below the
secondary structure shows the frequency of the occurrence of a paired or unpaired
base.
The guide tree is written to a file "cluster.dot" in \fIdot\fP format. If a filename was specified by parameter \fI-f\fP, the filename is "filename_cluster.dot". Refer to \fIhttp://www.research.att.com/sw/tools/graphviz\fP for more details about the dot format and tools.
.TP
\fI-p, -pmin=double\fP
Structures (in fact, a consensus of compatible structures) are predicted from the partition function, which is calculated using the Vienna RNA library [3]. Structure lines in the input are ignored.
\fI-pmin\fP is the minimum frequency that a basepair must exceed to be considered for the
prediction of structures.
.TP
\fI-pm=int,-pd=int,-bm=int,-br=int,-bd=int\fP
Scoring parameters.
Refer to Section DESCRIPTION.
.TP
\fI--RIBOSUM\fP
Uses the base and basepair substitution matrix RIBOSUM85-60 as proposed in [4].
Requires the pairwise alignment model.
.TP
\fI-2d\fP
RNAforester provides different types of visualizations for pairwise and multiple alignment.
.sp
\fBpairwise alignment\fP
.br
Since bases paired in a structure S1 can be aligned to bases unpaired in a structure S2, the presentation of a common secondary structure leaves some choice. For an alignment of those structures, an RNA secondary structure "S2-at-S1" is drawn that highlights the differences as deviations of S2 from S1, or vice versa, "S1-at-S2". Both are alternative visualizations of the same alignment. Bases printed in black show structure elements that occur in both structures with the same sequence. Sequence variations are displayed in red letters. Bases or base pairs that can only be found in S1 are printed in blue, while bases that only occur in S2 are printed in green.
The drawings are written to files "x_n.ps" and "y_n.ps", where n is the number of the alignment. n enumerates the suboptimal solutions if option \fI-so\fP is used.
The regions of local similarity are highlighted in the original structures in the drawings "x_str.ps" and "y_str.ps".
.sp
\fBmultiple alignment\fP
.br
Each cluster of the result list of a multiple alignment is visualized in two alternative drawings, written to the files "filename_cons_n.ps" and "filename_n.ps"
if option \fI-f\fP is used. In both plots, the consensus structure is shown. The lighter a basepair bond is drawn, the less frequently it occurs in the structures. Bases or basepair bonds that have a frequency of one hundred percent are drawn in red. In "filename_cons_n.ps", the most frequent base at each residue is printed,
with the base frequency indicated by grey-scale.
In "filename_n.ps", the frequencies of the bases a,c,g,u are proportional to the radius of circles
that are arranged clockwise on the corners of a square, starting at the upper left corner. Additionally, these circles are colored
red, green, blue, magenta for the bases a,c,g,u, respectively. The frequency of a gap is proportional to a black circle growing at the center of the square.
Parameters \fI--2d_hidebasenum,--2d_basenuminterval=n,--2d_grey,--2d_scale=double\fP affect the drawings of alignments and consensus structures as implied by their names.
.TP
\fI--score\fP
Only the optimal score of an alignment is printed. This option is useful when RNAforester is called by another program that only needs a similarity or distance value.
.TP
\fI--fasta\fP
Alignments are printed in Fasta format.
.SH REFERENCES
[1] Jiang T, Wang J T L and Zhang K, (1995)
Alignment of Trees - An Alternative to Tree Edit,
Theoretical Computer Science 143(1), 137-148
.PP
[2] Hoechsmann M, Toeller T, Giegerich R and Kurtz S, (2003)
Local Similarity of RNA Secondary Structures,
Proc. of the IEEE Bioinformatics Conference (CSB 2003), 159-168
.PP
[3] Hofacker I L, Fontana W, Stadler P F, Bonhoeffer L S, Tacker M and Schuster P, (1994)
Fast Folding and Comparison of RNA Secondary Structures,
Monatsh. Chem. 125, 167-188
.PP
[4] Klein R J and Eddy S R, (2003)
RSEARCH: finding homologs of single structured RNA sequences,
BMC Bioinformatics 4(1), 44
.SH VERSION
This man page documents version 1.4 of RNAforester.
.SH AUTHORS
Matthias Hoechsmann
.SH BUGS
I hope you won't find any.
Comments should be sent to mhoechsm@techfak.uni-bielefeld.de
.br
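.SH EXAMPLES
The following is a minimal Python sketch, not part of RNAforester, that illustrates two conventions described in this page: the bracket notation of the input format (Section Input) and the relative score of option \fB-r\fP. The function names parse_pairs and relative_score are hypothetical, and the similarity values passed to relative_score are made-up numbers chosen for illustration, not RNAforester output.
.nf
```python
def parse_pairs(structure):
    """Return sorted 0-based (open, close) index pairs of a dot-bracket string."""
    stack = []   # positions of '(' that are still unmatched
    pairs = []
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % pos)
            pairs.append((stack.pop(), pos))
        elif symbol != ".":
            raise ValueError("unexpected symbol %r" % symbol)
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return sorted(pairs)


def relative_score(s_ab, s_aa, s_bb):
    """Relative score sr(a,b) = 2*s(a,b) / (s(a,a) + s(b,b))."""
    return 2.0 * s_ab / (s_aa + s_bb)


# Structure line taken from the example in Section Input.
structure = ".((..(((...)))..((..))))."
pairs = parse_pairs(structure)
print(len(pairs), "basepairs:", pairs)

# Hypothetical similarity scores, for illustration only.
print(relative_score(30, 40, 60))   # 0.6
print(relative_score(40, 40, 40))   # 1.0, the score for equal structures
```
.fi
The second call shows why equal structures reach the upper bound: if s(a,b), s(a,a) and s(b,b) coincide, the quotient is exactly 1.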