...There is an alternative format: the frequencies can be anywhere in the list
of auxiliary data lines if they are preceded by an F in the first column:

5 114 F C W
F 0.25 0.30 0.20 0.25
C ...
...
W ...
Sequence1 ACACGGTGTCGTATCATGCTGCAGGATGCTAGACTGCGTCANATGTTCGTACTAACTGTG
...


G -- Global

If the global option is specified, there may also be an [optional] auxiliary
data line of form:

G N1
   or
G N1 N2

N1 is the number of branches to cross in rearrangements of the completed tree.
The value of N2 is the number of branches to cross in testing rearrangements
during the sequential addition phase of tree inference.

   N1 = 1:             local rearrangement (default without G option)
   1 < N1 < numsp-3:   regional rearrangements (crossing N1 branches)
   N1 >= numsp-3:      global rearrangements (default with G option)

   N2 <= N1            the default N2 is 1, local rearrangements.

The G option can also be used to force branch swapping on user trees, that is,
a combination of G and U options.

If the auxiliary line is supplied, it cannot be the last line of auxiliary
data. (It may be necessary to add the T option with an auxiliary data line of

T 2.0

if no other auxiliary data are used.)

Examples:

Do local rearrangements after each addition, and global after last addition:

5 114 G
Sequence1 ACACGGTGTCGTATCATGCTGCAGGATGCTAGACTGCGTCANATGTTCGTACTAACTGTG
...

Do local rearrangements after each addition, and regional (crossing 4
branches) after last addition:

5 114 G T
G 4
T 2.0
Sequence1 ACACGGTGTCGTATCATGCTGCAGGATGCTAGACTGCGTCANATGTTCGTACTAACTGTG
...

Do no rearrangements after each addition, and local after last addition:

5 114 G T
G 1 0
T 2.0
Sequence1 ACACGGTGTCGTATCATGCTGCAGGATGCTAGACTGCGTCANATGTTCGTACTAACTGTG
...

PHYLIP DNAML does not support the auxiliary data line or branch swapping on a
user tree.


I -- Not Interleaved

By default, fastDNAml 1.2 expects data lines for the various sequences in an
interleaved format (as did PHYLIP 3.3 DNAML).
The I option reverses the expected format (to non-interleaved data, in which
all the data lines for one sequence appear before the next sequence begins).
This is particularly useful for editing a GenBank or equivalent format into a
valid input file (note that numbers within the sequence data are ignored, so
it is not necessary to remove them).

If all the data for each sequence are on one line, then the interleaved and
non-interleaved formats are degenerate. (This is the way David Swofford's
PAUP program writes PHYLIP format output files.) The drawback is that many
programs do not handle long lines of text. This includes the vi and EDT text
editors, many electronic mail programs, and some versions of FTP for VAX/VMS
systems.

PHYLIP 3.3 DNAML expects interleaved data, and does not include an I option to
alter this. PHYLIP 3.4 DNAML accepts an I option, but the default format is
reversed.


J -- Jumble

Randomize the sequence addition order. Requires an auxiliary input line of
the form:

J random_number_seed

Example:

5 114 J
J 137
Sequence1 ACACGGTGTCGTATCATGCTGCAGGATGCTAGACTGCGTCANATGTTCGTACTAACTGTG
...

Note that fastDNAml explores a very small number of alternative tree
topologies relative to a typical parsimony program. There is a very real
chance that the search procedure will not find the tree topology with the
highest likelihood. Altering the order of taxon addition and comparing the
trees found is a fairly efficient method for testing convergence. Typically,
it would be nice to find the same best tree at least twice (if not three
times), as opposed to simply performing some fixed number of jumbles and
hoping that at least one of them will be the optimum.


K -- Keep multiple best trees (***** New in version 1.1 *****)

The program can keep a list of the best trees that it has found. When the
program is done, it prints a list of these, from best to worst, and prints
a Hasegawa and Kishino type test as to which trees are significantly worse
than the best tree found.
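
The kind of comparison behind a Hasegawa and Kishino type test can be sketched
in a few lines of Python. This is only an illustration of the general
technique, not fastDNAml's own code: the function name kh_test, the use of
per-site log-likelihood lists as input, and the 1.96 standard-deviation
cutoff are all assumptions made for the example.

```python
import math


def kh_test(site_lnl_best, site_lnl_other, z_crit=1.96):
    """Compare a candidate tree against the best tree, site by site.

    Both arguments are lists of per-site log-likelihoods (hypothetical
    inputs; fastDNAml computes these internally).  Returns a tuple
    (diff, sd, significantly_worse), where diff is lnL(other) - lnL(best).
    """
    if len(site_lnl_best) != len(site_lnl_other):
        raise ValueError("both trees need a log-likelihood for every site")
    n = len(site_lnl_best)
    if n < 2:
        raise ValueError("need at least two sites to estimate a variance")

    # Per-site differences in log-likelihood between the two trees.
    d = [other - best for best, other in zip(site_lnl_best, site_lnl_other)]
    diff = sum(d)
    mean = diff / n

    # Standard deviation of the summed difference, estimated from the
    # spread of the per-site differences.
    var = n / (n - 1) * sum((x - mean) ** 2 for x in d)
    sd = math.sqrt(var)

    # Assumed decision rule: the other tree is "significantly worse" when
    # its total log-likelihood is lower by more than z_crit deviations.
    return diff, sd, diff < -z_crit * sd


if __name__ == "__main__":
    best = [-10.0, -10.0, -10.0, -10.0]    # toy values, not real data
    other = [-11.0, -9.0, -11.0, -9.0]
    diff, sd, worse = kh_test(best, other)
    print("Diff Ln L:", diff, " S.D.:", sd, " Significantly worse?", worse)
```

A tree whose total log-likelihood is lower than that of the best tree by only
a small multiple of the estimated standard deviation is not flagged, which is
why several trees in the kept list may be statistically indistinguishable
from the best one.
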
When evaluating user-supplied trees, the program automatically keeps all
trees. In other situations, the program keeps only the best tree that it has
found. The K option, and associated auxiliary data line, can be used to
define an alternate number:

Example, to keep the 15 best trees found:

5 114 K
K 15
Sequence1 ACACGGTGTCGTATCATGCTGCAGGATGCTAGACTGCGTCANATGTTCGTACTAACTGTG
...

Example, to keep only the one best tree of possibly numerous user-supplied
trees:

5 114 K U
K 1
Sequence1 ACACGGTGTCGTATCATGCTGCAGGATGCTAGACTGCGTCANATGTTCGTACTAACTGTG
...


L -- User Lengths

Causes user trees to be read with branch lengths (and it is an error to omit
any of them). Without the L option, branch lengths in user trees are not
required, and are ignored if present.

Example:

5 114 U L
Sequence1 ACACGGTGTCGTATCATGCTGCAGGATGCTAGACTGCGTCANATGTTCGTACTAACTGTG
...

(The U is for user tree and the L for user lengths)


O -- Outgroup

Use the specified sequence number for the outgroup. Requires an auxiliary
data line of the form:

O outgroup_number

Example:

5 114 O
O 5
Sequence1 ACACGGTGTCGTATCATGCTGCAGGATGCTAGACTGCGTCANATGTTCGTACTAACTGTG
...

This option only affects the way the tree is drawn (and written to the
treefile).


Q -- Quickadd (***** Changed in version 1.1 *****)

The quickadd feature greatly decreases the time in initially placing a new
sequence in the growing tree (but does not change the time required to
subsequently test rearrangements). The overall time savings appears to be
about 30%, based on a number of test cases. Its downside, if any, is unknown.
This is now (starting in version 1.1) the default program behavior.

If the analysis is run with a global option of "G 0 0", so that no
rearrangements are permitted, the tree is built very approximately, but very
quickly. This may be of greatest interest if the question is, "Where does
this one new sequence fit into this known tree?"
The known tree is provided with the restart option (below).

PHYLIP DNAML does not include anything comparable to the quickadd feature.

The quickadd feature can be turned OFF by adding a Q to the first line of the
input file.


R -- Restart

The R option causes the program to read a user-supplied tree with fewer than
the full number of taxa as the starting point for sequential addition of the
remaining taxa. Thus, the sequence data must be followed by a valid (Newick
format) tree. (The phylip_tree/2, Prolog fact format, is now also supported.)

The restart option can also be used to increase the range of the search for
alternative (better) trees. For example, you can take a tree produced with
only "local" tree rearrangements, and increase the rearrangements to
"regional" or "global" by combining the appropriate global option with the
restart option. If the starting tree was written by fastDNAml, then the
extent of rearrangements is saved with the tree, and will be used as the
starting point for the additional search. If the tree was already globally
optimized, then no additional searching will be performed.

To support the R option, after each taxon is added to the growing tree, and
after each round of rearrangements, the program appends a checkpoint tree to a
file called checkpoint.PID, where PID is the process number of the running
fastDNAml program. The last line of this file needs to be appended to the
input file when the R option is used. (This should not be confused with the U
(user tree) option, which expects a number followed by that number of trees.
No additional taxa are added to user trees.)

The UNIX utility tail can be used to extract the last tree from the checkpoint
file, and the utility cat can be used to append it to the input. For example,
the following script can be used to add a starting tree and the R option to a
data file, and restart fastDNAml:

    #! /bin/sh
    if test $# -ne 1
    then
        echo "Usage: restart checkpoint_file" >&2
        exit 1
    fi
    read -r first_line            # first line of data file
    echo "$first_line R"          # add restart option
    cat -                         # rest of data file
    tail -n 1 "$1"                # append last tree in checkpoint file

If this shell script is in the file called restart, then one might use the
command:

    restart  checkpoint.21312  <  infile  |  fastDNAml       >  new_outfile
    ^script  ^checkpoint tree     ^data      ^dnaml program     ^output_file

If this is too opaque, don't worry about it, or talk with your local UNIX
wizard. In the meantime, this and other useful shell scripts are provided
with the program.

PHYLIP DNAML does not write checkpoint trees and does not have a restart
option.


T -- Transition/transversion ratio

Use a user-specified ratio of transition to transversion type substitutions.
Without the T option, a value of 2.0 is used. Requires an auxiliary data line
of the form:

T ratio

Example:

5 114 T
T 1.0
Sequence1 ACACGGTGTCGTATCATGCTGCAGGATGCTAGACTGCGTCANATGTTCGTACTAACTGTG
...

(Note that a T option with a value of 2.0 does nothing, but it can provide
a last auxiliary data line following optional auxiliary data. See the
examples for G and Y.)


U -- User Tree(s)

Read an input line with the number of user-specified trees, followed by the
specified number of trees. These data immediately follow the sequence data.

The trees must be in Newick format, and terminated with a semicolon. (The
program also accepts a pseudo_newick format, which is a valid Prolog fact.)

The tree reader in this program is more powerful than that in PHYLIP 3.3. In
particular, material enclosed in square brackets, [ like this ], is ignored as
comments; taxa names can be wrapped in single quotation marks to support the
inclusion of characters that would otherwise end the name (i.e., '(', ')',
':', ';', '[', ']', ',' and ' '); names of internal nodes are properly
ignored; and exponential notation (such as 1.0E-6) for branch lengths is
supported.


W -- Weights