1. Introduction 2. Installation 3. Use XAT 4. Output Format
XAT (cross-species alignment tool) is a cross-species cDNA-to-genome alignment tool that works at the nucleotide level. It is designed for three uses: accurate intra-species cDNA-to-genome alignment, fast positioning for cross-species mapping, and gene structure annotation for well-aligned regions that contain no frame-shifting indels.
From a technical standpoint, XAT incorporates several heuristic techniques used in Blastz and SIM4, and adds some ideas of its own for performance enhancement and statistical testing. It is fast, sensitive and fairly accurate. It is capable of genome-wide alignment at considerable speed, and can find less conserved regions with statistical reliability. XAT shows that a heuristic algorithm can still achieve high speed without losing sensitivity.
2.1 System Requirement 2.2 Compilation
XAT is written in C. It is known to work on i386-Linux, powerpc-AIX and MIPS-IRIX, and should be portable to any POSIX-compatible system. XAT runs in 32-bit environments, but it is recommended to compile XAT on 64-bit systems, where it will be faster and more sensitive. A large amount of memory also helps performance.
In 32-bit systems, you should type
|
|
`xat' will be generated in the directory `bin'. You need to set the environment variable `XAT_CONFIG_FILE' to tell XAT the location of the configuration file. In an sh-compatible shell, you can do this with
cd config; export XAT_CONFIG_FILE=`pwd`/xat_config
3.1 Invoking 3.2 Command-Line Options
3.2.1 Basic Options 3.2.2 Advanced Options 3.2.3 Debug Options
The full XAT command line is
|
These options are mainly used for debugging the program. Do not change them, even if you are familiar with the XAT algorithm.
4.1 Standard Format 4.2 Cigar Format
Here is a man-made example:
>mRNA 2910 851 974 + GENOME 14713 6857 10492 1000
57 851 914 6857 6920 2.33e-28 -> 1 64, S,
gaggaaacagcagactttagaagcggaagaggccaagaggcggttgaaggagcagtctatcttt
|||||||||||||| |.|.||||| |||||.||||||||| ||||.||||||||||||||||||
GAGGAAACAGCAGAGTCTGGAAGCTGAAGAAGCCAAGAGGAGGTTAAAGGAGCAGTCTATCTTT
54 915 974 10433 10492 4.89e-27 -> 5 17,1,9,1,32, S,D,S,I,S,
ggtgaccatcgggatga-gaggaagagacccacatgaagaagtcagagtcggaggtggag
|||||||| |||||||| ||||||||| |||| |||||||||||.|||||.|||||||||
GGTGACCAGCGGGATGAAGAGGAAGAG-CCCAGATGAAGAAGTCGGAGTCAGAGGTGGAG
//
Each alignment begins with a `>' and ends with `//'. The first `>' line contains the following fields: mRNA sequence name, mRNA length, mRNA start position, mRNA stop position, strand, genome sequence name, gen_seq length, gen_seq start position, gen_seq stop position and score.
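The field order of the `>' line can be illustrated with a small parser. This is only a sketch: the class name `AlignmentHeader', the function name `parse_header' and the attribute names are hypothetical, and it assumes that fields are separated by whitespace, that sequence names contain no spaces, and that the score is numeric.

```python
from dataclasses import dataclass


@dataclass
class AlignmentHeader:
    """Fields of the '>' line, in the order listed above (names are hypothetical)."""
    mrna_name: str
    mrna_length: int
    mrna_start: int
    mrna_stop: int
    strand: str
    genome_name: str
    genome_length: int
    genome_start: int
    genome_stop: int
    score: float


def parse_header(line: str) -> AlignmentHeader:
    """Parse one '>' line; assumes whitespace-separated fields and names without spaces."""
    if not line.startswith(">"):
        raise ValueError("header line must start with '>'")
    fields = line[1:].split()
    if len(fields) != 10:
        raise ValueError(f"expected 10 fields, got {len(fields)}")
    return AlignmentHeader(
        mrna_name=fields[0],
        mrna_length=int(fields[1]),
        mrna_start=int(fields[2]),
        mrna_stop=int(fields[3]),
        strand=fields[4],
        genome_name=fields[5],
        genome_length=int(fields[6]),
        genome_start=int(fields[7]),
        genome_stop=int(fields[8]),
        score=float(fields[9]),  # assumption: the score is numeric
    )


# Header line taken from the example above.
header = parse_header(">mRNA 2910 851 974 + GENOME 14713 6857 10492 1000")
print(header.mrna_start, header.mrna_stop, header.genome_start, header.genome_stop)
```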
The following lines report the detailed alignment of each exon. Each line that starts with a number consists of the number of matched bases, mRNA start position, mRNA stop position, gen_seq start position, gen_seq stop position, the first kind of P-value, direction, the number of fragments, the length of each fragment and the type of each fragment. In theory, one can reconstruct the alignment from this line alone. The printed alignments are only there to facilitate inspection.
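As a rough illustration of how the alignment could be rebuilt from an exon line, the sketch below parses the line and re-inserts the gaps into ungapped sequences. The function names are hypothetical. The meaning of the fragment types is inferred from the example only and is an assumption: `S' is taken as an aligned stretch, `D' as a gap in the mRNA row and `I' as a gap in the genome row.

```python
def parse_exon_line(line):
    """Split an exon line into named fields (key names are hypothetical)."""
    tokens = line.split()
    if len(tokens) != 10:
        raise ValueError(f"expected 10 whitespace-separated tokens, got {len(tokens)}")
    count = int(tokens[7])
    # The length and type lists end with a trailing comma, so drop empty items.
    lengths = [int(x) for x in tokens[8].split(",") if x]
    types = [x for x in tokens[9].split(",") if x]
    if len(lengths) != count or len(types) != count:
        raise ValueError("fragment count does not match the length/type lists")
    return {
        "matched": int(tokens[0]),
        "mrna_start": int(tokens[1]),
        "mrna_stop": int(tokens[2]),
        "gen_start": int(tokens[3]),
        "gen_stop": int(tokens[4]),
        "p_value": float(tokens[5]),
        "direction": tokens[6],
        "lengths": lengths,
        "types": types,
    }


def reconstruct(mrna, genome, lengths, types):
    """Rebuild the two gapped rows from ungapped sequences.

    Assumed semantics (inferred from the example): S consumes both
    sequences, D puts a gap in the mRNA row, I puts a gap in the genome row.
    """
    top, bottom = [], []
    i = j = 0
    for n, kind in zip(lengths, types):
        if kind == "S":
            top.append(mrna[i:i + n])
            bottom.append(genome[j:j + n])
            i += n
            j += n
        elif kind == "D":
            top.append("-" * n)
            bottom.append(genome[j:j + n])
            j += n
        elif kind == "I":
            top.append(mrna[i:i + n])
            bottom.append("-" * n)
            i += n
        else:
            raise ValueError(f"unknown fragment type: {kind}")
    return "".join(top), "".join(bottom)


# Second exon of the example above; the ungapped inputs are obtained
# here by stripping the gap characters from the printed rows.
exon = parse_exon_line("54 915 974 10433 10492 4.89e-27 -> 5 17,1,9,1,32, S,D,S,I,S,")
gapped_mrna = "ggtgaccatcgggatga-gaggaagagacccacatgaagaagtcagagtcggaggtggag"
gapped_genome = "GGTGACCAGCGGGATGAAGAGGAAGAG-CCCAGATGAAGAAGTCGGAGTCAGAGGTGGAG"
top, bottom = reconstruct(gapped_mrna.replace("-", ""),
                          gapped_genome.replace("-", ""),
                          exon["lengths"], exon["types"])
print(top)
print(bottom)
```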
Cigar stands for Concise Idiosyncratic Gapped Alignment Report. Cigar format is one of the output formats that Exonerate can generate. It is also used, in an altered form, in the feature tables of the Ensembl database. It is designed to contain the minimal information necessary to reconstruct an alignment. One alignment is described per line, to allow easy manipulation with UNIX tools.
The example above can be translated into Cigar format:
mRNA 851 974 + GENOME 6857 10492 + 1000 M 64 N 3512 M 17 D 1 M 9 I 1 M 32
Note that the standard Cigar format does not contain `B', as Exonerate permits no breaking point in the cDNA sequence. If `linkgap=1' is specified, XAT will permit no breaking point either; if not, `B' will appear, showing that some cDNA fragments are not aligned.
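The operation part of the Cigar line in the example can be derived from the exon lines of the standard format, as the sketch below tries to show. It is not XAT's own conversion code. The function name, the dictionary keys and the mapping of fragment types to operations (`S' to `M', `D' to `D', `I' to `I') are assumptions inferred from the example. Exons are assumed to be on the plus strand and sorted by genome position, and `B' is not handled.

```python
# Mapping from fragment types to Cigar operations; inferred from the example only.
FRAGMENT_TO_CIGAR = {"S": "M", "D": "D", "I": "I"}


def exons_to_cigar_ops(exons):
    """Build the operation part of a Cigar line from a list of exon records.

    Each record is a dict with the hypothetical keys mrna_start, mrna_stop,
    gen_start, gen_stop, lengths and types.
    """
    ops = []
    prev = None
    for exon in exons:
        if prev is not None:
            # A break in the cDNA would need 'B', which this sketch does not cover.
            if exon["mrna_start"] != prev["mrna_stop"] + 1:
                raise ValueError("cDNA break between exons; 'B' is not handled here")
            # Genome bases skipped between two exons are reported as 'N'.
            intron = exon["gen_start"] - prev["gen_stop"] - 1
            if intron > 0:
                ops.append(f"N {intron}")
        for length, kind in zip(exon["lengths"], exon["types"]):
            ops.append(f"{FRAGMENT_TO_CIGAR[kind]} {length}")
        prev = exon
    return " ".join(ops)


# The two exons of the standard-format example.
example_exons = [
    {"mrna_start": 851, "mrna_stop": 914, "gen_start": 6857, "gen_stop": 6920,
     "lengths": [64], "types": ["S"]},
    {"mrna_start": 915, "mrna_stop": 974, "gen_start": 10433, "gen_stop": 10492,
     "lengths": [17, 1, 9, 1, 32], "types": ["S", "D", "S", "I", "S"]},
]
print(exons_to_cigar_ops(example_exons))
```

The `N' length follows from the genome coordinates of the two exons: 10433 - 6920 - 1 = 3512.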