User Help
The Genome Context Viewer is a micro-synteny viewer that can load data from multiple sources simultaneously, enabling the comparative analysis of data curated by independent data providers.
Sources are data providers that allow the Genome Context Viewer to access their databases via web services. Because several sources can be loaded at once, data curated by independent providers can be compared directly in a single view.
A genome context is simply a region of a genome considered only with respect to the ordering and orientation of its annotated gene content. The primary visualization of a context in the Genome Context Viewer is the micro-synteny viewer, which depicts contexts as horizontal tracks in which triangular glyphs represent the genes in the order that they occur in the segment. Gene glyphs have a directionality that indicates the strand orientation of the gene with respect to the other genes in the segment. Horizontal lines connect the gene glyphs, and the thickness of each line reflects the physical intergenic distance.
Most significantly, the gene glyphs are assigned colors that reflect their functional annotation, providing a visual overview of which genes within and between tracks share significant similarity due to some homologous relationship between them. The association of each color with an annotation in the view is given in a Legend. Note that colors are only assigned to annotations represented by at least two different genes in the view; genes assigned to annotations that have no other representatives in the view are colored white with a solid outline, while genes without annotation assignments appear as white with dashed outlines. Such genes are depicted in the legend as singletons and orphans, respectively.
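The coloring rule above can be sketched in a few lines of Python. This is only an illustration: the function name `classify_genes`, the `(name, annotation)` tuple layout, and the style labels are assumptions made for the example, not part of the Genome Context Viewer.

```python
from collections import Counter


def classify_genes(genes):
    """Assign a glyph style to each gene in a view.

    `genes` is a list of (gene_name, annotation) pairs, where annotation
    is None for genes without an annotation assignment (hypothetical layout).
    """
    # Count how many genes in the view carry each annotation.
    counts = Counter(ann for _, ann in genes if ann is not None)
    styles = {}
    for name, ann in genes:
        if ann is None:
            styles[name] = "orphan"      # white glyph, dashed outline
        elif counts[ann] < 2:
            styles[name] = "singleton"   # white glyph, solid outline
        else:
            styles[name] = "colored"     # annotation gets a legend color
    return styles
```

For example, a view with two genes annotated `famA`, one gene annotated `famB`, and one unannotated gene yields two colored genes, one singleton, and one orphan.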
Which tracks are drawn in the micro-synteny viewer is determined by a set of query genes provided by the user. Given one or more query genes, a micro-synteny query track is constructed for each gene from the genes, or neighbors, that flank it on its chromosome. The Genome Context Viewer then searches one or more sources for tracks that have similar functional annotation content.
- Neighbors - The number of flanking genes used when constructing query tracks.
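As a rough sketch of how a query track could be assembled, the snippet below slices a chromosome around a query gene. It assumes that Neighbors counts flanking genes on each side of the query gene and that a chromosome is an ordered list of gene names; both are assumptions for illustration, and `build_query_track` is a hypothetical helper.

```python
def build_query_track(chromosome, query_gene, neighbors):
    """Return the query gene plus up to `neighbors` genes on each side.

    `chromosome` is an ordered list of gene names (hypothetical layout).
    The track is truncated at chromosome ends.
    """
    i = chromosome.index(query_gene)   # raises ValueError if gene is absent
    start = max(0, i - neighbors)      # clamp at the start of the chromosome
    return chromosome[start:i + neighbors + 1]
```

With `chromosome = ["a", "b", "c", "d", "e", "f"]`, a query gene `"c"`, and `neighbors=1`, the track is `["b", "c", "d"]`.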
When multiple query tracks are present, they are clustered using agglomerative hierarchical clustering with Levenshtein distance as the distance metric. This allows users to consider distinct contexts simultaneously and prevents dissimilar tracks from affecting downstream analysis. Each resulting cluster is displayed in its own micro-synteny viewer.
- Linkage - Determines the distance between clusters as a function of the pairwise distances between their genes.
- Average - The average distance from all the genes in one cluster to all the genes in the other cluster.
- Single - The shortest distance between any gene in one cluster and any gene in the other cluster.
- Complete - The longest distance between any gene in one cluster and any gene in the other cluster.
- Threshold - The maximum distance apart two clusters can be before they are considered distinct clusters.
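The clustering step and its parameters can be illustrated with a small, unoptimized sketch. It assumes each query track is represented as a sequence of annotation labels, that linkage is applied to the pairwise Levenshtein distances between tracks, and that clusters are merged while their distance does not exceed the threshold. The names `levenshtein`, `LINKAGES`, and `cluster_tracks` are hypothetical and do not come from the Genome Context Viewer code.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (strings or lists of labels)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


LINKAGES = {
    "average": lambda ds: sum(ds) / len(ds),
    "single": min,
    "complete": max,
}


def cluster_tracks(tracks, linkage="average", threshold=2):
    """Agglomerative clustering; returns clusters as lists of track indices."""
    link = LINKAGES[linkage]
    n = len(tracks)
    dist = {(i, j): levenshtein(tracks[i], tracks[j])
            for i in range(n) for j in range(i + 1, n)}

    def cluster_distance(c1, c2):
        return link([dist[min(i, j), max(i, j)] for i in c1 for j in c2])

    clusters = [[i] for i in range(n)]
    while len(clusters) > 1:
        # Find the closest pair of clusters under the chosen linkage.
        d, p, q = min((cluster_distance(clusters[p], clusters[q]), p, q)
                      for p in range(len(clusters))
                      for q in range(p + 1, len(clusters)))
        if d > threshold:
            break  # remaining clusters are too far apart to merge
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return clusters
```

In this sketch, two tracks that differ by a single annotation are merged at `threshold=1`, while a track with entirely different annotations stays in its own cluster.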
Micro-synteny search is the primary function of the Genome Context Viewer. For each cluster of query tracks, the Genome Context Viewer searches one or more databases for tracks that have similar functional annotation content. Note that the ordering of the genes is not taken into consideration during the search.
- Min matched annotations - The minimum number of genes a similar track must have that match an annotation from the query tracks.
- Max insertion size - The maximum number of genes that can be in-between matched genes in a similar track.
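One way to picture how the two search parameters interact is the sketch below. It scans a chromosome, represented as an ordered list of annotation labels, for runs of genes whose annotations occur in the query, ignoring gene order within the query. The representation and the function `find_similar_tracks` are assumptions for illustration, not the search implemented by any source.

```python
def find_similar_tracks(chromosome, query_annotations, min_matched, max_insertion):
    """Return (start, end) index pairs of candidate similar tracks.

    A candidate is a run of matched genes in which consecutive matches
    are separated by at most `max_insertion` unmatched genes, and which
    contains at least `min_matched` matched genes.
    """
    query = set(query_annotations)  # a set: ordering is deliberately ignored
    matched = [i for i, ann in enumerate(chromosome) if ann in query]
    tracks, run = [], []
    for i in matched:
        # Close the current run if too many genes lie between matches.
        if run and i - run[-1] - 1 > max_insertion:
            if len(run) >= min_matched:
                tracks.append((run[0], run[-1]))
            run = []
        run.append(i)
    if len(run) >= min_matched:
        tracks.append((run[0], run[-1]))
    return tracks
```

Raising `min_matched` makes the search stricter, while raising `max_insertion` lets matched genes lie farther apart within a single result track.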
The tracks within each micro-synteny viewer are aligned based on the functional annotations of their genes. This is done to emphasize the preservation and variation of structure among the tracks. A multiple alignment of the query tracks is computed using a profile hidden Markov model, and each search result track is pairwise aligned to the consensus of that multiple alignment using a local alignment algorithm. Inversions and tandem duplications are drawn on separate lines to emphasize the variation of structure.
- Algorithm - The local alignment algorithm used to align search result tracks.
- Smith-Waterman - Finds the highest scoring local alignment.
- Repeat - Finds one or more local alignments, i.e. tandem duplications and/or multiple sub-alignments within a single track.
- Match - The score used when two gene annotations in the alignment match.
- Mismatch - The score penalty used when two gene annotations in the alignment mismatch.
- Gap - The score penalty used when a gene annotation is inserted or deleted in the alignment.
- Score - The minimum score an alignment must have to be drawn in the viewer.
- Threshold - The minimum score a segment (e.g. an inversion) must have to be included in an alignment.
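To make the Match, Mismatch, Gap, and Score parameters concrete, here is a minimal Smith-Waterman scoring sketch over annotation labels. It assumes a linear gap penalty, and it expresses mismatch and gap as negative scores. The default values and the function name are illustrative assumptions, and the sketch does not handle inversions or the Repeat algorithm.

```python
def smith_waterman_score(query, track, match=10, mismatch=-1, gap=-1):
    """Best local alignment score between two sequences of annotations."""
    prev = [0] * (len(track) + 1)
    best = 0
    for q in query:
        curr = [0]
        for j, t in enumerate(track, start=1):
            diag = prev[j - 1] + (match if q == t else mismatch)
            # Local alignment: a cell never drops below zero.
            curr.append(max(0, diag, prev[j] + gap, curr[j - 1] + gap))
            best = max(best, curr[j])
        prev = curr
    return best


def is_drawn(query, track, min_score, **scoring):
    """An alignment is drawn only if it reaches the minimum score."""
    return smith_waterman_score(query, track, **scoring) >= min_score
```

Aligning the query `["A", "B", "C", "D"]` to the track `["x", "A", "B", "y", "C", "D", "z"]` with these values scores 39: four matches at 10 each, minus one gap for the inserted `y`.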
- Mouseover - Mousing over a gene will fade all other elements and show a popup with additional information about the gene.
- Click - Clicking a gene will open a new window in the view with additional information about the gene. This window can be configured by site administrators to include links to other relevant gene pages/sites.
- Mouseover - Mousing over a functional annotation in the micro-synteny legend will fade all other elements except the genes that have that annotation.
- Click - Clicking a functional annotation will open a new window in the view with additional information about the annotation. This window can be configured by site administrators to include links to other relevant annotation pages/sites.
- Mouseover - Mousing over the label on the left side of a micro-synteny track will fade everything else except the organism the track belongs to in the macro-synteny legend.
- Click - Clicking the label of a track will open a new window in the view with additional information about the track. This window can be configured by site administrators to include links to other relevant track pages/sites.
- Click - Clicking the plot button on the right side of a micro-synteny track will show a popup with "local" and "global" links. These will open pairwise local and global plots for the track, respectively.
Pairwise gene-loci dot plots between any track and the query tracks can be loaded on-demand to further elucidate structure.
Local plots depict only the genes from the micro-synteny tracks being compared.
Global plots depict the genes from the micro-synteny tracks being compared, in addition to all the genes from the selected track's chromosome that have a functional annotation present in the query track being plotted against.
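The difference between local and global plots can be sketched as the same point computation applied to different gene sets. The snippet assumes genes are `(position, annotation)` pairs and that each point pairs a query-track gene with a target gene sharing its annotation. The axis assignment and the function `dot_plot_points` are assumptions for illustration.

```python
def dot_plot_points(query_track, target_genes):
    """Return (query_position, target_position) points for a dot plot.

    For a local plot, pass only the genes of the selected track as
    `target_genes`; for a global plot, pass all genes of its chromosome.
    """
    by_annotation = {}
    for pos, ann in query_track:
        if ann is not None:  # unannotated genes cannot be paired
            by_annotation.setdefault(ann, []).append(pos)
    return [(qpos, tpos)
            for tpos, ann in target_genes
            for qpos in by_annotation.get(ann, [])]
```

Because the global gene set is a superset of the local one, a global plot contains every point of the local plot plus points for matching genes elsewhere on the chromosome.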
- Mouseover - Same as micro-synteny gene mouseover.
- Click - Same as micro-synteny gene click.
- Click-drag-release - Clicking and dragging the mouse on a dot plot will highlight a portion of the plot. When the click is released, the plot will zoom to contain only the highlighted region.
- Double-click - Double clicking on a plot will revert it to its original zoom level.
Macro-synteny visualizations are used to put micro-synteny tracks into context by displaying chromosome-scale pairwise synteny blocks for the chromosomes of the query tracks. These blocks are computed on demand by one or more databases using an MCScanX-style algorithm that is run on the genes of the chromosomes, with functional annotations defining homology among the genes. Chromosome-scale pairwise synteny blocks are colored by genus and species.
Reference blocks can be computed using the chromosome of any of the query tracks as a reference. The genomic interval corresponding to the chromosome's query track is highlighted, putting the micro-synteny structures into the context of the macro-synteny structures.
- Min matched annotations - The minimum number of genes a block must have that match an annotation from the reference chromosome.
- Max insertion size - The maximum number of genes that can be in-between matched genes in a block.
- Max annotation size - The maximum number of genes an annotation can have on either chromosome; annotations that exceed this number are excluded when computing blocks.
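The effect of Max annotation size can be shown with a small filtering sketch. It assumes chromosomes are ordered lists of annotation labels and that only annotations present on both chromosomes can contribute to a block. The helper `usable_annotations` is hypothetical, and the block-finding step itself is not reproduced here.

```python
from collections import Counter


def usable_annotations(chrom_a, chrom_b, max_annotation_size):
    """Annotations shared by both chromosomes that are not over-represented."""
    counts_a, counts_b = Counter(chrom_a), Counter(chrom_b)
    shared = counts_a.keys() & counts_b.keys()
    # Drop annotations with too many genes on either chromosome.
    return {ann for ann in shared
            if counts_a[ann] <= max_annotation_size
            and counts_b[ann] <= max_annotation_size}
```

Excluding very large annotations in this way keeps highly repeated gene families from producing spurious matches between unrelated regions.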
Circos blocks can be computed between the chromosomes of the query tracks within a cluster. As with the reference blocks, the genomic intervals corresponding to the query tracks are highlighted. Additionally, the all-pairs comparison can reveal structures that are preserved among all or a subset of the query track chromosomes.
Circos blocks share parameters with reference blocks.
- Mouseover - Mousing over a synteny block will fade all other elements except the block's organism in the macro-synteny legend and show a popup with additional information about the block.
- Mouseover - Mousing over a chromosome will fade all other elements except the chromosome's organism in the macro-synteny legend and micro-synteny tracks from that chromosome.
- Drag-and-drop - Dragging and dropping the bar that highlights the portion of the chromosome that corresponds to a micro-synteny view will load the micro-synteny view for the newly highlighted portion of the chromosome.
- Mouseover - Mousing over an organism in the macro-synteny legend will fade all other elements except the macro-synteny chromosomes and micro-synteny tracks that belong to the organism.
Pipelines are a component of the user interface used to describe a particular view and convey what stage it's at in its data loading process. For instance, there is a pipeline for the Genome Context Viewer itself that depicts the previously described flow of data from a set of query genes to a set of clustered and multiple aligned query tracks. Each stage in a pipeline is depicted as a process, and each process is composed of subprocesses. Each (sub)process has a status that is depicted by a color: green (success), blue (info), yellow (warning), or red (failure). The pipelines and (sub)process status are intended to add clarity to the mechanics of the view while conveying issues that may occur, e.g. a source has gone offline or the alignment score parameter is so high no alignments were drawn.
The Genome Context Viewer uses the docker layout paradigm to organize the various parts of the user interface. This paradigm is composed of windows, containers, and stacks. Each part of the user interface, such as individual visualizations, is contained within a window, and each window is contained within a container, though a container itself can contain multiple windows and containers. Containers are tiled to form a mosaic that fills the view. Users can resize containers by clicking and dragging the divider between them. Users can also move windows from one container to another, or form a new container by clicking a window's tab and dragging it to a new location in the view. Similarly, windows can be stacked within a container by dragging a window to the tabs of an existing container. When the windows in a container are stacked, only one window is visible at a time. Which window is visible is determined by which tab the user has selected, like tabs in a web browser.
When a view is first loaded there is a window for the micro-synteny legend and a window for the macro-synteny legend, each with its own container. Once the micro-synteny query tracks are loaded and clustered, a window is added for each cluster's micro-synteny viewer. Again, each such window has its own container. These are the primary windows of the view and cannot be closed. All other windows are added dynamically based on user interactions. Whenever a window is dynamically added, it is stacked in the container of the window that spawned it. If an interaction attempts to add a window that's already in the view, the window will be made visible, even if it has been moved from the stack it was originally added to.
The Genome Context Viewer can interact with other applications. If this feature has been enabled by the site administrator, then a broadcast icon will be in the upper-right corner of the viewer. Clicking this icon shows a widget that allows you to enter a communication channel and connect to it. In general, which interactions are supported between the Genome Context Viewer and other applications is determined by site administrators. At a minimum, multiple instances of the Genome Context Viewer running in the same web browser can interact via this mechanism.
The query genes and all parameter values are encoded in the URL. This allows contexts to be easily bookmarked and shared with others.
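The idea of a shareable URL can be sketched with Python's standard `urllib.parse` module. The URL layout used here (genes as a comma-separated path segment, parameters in the query string), the base URL, and the parameter names are all hypothetical; the actual URL scheme of a given Genome Context Viewer instance may differ.

```python
from urllib.parse import parse_qs, quote, unquote, urlencode, urlsplit


def encode_view(base_url, query_genes, params):
    """Build a bookmarkable URL from query genes and parameter values."""
    genes = ",".join(quote(g, safe="") for g in query_genes)
    return f"{base_url}/{genes}?{urlencode(params)}"


def decode_view(url):
    """Recover the query genes and parameters from a URL built above."""
    parts = urlsplit(url)
    last_segment = parts.path.rsplit("/", 1)[-1]
    genes = [unquote(g) for g in last_segment.split(",")]
    # parse_qs returns lists of strings; keep the first value of each.
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return genes, params
```

Note that parameter values come back as strings after decoding, so numeric parameters would need to be converted before use.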