PUMAdb : Data selection for Analysis Help

Description

The Data Selection for Analysis tool is available only after you have selected a set of hybridized arrays using either the Basic Search or the Advanced Search programs. Once a set has been selected, Data Selection for Analysis allows you to select genes or spots to cluster, and to filter data based on a variety of parameters. This tool can be used to generate a preclustering (.pcl) file, or the files needed for viewing a cluster with TreeView. In addition, Data Selection for Analysis will lead you to tools that will let you view clustered data via the web.

Data Selection for Analysis is split into three large steps:

Gene Selection Options

Although we use the word 'gene,' it really refers to any DNA sample spotted on the microarrays. A 'gene' might be a PCR product representing an entire section of a gene, a portion of a gene, a clone associated with a gene, an intergenic region or anything at all.

This section allows you to specify which genes are of interest to you, and then to decide how to collapse your data, how to identify genes in your output file, which biological annotation to include, and how to label the arrays you're using.

Data Filtering Options

This section of the tool allows you to choose which data you consider reliable enough to include in your analysis. The steps are:

Gene Filtering Options

There are several steps to this part of the tool. Which options appear depends on what sort of data you have retrieved. Operations are carried out in the order in which they are presented on the page. The steps are:

Viewing Clustering Results

Once you've submitted a clustering query, you will see a page that displays progress messages as the query runs. When the preclustering file is complete, the last line will read, "...genes were selected."

Data Analysis

PUMAdb allows you to perform some data analysis on your preclustering file, using either of two methods:

Clustering Options

You have to define the following options when clustering hierarchically:

  • If you want to generate a file of sorted correlations, the default correlation is 0.8. Click 'Submit' when you have chosen the appropriate options.
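To make the idea of a sorted-correlations file concrete, here is a minimal Python sketch. It is not PUMAdb's implementation: it assumes the correlation is a Pearson correlation between gene expression profiles and that 0.8 acts as a minimum cutoff. The function names and the sample profiles are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # flat profile: correlation is undefined, treated as 0 here
    return cov / sqrt(var_x * var_y)


def sorted_correlations(profiles, cutoff=0.8):
    """Return (gene1, gene2, r) for every pair with r >= cutoff, highest r first.

    Assumption: the cutoff is a minimum correlation, as in this sketch only.
    """
    pairs = []
    for (g1, v1), (g2, v2) in combinations(profiles.items(), 2):
        r = pearson(v1, v2)
        if r >= cutoff:
            pairs.append((g1, g2, r))
    pairs.sort(key=lambda p: p[2], reverse=True)
    return pairs


# Hypothetical expression profiles: one list of values per gene, one value per array.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 2.0, 3.0, 5.0],
}

for gene1, gene2, r in sorted_correlations(profiles):
    print(f"{gene1}\t{gene2}\t{r:.3f}")
```

In this sketch, raising the cutoff shortens the list and lowering it lengthens the list. Anti-correlated pairs, such as geneA and geneC here, never appear.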

    Image Generation Options

    Here are a couple of tips that will help you reduce the time it takes to analyze the experiments you selected.
    • Selecting 'Show spot images' will slow down the analysis.
    • Broken-up images load faster and can be navigated more quickly than unbroken images.

    Browsing, Viewing, and Downloading Clustered Data

    To interactively browse the clustered data, click the red and green image in the lower left-hand corner of the window. This takes you to the 'Hierarchical Cluster View' where you can focus on specific gene sub-clusters.
    • The map on the left contains the entire cluster, and its size can be changed by entering new parameters in the upper left-hand corner.
    • Clicking on this map changes the view of the graph on the right, which contains the experiment names as columns and gene names as rows.
    You can view the clustered data in the following ways.
    • 'View broken images' displays a .gif of the clustered genes based on the average retrieved value.
    • 'View broken spot images' displays a .gif of the clustered genes. The spots of the experiment are displayed in a way that allows you to see the variation within each spot.
    • 'View joint broken images' places both the above .gifs in the same window. If you don't see the broken spot image, scroll left to bring it onto your screen.
    • Clicking on 'pcl' at the bottom of the screen allows you to view the preclustering file.
    The other links at the bottom of the screen download files to your machine.
    • 'cdt' downloads the complete tree view datafile.
    • 'gtr' downloads the genetree view datafile, which describes the tree of clustered genes.
    • If you chose an experiment clustering option on the previous page, you will also have the option to click on 'atr' to download the arraytree file.
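Once downloaded, the 'gtr' file can be inspected outside TreeView. The Python sketch below rests on an assumption about the file layout that this page does not describe: tab-delimited rows of the form node ID, left child, right child, similarity. Check it against your own file before relying on it. The function names and the sample rows are hypothetical.

```python
import csv
import io


def read_gtr(handle):
    """Return {node_id: (left_child, right_child, similarity)}.

    Assumed layout (not confirmed by this help page): four tab-separated
    columns per row: node ID, left child, right child, similarity score.
    """
    nodes = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 4:
            continue  # skip blank or malformed lines
        node_id, left, right, score = row[:4]
        try:
            similarity = float(score)
        except ValueError:
            continue  # skip a header row, if one is present
        nodes[node_id] = (left, right, similarity)
    return nodes


def leaves_under(nodes, node_id):
    """List the leaf (gene) IDs below a node, left to right."""
    if node_id not in nodes:
        return [node_id]  # not an internal node, so it is a leaf
    left, right, _ = nodes[node_id]
    return leaves_under(nodes, left) + leaves_under(nodes, right)


# Hypothetical two-node tree standing in for a downloaded file.
sample = "NODE1X\tGENE0X\tGENE2X\t0.95\nNODE2X\tNODE1X\tGENE1X\t0.60\n"
tree = read_gtr(io.StringIO(sample))
print(leaves_under(tree, "NODE2X"))
```

For a real download, pass an open file object instead of the in-memory sample. Very deep trees may need an iterative traversal rather than this recursive one.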