Contents
- Description
- Combine or distinguish arrays with the same identifier
- Translate or match gene identifiers
- Compute averages
Related Help Documents
- Data Repository: Description of and instructions for your repository for data and analyses.
- File Formats: Information about preclustering (.pcl), clustered data table (.cdt), gene tree (.gtr) and array tree (.atr) files generated in the process of clustering data.
Description
The PCL merge tool allows you to combine PCL files. You may upload one or two files from your desktop computer, and/or select one or more PCL deposits from your repository. This is useful, for example, for co-clustering your data with data from another source or for adding rows of clinical values to gene expression data. When the merging process is complete, you may download the new file, enter it into your repository, or cluster it.
Combine or distinguish arrays with the same identifier
You must choose what to do with columns (experiments) that appear in different files but share the same identifier: merge them, averaging values for any rows (genes) they have in common; or make them distinct, by appending the name of the file from which they came. Merging columns is a convenient way to combine data if you have hybridized a single sample to two or more microarrays comprising a single "chip set," e.g., Affymetrix HG-U133A and HG-U133B arrays. Note that in this application you would likely have to edit at least one of the files by hand to make the column headers (array identifiers) the same for the two arrays in each chip set. Note also that two columns with the same identifier in a single file will always be merged, whether or not you select the merge option.
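The two choices can be illustrated with a minimal Python sketch. It is not the tool's actual code: the function name `combine_columns`, the nested-dictionary layout, the file names, and the `identifier (file name)` label format are all assumptions made for illustration, and the mean is used as the averaging method. Because a dictionary cannot hold two columns with the same key, the sketch does not model duplicate identifiers within a single file.

```python
from collections import defaultdict
from statistics import mean


def combine_columns(files, merge=True):
    """Combine columns from several files (illustrative sketch only).

    files maps a file name to {column_id: {row_id: value}}.
    With merge=True, columns sharing an identifier are pooled and the
    rows they have in common are averaged. With merge=False, a column
    whose identifier occurs in more than one file is kept distinct by
    appending the file name to its identifier.
    """
    # Count how many files each column identifier occurs in.
    seen = defaultdict(int)
    for columns in files.values():
        for column_id in columns:
            seen[column_id] += 1

    # Collect every value under its target column and row.
    pooled = defaultdict(lambda: defaultdict(list))
    for file_name, columns in files.items():
        for column_id, rows in columns.items():
            if not merge and seen[column_id] > 1:
                # Hypothetical label format; the real tool's format may differ.
                target = f"{column_id} ({file_name})"
            else:
                target = column_id
            for row_id, value in rows.items():
                pooled[target][row_id].append(value)

    # Average the collected values for each cell.
    return {
        col: {row: mean(vals) for row, vals in rows.items()}
        for col, rows in pooled.items()
    }


# Made-up values: one sample hybridized to both arrays of a chip set.
files = {
    "u133a.pcl": {"sample1": {"geneA": 1.0, "geneB": 2.0}},
    "u133b.pcl": {"sample1": {"geneB": 4.0, "geneC": 3.0}},
}
merged = combine_columns(files, merge=True)
distinct = combine_columns(files, merge=False)
```

In this sketch, `merged` holds a single `sample1` column in which `geneB` is the average of the two input values, while `distinct` holds two columns, one per source file.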
Translate or match gene identifiers
You may optionally provide a "translation file" to match up different row identifiers (UIDs or gene names) in the various files. For example, if you are combining data from spotted cDNA arrays and Affymetrix GeneChips™, you might want to translate the Affymetrix probe set names into the nearest equivalent clone ID, or translate both into their corresponding UniGene clusters. To do this, you may upload a tab-delimited text file in the following format:
Column | Contents |
---|---|
1 | Final desired identifier (Hs.408312) |
2 | Final desired annotation (TP53) |
3 and onward | Identifiers to be translated to the final desired identifier, one per column (IMAGE:1208978, IMAGE:1508462, ...) |
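As a rough illustration of how a file in this format could be read, the following Python sketch builds a lookup from each identifier in columns 3 and onward to its final identifier and annotation. The function name `load_translation` is hypothetical, and the sketch assumes the file has no header row; the sample line uses only the example values from the table above.

```python
import csv
import io


def load_translation(handle):
    """Map each source identifier to (final identifier, annotation)."""
    mapping = {}
    for fields in csv.reader(handle, delimiter="\t"):
        if len(fields) < 3:
            continue  # skip blank lines and rows with nothing to translate
        final_id, annotation, *aliases = fields
        for alias in aliases:
            if alias:  # ignore empty trailing cells
                mapping[alias] = (final_id, annotation)
    return mapping


# One line of a translation file, built from the example values above.
example = io.StringIO("Hs.408312\tTP53\tIMAGE:1208978\tIMAGE:1508462\n")
table = load_translation(example)
```

With this lookup, both `IMAGE:1208978` and `IMAGE:1508462` resolve to the final identifier `Hs.408312` with the annotation `TP53`, so rows carrying either clone ID would end up under the same identifier.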
Compute averages
Finally, you must select the method, mean or median, by which values will be averaged after merging and translation (if any). For example, if two columns are merged, and each contains three rows that are translated to the same final identifier, six values must be averaged to obtain the final value. Either the mean or median will be calculated, according to your selection. Note that each identifier will appear only once in the final file, with averaging as required to produce this result.
Gene and experiment weights will also be averaged by the method selected.
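The six-value case described above can be worked through in a short Python sketch. The helper name `average` and the numbers are made up for illustration; they show only how the choice of mean or median changes the final value, not how the tool handles missing data or weights internally.

```python
from statistics import mean, median


def average(values, method="mean"):
    """Average a list of values by the selected method."""
    if method == "mean":
        return mean(values)
    if method == "median":
        return median(values)
    raise ValueError(f"unknown method: {method!r}")


# Two merged columns, each with three rows that translate to the same
# final identifier (illustrative values only).
column_1 = [1.0, 2.0, 3.0]
column_2 = [4.0, 5.0, 15.0]
pooled = column_1 + column_2  # six values feed one cell of the final file

by_mean = average(pooled, "mean")      # 30.0 / 6 = 5.0
by_median = average(pooled, "median")  # midpoint of 3.0 and 4.0 = 3.5
```

The single high value pulls the mean up to 5.0 while the median stays at 3.5, which is the usual reason to prefer the median when outliers are a concern.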