PUMAdb : Arraylists Help

Description

Arraylists are text files that specify a group of arrays or hybridizations.  They can be used in data retrieval, experimental annotation, and other tasks. Result set lists are arraylists that contain additional information, and should be used whenever the database may store more than one version of the data for the arrays in question. For example, a result set list can distinguish between two different sets of normalized data for an Agilent array, or between the cell and probe-set data for an Affymetrix array. (GenePix and ScanAlyze arrays have only a single version of data in the database.)

Only registered users of the database may use arraylists and result set lists, since the files are stored locally in your loader account, in the "arraylists" directory.  They are accessible via sFTP, and may be downloaded, created or edited on your desktop machine, and uploaded as desired.

Result set lists subsume all of the functions of arraylists, and are the preferred format (see below).  Arraylists may fail when it is ambiguous which result set (version of data for a single array) is meant.

Result set lists may be used for data retrieval.  Frequently used groups of arrays or result sets may be combined into a result set list, and selected as a group in Advanced Results Search for easy access.  The result set list may specify filters for each array individually, allowing customization of parameters for data retrieval on a per-array basis (see below).

Result set lists may also be used to retrieve experiment annotations, normalize data, assign access, or create an experiment set for a group of arrays simultaneously.

Creating a result set list with the web interface

To create a result set list via the database's web interface, begin with the Advanced Results Search.  After making initial selections, click the "Data Retrieval and Analysis" button.  On the second page, refine the selection of arrays or result sets, and click the "Create Result Set List" button.  This will take you to the Result Set List interface, in a new browser window.

Figure: Result set list creation interface - first page

The first page of the interface (above) allows you to organize your result sets and select a set of filter templates.  Make a final selection of result sets to include in the list, and sort them as needed.  You may then select default filters.  These are the same filter options that are available in data retrieval and various other tools.  They differ for data from different software (GenePix/ScanAlyze vs. Agilent Feature Extraction Software vs. Affymetrix GCOS/dChip) - see the list of options for each software package.  Turn on and configure as many filters as you need.  Then give your list a name, and click on the "Continue" button.

Figure: Result set list creation interface - second page

The second page of the interface (above) allows you to further customize the filters for each result set.  If you don't want to refine your filter selections, or don't want to select any at all, just click on the "Create ResultSetList" button.  Otherwise, customize any or all filters for each result set.  Result sets are not required to use the same filters.  When finished, click on "Create ResultSetList" to put the list in your "arraylists" directory for future use.  You will see a summary page (below), and a link to download your new result set list if you want to edit it in a spreadsheet.

Figure: Result set list creation interface - summary page

Creating a result set list on your desktop computer

Result set lists are tab-delimited text files, which may be easily created in a spreadsheet program like Excel.  See below for the format.  It may also be convenient to download an already-created result set list, and edit it to add or remove result sets (arrays), or customize the per-array filters.  Note that filter fields must be given exactly as specified in the list (these are exactly the filters available during data retrieval), and will produce an error if misspelled.  Creating a template using the online interface and then editing it as necessary may be easier than creating a list de novo on your desktop computer. You may download an example result set list here.

When finished, save the list as a tab-delimited text file, and upload it to your loader "arraylists" directory using sFTP.
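
As an illustration of the desktop route, the following is a minimal Python sketch that produces a small result set list as tab-delimited text. The column headers are taken from the format description in this document; the slide names, result set numbers, weights, and the function names are hypothetical placeholders. No filter columns are included, because filter field names must be copied from the database's own list.

```python
import csv
import io

# Column headers come from the result set list format.
# The row values are made-up placeholders, not real arrays in the database.
HEADERS = ["SLIDENAME", "PACKAGE", "RESULT_SET_NO", "EWEIGHT"]
ROWS = [
    {"SLIDENAME": "example_slide_01", "PACKAGE": "AGILENT",
     "RESULT_SET_NO": "1", "EWEIGHT": "1"},
    {"SLIDENAME": "example_slide_02", "PACKAGE": "AGILENT",
     "RESULT_SET_NO": "2", "EWEIGHT": "0.5"},
]


def render_result_set_list(headers, rows):
    """Return the list as tab-delimited text: one header line, one line per row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=headers, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def write_result_set_list(path, headers, rows):
    """Save the rendered list to a plain text file at the given path."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        handle.write(render_result_set_list(headers, rows))
```

Calling `write_result_set_list("my_list.txt", HEADERS, ROWS)` (the filename is arbitrary) would save a file that could then be uploaded to the "arraylists" directory with sFTP.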

Result set list format

Arraylists and result set lists are tab-delimited text files.  The following column headers are understood:

| Name | Requirements | Meaning |
| --- | --- | --- |
| EXPTID or SLIDENAME or EXPTNAME | One and only one must be included. | Identifies the array or hybridization. |
| PACKAGE | One of GENEPIX (includes ScanAlyze), AGILENT, AFFYCELL (Affymetrix cell data), or AFFYPS (Affymetrix probe set data). | Part of identifying the specific result set.  Not required for GenePix or ScanAlyze data.  If omitted, the result set will be inferred if possible.  If multiple possibilities exist (multiple result sets for the specified array), an error will be reported when the list is used. |
| RESULT_SET_NO | Required if and only if the PACKAGE column is present. | See PACKAGE. |
| EWEIGHT | A value between 0 and 1 is required. | Establishes the weight for the result set in clustering. |
| FILTER_i_FIELD | Optional.  i must be a non-negative integer.  Value must be drawn from list.  The corresponding operator and value must be included (see below). | The data field for use in the ith filter.  Filter fields must be drawn from the list for the specified software package, and must be spelled exactly as given in the list. |
| FILTER_i_OPERATOR | Optional - see requirements for "FILTER_i_FIELD." | The comparison operator for the ith filter. |
| FILTER_i_VALUE | Optional - see requirements for "FILTER_i_FIELD." | The value for comparison in the ith filter. |
| LOGIC | Optional | A logical combination of filters, e.g. "1 AND (2 or 3)".  Defaults to a logical AND if not specified.  If given, must contain all, and only, the numbers of the filters specified for the same result set. |
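
The column requirements above can be checked before a list is uploaded. The following is a minimal Python sketch of such a check, not the database's own validation: the function names are hypothetical, the EWEIGHT bounds are assumed to be inclusive, and the LOGIC check only compares filter numbers without parsing the expression itself.

```python
import re

ID_COLUMNS = {"EXPTID", "SLIDENAME", "EXPTNAME"}
FILTER_HEADER = re.compile(r"^FILTER_(\d+)_(FIELD|OPERATOR|VALUE)$")


def check_headers(headers):
    """Return a list of problems found in the header row (empty if none)."""
    problems = []
    present = set(headers)
    if len(present & ID_COLUMNS) != 1:
        problems.append("exactly one of EXPTID, SLIDENAME, EXPTNAME is required")
    if ("PACKAGE" in present) != ("RESULT_SET_NO" in present):
        problems.append("PACKAGE and RESULT_SET_NO must appear together")
    # Collect which parts (FIELD/OPERATOR/VALUE) exist for each filter number.
    parts = {}
    for name in headers:
        match = FILTER_HEADER.match(name)
        if match:
            parts.setdefault(int(match.group(1)), set()).add(match.group(2))
    for number, found in sorted(parts.items()):
        if found != {"FIELD", "OPERATOR", "VALUE"}:
            problems.append(f"filter {number} needs FIELD, OPERATOR and VALUE columns")
    return problems


def check_eweight(value):
    """True if the text parses as a number in [0, 1] (inclusive bounds assumed)."""
    try:
        number = float(value)
    except ValueError:
        return False
    return 0.0 <= number <= 1.0


def check_logic(logic, filter_numbers):
    """True if the LOGIC text mentions all, and only, the given filter numbers."""
    mentioned = {int(token) for token in re.findall(r"\d+", logic)}
    return mentioned == set(filter_numbers)
```

For example, `check_logic("1 AND (2 or 3)", [1, 2, 3])` is true, while a LOGIC value that omits filter 3 or mentions an undefined filter 4 is rejected. Here `filter_numbers` stands for the filters actually filled in for that row, since the rule applies per result set.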