Help: The Comparative Search page allows you to perform several statistical analyses on the microarray expression data across three subgroups: Never Smokers, Former Smokers and Current Smokers.
The statistical tests currently supported are the T-Test, R-Test (Pearson Correlation), Wilcoxon Rank Test and Spearman Correlation. In Sections 2 and 3, significance thresholds can be set for the Correlation tests (R-Test and Spearman) and the Mean Value Comparison tests (T-Test and Wilcoxon) respectively.
For the Mean Value Comparison analyses, a P-Value threshold can be set so that only genes whose subgroup comparison results are more significant than the threshold are displayed. Section 3 also allows you to specify which column the threshold applies to: the raw P-Value or the adjusted Q-Value. Q-Values correct for the multiple comparison problem; they are the raw P-Values adjusted using the Q-value software by Storey et al. The default P-Value threshold for the T-Test and Wilcoxon Rank Test is 0.001.
For Correlation analyses, a Coefficient Value threshold can be set along with the P-Value threshold. The Coefficient value indicates the strength of the correlation between two variables: values closer to 1 or -1 indicate stronger positive or negative linear relationships respectively. The default P-Value threshold for the Correlation tests is 0.05 and the default Correlation Coefficient threshold is 0.4.
In Section 4, search results can be filtered to display only genes that pass the above statistical thresholds AND whose GO identifiers (set by the Gene Ontology Consortium) include the specified keywords, e.g. DNA repair, Cell Cycle or Apoptosis. Finally, in Section 5, search results can be sorted by various parameters (for example Minimum P-Value, Maximum Correlation Coefficient, Fold Change or Gene Chromosomal Location) to display them in a meaningful order.
The Advanced Search page allows you to use the schema provided to formulate specific, complex SQL queries on the database. For online documentation on how to construct SQL queries, see the MySQL site.
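As a rough illustration of the kind of query the Advanced Search page is meant for, the sketch below runs one SELECT statement that applies the default correlation thresholds quoted above (P-Value 0.05, Correlation Coefficient 0.4). The table name `correlation_results`, its columns and the probe names are hypothetical, and an in-memory SQLite database stands in for the real MySQL database; consult the schema shown on the page for the actual names. Treating the coefficient threshold as a bound on the absolute value is also an assumption.

```python
import sqlite3

# Hypothetical schema for illustration only; the real table and column
# names are the ones in the schema shown on the Advanced Search page.
conn = sqlite3.connect(":memory:")  # SQLite stands in for the MySQL database
conn.execute(
    "CREATE TABLE correlation_results ("
    "affy_id TEXT, coefficient REAL, p_value REAL)"
)
conn.executemany(
    "INSERT INTO correlation_results VALUES (?, ?, ?)",
    [
        ("probe_A", 0.62, 0.004),   # strong positive correlation, significant
        ("probe_B", -0.55, 0.010),  # strong negative correlation, significant
        ("probe_C", 0.10, 0.030),   # significant, but the correlation is weak
        ("probe_D", 0.70, 0.200),   # strong correlation, but not significant
    ],
)

# One SELECT statement applying both default thresholds.
# Assumption: the coefficient threshold is applied to the absolute value.
query = (
    "SELECT affy_id, coefficient, p_value "
    "FROM correlation_results "
    "WHERE ABS(coefficient) >= 0.4 AND p_value < 0.05 "
    "ORDER BY p_value"
)
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Only `probe_A` and `probe_B` pass both thresholds here; `probe_C` fails the coefficient threshold and `probe_D` fails the P-Value threshold.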
Please note that only "Select" queries are allowed and that the ";" character is not necessary in the SQL query statement.
The Quick Info Search page allows you to quickly obtain a complete data readout of any of the Patient Information or Sample tables. These include the Demographic, Smoking History, Lung Function, Diagnosis and Sample Information tables. Alternatively, you can use the second option on this page to submit a list of Patient IDs and obtain the specified information for the selected patients only. In addition, the Quick Info page allows you to quickly determine which patient samples fall under which patient subgroup categories by selecting the "Patient/Sample Class Info" option from the first drop-down menu.
The Gene Reference Search page allows you to search the statistical test results (from the T-Test, Wilcoxon Rank Test, Pearson or Spearman Correlation Tests) for genes you specify as being of interest. The second option on this page acts as a search engine that returns the Affymetrix ID for any gene based on a short keyword description of the gene; it can be run independently of the first Statistical Result search option. Finally, the third option retrieves the expression levels across all samples for any given gene.
The Filtered Data Download page allows you to download a file containing the complete statistical results for each gene in all the statistical tests we have performed.
The Transcriptome Search page allows you to search the genes that make up the putative transcriptomes for the various subgroups.
In Section 1 of the Transcriptome Search page, you can specify which subgroup transcriptome (Never Smokers, Former Smokers or Current Smokers) you wish to search. The "CORE" transcriptome includes all the genes that are present in 100% of the samples from the chosen group. The 50% transcriptome includes all the genes that are present in at least 50% of the samples from that group. Clicking the appropriate link will bring up a Venn diagram showing the number of genes in each subgroup and the genes that intersect between them.
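The two transcriptome definitions above can be sketched as a simple presence filter. This is a minimal illustration, not the site's actual implementation: the function name `transcriptome`, the gene names and the presence calls (True meaning the gene was detected in that sample) are all made up for the example.

```python
def transcriptome(present_calls, min_fraction):
    """Return the genes present in at least min_fraction of the samples."""
    genes = set()
    for gene, calls in present_calls.items():
        # sum() counts the True values, i.e. the samples where the gene is present
        if sum(calls) >= min_fraction * len(calls):
            genes.add(gene)
    return genes

# Hypothetical presence calls for four samples per subgroup
never_smokers = {
    "gene_1": [True, True, True, True],
    "gene_2": [True, True, False, False],
    "gene_3": [True, False, False, False],
}
current_smokers = {
    "gene_1": [True, True, True, True],
    "gene_2": [True, True, True, True],
    "gene_3": [True, True, False, False],
}

core_never = transcriptome(never_smokers, 1.0)      # present in 100% of samples
half_never = transcriptome(never_smokers, 0.5)      # present in at least 50%
core_current = transcriptome(current_smokers, 1.0)

# The counts a Venn diagram of the two CORE transcriptomes would show
shared = core_never & core_current
only_current = core_current - core_never

print(sorted(core_never), sorted(half_never))
print(sorted(shared), sorted(only_current))
```

In this toy data the Never Smokers CORE transcriptome contains only `gene_1`, the 50% transcriptome adds `gene_2`, and `gene_2` appears in the Current Smokers CORE transcriptome only.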
In Section 3, we have a GO annotation tool. By specifying a GO category, you can determine whether that category is over- or under-represented in any of the putative transcriptomes.
The Graphing page allows you to generate various graphics depicting different microarray expression metrics across the entire dataset. These include:
1) An Expression Value Histogram for all samples. Given one or several Affymetrix IDs, the page will return a series of JPEG images showing the histogram of expression values across all samples for each Affymetrix ID specified.
2) A Sample vs Sample Scatterplot, a JPEG image plotting the expression values of two different samples against each other.
In the Clustering Section of the Graphing page, you can generate clustering dendrograms of specific sample subgroups (Never Smokers or Current Smokers) by selecting from a choice of clustering methods and distance metrics. The clustering algorithm can use ALL the filtered gene expression values or a user-defined subset. The subset of genes can be selected using different variability parameters, which include Max/Min Expression Ratio and Max-Min Expression Difference. You can either specify the number of top variable genes to be used in the clustering or set a threshold so that only genes that pass it are included.
This page is best viewed at a resolution of 1024x768.
Copyright 2004 All Rights Reserved. Trustees of Boston University.