TY - JOUR
T1 - A case study on choosing normalization methods and test statistics for two-channel microarray data
AU - Xie, Yang
AU - Jeong, Kyeong S.
AU - Pan, Wei
AU - Khodursky, Arkady B.
AU - Carlin, Brad
PY - 2004/7
Y1 - 2004/7
AB - DNA microarray analysis is a biological technology that permits the whole genome to be monitored simultaneously on a single slide. Microarray technology not only opens an exciting research area for biologists, but also poses significant new challenges for statisticians. Two common questions in the analysis of microarray data are: first, should we normalize arrays to remove potential systematic biases and, if so, which normalization method should we use? Second, how should we then implement tests of statistical significance? Straightforward and uniform answers to these questions remain elusive. In this paper, we use a real data example to illustrate a practical approach to addressing these questions. Our data are taken from a DNA-protein binding microarray experiment aimed at furthering our understanding of transcription regulation mechanisms, one of the most important issues in biology. For preprocessing, we suggest first examining descriptive plots to decide whether preliminary normalization is needed and, if so, how it should be accomplished. For subsequent comparative inference, we recommend an empirical Bayes method (the B statistic), since it performs much better than traditional methods, such as the sample mean (M statistic) and Student's t statistic, and is relatively easy to compute and explain compared with other methods. The false discovery rate (FDR) is used to evaluate the different methods, and our comparative results support these recommendations.
KW - Background correction
KW - Empirical Bayes methods
KW - False discovery rate
KW - Normalization
KW - Significance testing
KW - Spatial effects
UR - http://www.scopus.com/inward/record.url?scp=4444350754&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=4444350754&partnerID=8YFLogxK
U2 - 10.1002/cfg.416
DO - 10.1002/cfg.416
M3 - Article
C2 - 18629172
AN - SCOPUS:4444350754
SN - 1531-6912
VL - 5
SP - 432
EP - 444
JO - Comparative and Functional Genomics
JF - Comparative and Functional Genomics
IS - 5
ER -