A case study on choosing normalization methods and test statistics for two-channel microarray data

Yang Xie; Kyeong S. Jeong; Wei Pan; Arkady B Khodursky; Brad Carlin

doi:10.1002/cfg.416

Research output: Contribution to journal › Article › peer-review

15 Scopus citations

Abstract

DNA microarray analysis is a biological technology which permits the whole genome to be monitored simultaneously on a single slide. Microarray technology not only opens an exciting research area for biologists, but also provides significant new challenges to statisticians. Two very common questions in the analysis of microarray data are, first, should we normalize arrays to remove potential systematic biases, and if so, what normalization method should we use? Second, how should we then implement tests of statistical significance? Straightforward and uniform answers to these questions remain elusive. In this paper, we use a real data example to illustrate a practical approach to addressing these questions. Our data is taken from a DNA-protein binding microarray experiment aimed at furthering our understanding of transcription regulation mechanisms, one of the most important issues in biology. For the purpose of preprocessing data, we suggest looking at descriptive plots first to decide whether we need preliminary normalization and, if so, how this should be accomplished. For subsequent comparative inference, we recommend use of an empirical Bayes method (the B statistic), since it performs much better than traditional methods, such as the sample mean (M statistic) and Student's t statistic, and it is also relatively easy to compute and explain compared to the others. The false discovery rate (FDR) is used to evaluate the different methods, and our comparative results lend support to our above suggestions.
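To make the comparison of test statistics concrete, the following is a minimal sketch of the two traditional statistics named in the abstract: the M statistic (the sample mean of a gene's replicate log-ratios) and the one-sample Student's t statistic. The gene names, the log-ratio values, and the function names are hypothetical and are not taken from the paper's data or code. The B statistic is not implemented here.

```python
import math
from statistics import mean, stdev


def m_statistic(log_ratios):
    """Sample mean of one gene's replicate log-ratios."""
    return mean(log_ratios)


def t_statistic(log_ratios):
    """One-sample Student's t for the null hypothesis of zero mean log-ratio."""
    n = len(log_ratios)
    return mean(log_ratios) / (stdev(log_ratios) / math.sqrt(n))


# Hypothetical log2(R/G) values for three genes on four replicate slides.
genes = {
    "gene_a": [2.0, 1.0, 3.0, 2.0],      # large effect, moderate spread
    "gene_b": [0.26, 0.24, 0.25, 0.25],  # small effect, very small spread
    "gene_c": [-0.5, 0.5, 0.0, 0.0],     # no effect
}

# Rank genes by the absolute value of each statistic, largest first.
ranking_by_m = sorted(genes, key=lambda g: abs(m_statistic(genes[g])), reverse=True)
ranking_by_t = sorted(genes, key=lambda g: abs(t_statistic(genes[g])), reverse=True)

print("ranked by |M|:", ranking_by_m)
print("ranked by |t|:", ranking_by_t)
```

In this made-up example the two rankings disagree: M puts gene_a first because it ignores variability, while t puts gene_b first because its estimated standard deviation happens to be very small. With only a few replicates, such small variance estimates can arise by chance, which is one commonly cited motivation for empirical Bayes statistics that moderate the gene-wise variance.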

Original language	English (US)
Pages (from-to)	432-444
Number of pages	13
Journal	Comparative and Functional Genomics
Volume	5
Issue number	5
DOIs	https://doi.org/10.1002/cfg.416
State	Published - Jul 2004

Keywords

Background correction
Empirical Bayes methods
False discovery rate
Normalization
Significance testing
Spatial effects

Cite this

@article{c603d01059df4267b9b1a81e21476043,

title = "A case study on choosing normalization methods and test statistics for two-channel microarray data",

abstract = "DNA microarray analysis is a biological technology which permits the whole genome to be monitored simultaneously on a single slide. Microarray technology not only opens an exciting research area for biologists, but also provides significant new challenges to statisticians. Two very common questions in the analysis of microarray data are, first, should we normalize arrays to remove potential systematic biases, and if so, what normalization method should we use? Second, how should we then implement tests of statistical significance? Straightforward and uniform answers to these questions remain elusive. In this paper, we use a real data example to illustrate a practical approach to addressing these questions. Our data is taken from a DNA-protein binding microarray experiment aimed at furthering our understanding of transcription regulation mechanisms, one of the most important issues in biology. For the purpose of preprocessing data, we suggest looking at descriptive plots first to decide whether we need preliminary normalization and, if so, how this should be accomplished. For subsequent comparative inference, we recommend use of an empirical Bayes method (the B statistic), since it performs much better than traditional methods, such as the sample mean (M statistic) and Student's t statistic, and it is also relatively easy to compute and explain compared to the others. The false discovery rate (FDR) is used to evaluate the different methods, and our comparative results lend support to our above suggestions.",

keywords = "Background correction, Empirical Bayes methods, False discovery rate, Normalization, Significance testing, Spatial effects",

author = "Yang Xie and Jeong, {Kyeong S.} and Wei Pan and Khodursky, {Arkady B} and Brad Carlin",

year = "2004",

month = jul,

doi = "10.1002/cfg.416",

language = "English (US)",

volume = "5",

pages = "432--444",

journal = "Comparative and Functional Genomics",

issn = "1531-6912",

publisher = "Hindawi Publishing Corporation",

number = "5",

}

TY - JOUR

T1 - A case study on choosing normalization methods and test statistics for two-channel microarray data

AU - Xie, Yang

AU - Jeong, Kyeong S.

AU - Pan, Wei

AU - Khodursky, Arkady B

AU - Carlin, Brad

PY - 2004/7

Y1 - 2004/7

AB - DNA microarray analysis is a biological technology which permits the whole genome to be monitored simultaneously on a single slide. Microarray technology not only opens an exciting research area for biologists, but also provides significant new challenges to statisticians. Two very common questions in the analysis of microarray data are, first, should we normalize arrays to remove potential systematic biases, and if so, what normalization method should we use? Second, how should we then implement tests of statistical significance? Straightforward and uniform answers to these questions remain elusive. In this paper, we use a real data example to illustrate a practical approach to addressing these questions. Our data is taken from a DNA-protein binding microarray experiment aimed at furthering our understanding of transcription regulation mechanisms, one of the most important issues in biology. For the purpose of preprocessing data, we suggest looking at descriptive plots first to decide whether we need preliminary normalization and, if so, how this should be accomplished. For subsequent comparative inference, we recommend use of an empirical Bayes method (the B statistic), since it performs much better than traditional methods, such as the sample mean (M statistic) and Student's t statistic, and it is also relatively easy to compute and explain compared to the others. The false discovery rate (FDR) is used to evaluate the different methods, and our comparative results lend support to our above suggestions.

KW - Background correction

KW - Empirical Bayes methods

KW - False discovery rate

KW - Normalization

KW - Significance testing

KW - Spatial effects

UR - http://www.scopus.com/inward/record.url?scp=4444350754&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=4444350754&partnerID=8YFLogxK

U2 - 10.1002/cfg.416

DO - 10.1002/cfg.416

M3 - Article

C2 - 18629172

AN - SCOPUS:4444350754

SN - 1531-6912

VL - 5

SP - 432

EP - 444

JO - Comparative and Functional Genomics

JF - Comparative and Functional Genomics

IS - 5

ER -
