|
In living organisms, deoxyribonucleic acid (DNA) carries genetic information from one generation to the next and encodes proteins, which are the actual building blocks of cells and participate in most cellular processes. DNA molecules are composed of several different regions, and the regions that encode proteins are called genes. A DNA microarray is a collection of microscopic spots, each of which can detect a specific gene. Using DNA microarrays, researchers can now monitor the expression of thousands of genes simultaneously, enabling numerous discoveries that were previously out of reach. With the explosion of genomic data generated by high-throughput technologies such as DNA microarrays, many observe that biology is becoming an information science.
The process of detecting gene expression through DNA microarrays also provides ample opportunities for electrical engineers and computer scientists. For instance, photolithography techniques, which have been widely used in the semiconductor industry, are now used to manufacture certain types of DNA microarrays. Biological signals can be converted to electrical signals and then processed by analog-to-digital conversion (ADC) circuits. Biological signals also tend to be very noisy, so signal processing techniques developed in engineering are highly valuable in biological signal analysis. DNA molecules are sequences over a four-letter alphabet, and numerous discrete mathematics methodologies borrowed from the computer science community are frequently used to model biological systems and analyze empirically obtained signals.
In this survey, the fundamentals of DNA microarray technologies and analysis methods are reviewed. The first section presents a molecular biology primer for engineers and computer scientists, along with a brief introduction to two widely used DNA microarray technologies. The second section provides foundations for DNA microarray data analysis, covering several important issues in data preprocessing and statistical hypothesis testing. The third section describes various data analysis methods, with special emphasis on unsupervised clustering and supervised classification. The last section introduces a few applications of DNA microarrays in biomedicine.
|