TY - THES
ID - 2991316
TI - Gibbs sampling on bayesian models for biclustering microarray data
AU - Sheng, Qizheng
AU - Katholieke Universiteit Leuven
PY - 2005
SN - 9056826565
PB - Heverlee Katholieke Universiteit Leuven. Faculteit Ingenieurswetenschappen
DB - UniCat
KW - 519.24 <043>
KW - Academic collection
KW - Special statistical applications and models--Dissertaties
KW - Theses
UR - https://www.unicat.be/uniCat?func=search&query=sysid:2991316
AB - Biclustering of microarray data is gaining increasing attention from researchers in both systems biology and systems biomedicine. For systems biology, biclustering algorithms have an advantage over conventional clustering methods: they discover genes that are coexpressed in a subset of the measured conditions rather than in all of them. Since the emergence of web-based repositories of microarray data such as ArrayExpress and GEO, analysis based on microarray compendia, where gene expression levels are measured under a large number of heterogeneous conditions, has become more and more popular. Biclustering suits this type of analysis, especially for the discovery of transcriptional modules, which provide essential clues for revealing genetic networks. For systems biomedicine, biclustering addresses the other orientation of microarray data: clustering experiments (e.g., tumor samples) based on a subset of genes for each of which the experiments show consistent expression levels. The pattern of the target bicluster provides a gene expression fingerprint for the classification of the experiments. Therefore, the bicluster can help to reveal genes that are important for the pathology. In this thesis, we propose a biclustering strategy based on Bayesian modeling of microarray data and Gibbs sampling for the parameterization of the model.
Bayesian models give our method the advantage of incorporating prior knowledge, so that the resulting bicluster can be directed towards answering the specific questions of the biologist, such as "what are the genes that are involved in this particular function, and what are the working conditions of the function?" In addition, Bayesian models provide the basis for integrating information extracted from other data sources. Research in bioinformatics has seen growing awareness that data from different sources should not be studied in isolation. This awareness calls for tools that allow such integration to take place. Because of the high complexity of the biological process underlying a microarray data set, optimization methods for clustering microarray data often become trapped in local maxima. The corresponding clusters are often not interesting for biologists, or give an incomplete answer. Gibbs sampling is known for its ability to increase the probability of discovering the global maximum. We consider this a favorable property for the study of microarray data. We provide several case studies to illustrate the efficiency of our strategy.
ER -