Prof. Hong Yan
Department of Electronic Engineering, City University of Hong Kong
Detection of Coherent Patterns in Multidimensional Data
An important problem in "big data" analysis is to detect and classify meaningful patterns. Traditional clustering algorithms can classify data along either the feature direction or the object direction. However, if a coherent pattern embedded in the data involves only a subset of features and a subset of objects, then biclustering analysis is needed, which is often more complicated than clustering. The problem is even more challenging when the data dimensionality is high. For example, in gene expression data, we may be interested in extracting a subset of genes that co-express under a subset of conditions at a subset of time points. In consumer data analysis, we may want to find a subset of consumers who like a subset of products in a subset of locations. In both cases, we need to analyze three-dimensional data arrays, that is, perform triclustering. Recently, we have discovered that a class of coherent patterns in multidimensional data can be represented as hyperplanes in singular vector spaces. By decomposing a data array into singular vector matrices, we can then deal with pattern coherence in each direction separately. We have applied our coherent pattern detection algorithms to genomic data analysis, disease diagnosis and drug therapeutic effect assessment. Our method can also be useful for many other real-world data mining and pattern recognition applications.
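The link between coherent patterns and hyperplanes in singular vector space can be illustrated on a toy case. The sketch below is not the speaker's algorithm; it assumes an idealized, noise-free additive pattern a_ij = r_i + c_j, and the function names (`additive_block`, `line_fit_residual`) are hypothetical. Such a block has rank 2 and its column space contains the all-ones vector, so the rows of its first two left singular vectors lie on a single line, which is a hyperplane in that two-dimensional space.

```python
import numpy as np


def additive_block(row_offsets, col_offsets):
    """Build an idealized additive coherent pattern a_ij = r_i + c_j."""
    r = np.asarray(row_offsets, dtype=float)
    c = np.asarray(col_offsets, dtype=float)
    return r[:, None] + c[None, :]


def line_fit_residual(points):
    """Largest deviation of the points from the best-fitting hyperplane w . x = 1."""
    ones = np.ones(points.shape[0])
    w, *_ = np.linalg.lstsq(points, ones, rcond=None)
    return float(np.max(np.abs(points @ w - ones)))


rng = np.random.default_rng(0)

# Toy data (assumed for illustration): 8 objects by 6 features, purely additive.
block = additive_block(rng.normal(size=8), rng.normal(size=6))

# Decompose the block into singular vector matrices.
U, s, Vt = np.linalg.svd(block, full_matrices=False)
rank = int(np.sum(s > 1e-10 * s[0]))

# Each object becomes a point in the space of the first two left singular vectors.
coords = U[:, :2]
residual = line_fit_residual(coords)

print("numerical rank:", rank)
print("max deviation from the fitted line:", residual)
```

For real data with noise and background entries, the deviation would not be exactly zero, and a robust plane or line detector would be needed in place of the least-squares fit used here.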
Hong Yan received his PhD degree from Yale University. He was Professor of Imaging Science at the University of Sydney and is currently Professor of Computer Engineering at City University of Hong Kong. His research interests include image processing, pattern recognition and bioinformatics, and he has over 300 journal and conference publications in these areas. Professor Yan was elected an IAPR fellow for contributions to document image analysis and an IEEE fellow for contributions to image recognition techniques and applications. He is currently an IEEE Distinguished Lecturer. Professor Yan's group works on biomedical imaging and genomic data analysis. The group has developed advanced techniques, based on signal processing and pattern recognition, for DNA microarray data restoration, biclustering analysis and classification, and the prediction of protein-ligand, protein-DNA and protein-protein interactions. These methods have many useful applications in disease diagnosis, drug design and drug therapeutic effect assessment.