Also in this series

Gregoris Mentzas, Dimitris Apostolou, Andreas Abecker and Ron Young
Knowledge Asset Management
1-85233-583-1

Michalis Vazirgiannis, Maria Halkidi and Dimitrios Gunopulos
Uncertainty Handling and Quality Assessment in Data Mining
1-85233-655-2

Asunción Gómez-Pérez, Mariano Fernández-López and Oscar Corcho
Ontological Engineering
1-85233-551-3

Arno Scharl (Ed.)
Environmental Online Communication
1-85233-783-4

Shichao Zhang, Chengqi Zhang and Xindong Wu
Knowledge Discovery in Multiple Databases
1-85233-703-6

Jason T. L. Wang, Mohammed J. Zaki, Hannu T. T. Toivonen and Dennis Shasha (Eds.)

Data Mining in Bioinformatics

With 110 Figures

Jason T. L. Wang, PhD
New Jersey Institute of Technology, USA

Mohammed J. Zaki, PhD
Computer Science Department, Rensselaer Polytechnic Institute, USA

Hannu T. T. Toivonen, PhD
University of Helsinki and Nokia Research Center

Dennis Shasha, PhD
New York University, USA

Series Editors
Xindong Wu
Lakhmi Jain

British Library Cataloguing in Publication Data
Data mining in bioinformatics. (Advanced information and knowledge processing)
1. Data mining 2. Bioinformatics - Data processing
I. Wang, Jason T. L.
ISBN 1852336714

Library of Congress Cataloging-in-Publication Data
A catalogue record for this book is available from the American Library of Congress.

Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms of licences issued by the Copyright Licensing Agency. Enquiries concerning reproduction outside those terms should be sent to the publishers.

AI&KP ISSN 1610-3947
ISBN 1-85233-671-4 Springer London Berlin Heidelberg
Springer Science+Business Media
springeronline.com

© Springer-Verlag London Limited 2005

The use of registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant laws and regulations and therefore free for general use.

The publisher makes no representation, express or implied, with regard to the accuracy of the information contained in this book and cannot accept any legal responsibility or liability for any errors or omissions that may be made.

Typesetting: Electronic text files prepared by authors
Printed and bound in the United States of America
34/3830-543210 Printed on acid-free paper SPIN 10886107

Contents

Contributors

Part I Overview

1 Introduction to Data Mining in Bioinformatics
  1.1 Background
  1.2 Organization of the Book
  1.3 Support on the Web

2 Survey of Biodata Analysis from a Data Mining Perspective
  2.1 Introduction
  2.2 Data Cleaning, Data Preprocessing, and Data Integration
  2.3 Exploration of Data Mining Tools for Biodata Analysis
  2.4 Discovery of Frequent Sequential and Structured Patterns
  2.5 Classification Methods
  2.6 Cluster Analysis Methods
  2.7 Computational Modeling of Biological Networks
  2.8 Data Visualization and Visual Data Mining
  2.9 Emerging Frontiers
  2.10 Conclusions

Part II Sequence and Structure Alignment

3 AntiClustAl: Multiple Sequence Alignment by Antipole Clustering
  3.1 Introduction
  3.2 Related Work
  3.3 Antipole Tree Data Structure for Clustering
  3.4 AntiClustAl: Multiple Sequence Alignment via Antipoles
  3.5 Comparing ClustalW and AntiClustAl
  3.6 Case Study
  3.7 Conclusions
  3.8 Future Developments and Research Problems

4 RNA Structure Comparison and Alignment
  4.1 Introduction
  4.2 RNA Structure Comparison and Alignment Models
  4.3 Hardness Results
  4.4 Algorithms for RNA Secondary Structure Comparison
  4.5 Algorithms for RNA Structure Alignment
  4.6 Some Experimental Results

Part III Biological Data Mining

5 Piecewise Constant Modeling of Sequential Data Using Reversible Jump Markov Chain Monte Carlo
  5.1 Introduction
  5.2 Bayesian Approach and MCMC Methods
  5.3 Examples
  5.4 Concluding Remarks

6 Gene Mapping by Pattern Discovery
  6.1 Introduction
  6.2 Gene Mapping
  6.3 Haplotype Patterns as a Basis for Gene Mapping
  6.4 Instances of the Generalized Algorithm
  6.5 Related Work
  6.6 Discussion

7 Predicting Protein Folding Pathways
  7.1 Introduction
  7.2 Preliminaries
  7.3 Predicting Folding Pathways
  7.4 Pathways for Other Proteins
  7.5 Conclusions

8 Data Mining Methods for a Systematics of Protein Subcellular Location
  8.1 Introduction
  8.2 Methods
  8.3 Conclusion

9 Mining Chemical Compounds
  9.1 Introduction
  9.2 Background
  9.3 Related Research
  9.4 Classification Based on Frequent Subgraphs
  9.5 Experimental Evaluation
  9.6 Conclusions and Directions for Future Research

Part IV Biological Data Management

10 Phyloinformatics: Toward a Phylogenetic Database
  10.1 Introduction
  10.2 What Is a Phylogenetic Database For?
  10.3 Taxonomy
  10.4 Tree Space
  10.5 Synthesizing Bigger Trees
  10.6 Visualizing Large Trees
  10.7 Phylogenetic Queries
  10.8 Implementation
  10.9 Prospects and Research Problems

11 Declarative and Efficient Querying on Protein Secondary Structures
  11.1 Introduction
  11.2 Protein Format
  11.3 Query Language and Sample Queries
  11.4 Query Evaluation Techniques
  11.5 Query Optimizer and Estimation
  11.6 Experimental Evaluation and Application of Periscope/PS2
  11.7 Conclusions and Future Work

12 Scalable Index Structures for Biological Data
  12.1 Introduction
  12.2 Index Structure for Sequences
  12.3 Indexing Protein Structures
  12.4 Comparative and Integrative Analysis of Pathways
  12.5 Conclusion

Glossary
References
Biographies

Contributors

Peter Bajcsy, Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, USA
Deb Bardhan, Department of Computer Science, Rensselaer Polytechnic Institute, USA
Chris Bystroff, Department of Biology, Rensselaer Polytechnic Institute, USA
Mukund Deshpande, Oracle Corporation, USA
Cinzia Di Pietro, School of Medicine, University of Catania, Italy
Alfredo Ferro, Department of Mathematics and Computer Science, University of Catania, Italy
Laurie Jane Hammel, Department of Defense, USA
Jiawei Han, Department of Computer Science, University of Illinois at Urbana-Champaign, USA
Kai Huang, Department of Biological Sciences, Carnegie Mellon University, USA
Donald P. Huddler, Biophysics Research Division, University of Michigan, USA
George Karypis, Department of Computer Science and Engineering, University of Minnesota, USA
Michihiro Kuramochi, Department of Computer Science and Engineering, University of Minnesota, USA
Lei Liu, Center for Comparative and Functional Genomics, University of Illinois at Urbana-Champaign, USA
Heikki Mannila, Department of Computer Science, Helsinki University of Technology, Finland
Robert F. Murphy, Departments of Biological Sciences and Biomedical Engineering, Carnegie Mellon University, USA
Vinay Nadimpally, Department of Computer Science, Rensselaer Polytechnic Institute, USA
Päivi Onkamo, Department of Computer Science, University of Helsinki, Finland
Roderic D. M. Page, Division of Environmental and Evolutionary Biology, Institute of Biomedical and Life Sciences, University of Glasgow, United Kingdom
Jignesh M. Patel, Electrical Engineering and Computer Science Department, University of Michigan, USA
Giuseppe Pigola, Department of Mathematics and Computer Science, University of Catania, Italy
Alfredo Pulvirenti, Department of Mathematics and Computer Science, University of Catania, Italy
Michele Purrello, School of Medicine, University of Catania, Italy
Marco Ragusa, School of Medicine, University of Catania, Italy
Marko Salmenkivi, Department of Computer Science, University of Helsinki, Finland
Petteri Sevon, Department of Computer Science, University of Helsinki, Finland
Dennis Shasha, Courant Institute of Mathematical Sciences, New York University, USA
Ambuj K. Singh, Department of Computer Science, University of California at Santa Barbara, USA
Hannu T. T. Toivonen, Department of Computer Science, University of Helsinki, Finland
Jason T. L. Wang, Department of Computer Science, New Jersey Institute of Technology, USA
Jiong Yang, Department of Computer Science, University of Illinois at Urbana-Champaign, USA
Mohammed J. Zaki, Department of Computer Science, Rensselaer Polytechnic Institute, USA
Kaizhong Zhang, Department of Computer Science, University of Western Ontario, Canada

1 Introduction to Data Mining in Bioinformatics

Jason T. L. Wang, Mohammed J. Zaki, Hannu T. T. Toivonen, and Dennis Shasha

The aim of this book is to introduce the reader to some of the best techniques for data mining in bioinformatics in the hope that the reader will build on them to make new discoveries on his or her own. The book contains twelve chapters in four parts, namely, overview, sequence and structure alignment, biological data mining, and biological data management. This chapter provides an introduction to the field and describes how the chapters in the book relate to one another.

1.1 Background

Bioinformatics is the science of managing, mining, integrating, and interpreting information from biological data at the genomic, metabolomic, proteomic, phylogenetic, cellular, or whole organism levels. The need for bioinformatics tools and expertise has increased as genome sequencing projects have resulted in an exponential growth in complete and partial sequence databases. Even more data and complexity will result from the interaction among genes that gives rise to multiprotein functionality. Assembling the tree of life is intended to construct the phylogeny for the 1.7 million known species on earth. These and other projects require the development of new ways to interpret the flood of biological data that exists today and that is anticipated in the future.

Data mining, or knowledge discovery from data (KDD), in its most fundamental form is the extraction of interesting, nontrivial, implicit, previously unknown, and potentially useful information from data [165]. In bioinformatics, this process could refer to finding motifs in sequences to predict folding patterns, to discover genetic mechanisms underlying a disease, to summarize clustering rules for multiple DNA or protein sequences, and so on. With the substantial growth of biological data, KDD will play a significant role in analyzing the data and in solving emerging problems.

The aim of this book is to introduce the reader to some of the best techniques for data mining in bioinformatics (BIOKDD) in the hope that the reader will build on them to make new discoveries on his or her own. This introductory chapter provides an overview of the work and how the chapters in the book relate to one another. We hope the reader finds the book and the chapters as fascinating to read as we have found them to write.

1.2 Organization of the Book

This book is divided into four parts:

I. Overview
II. Sequence and Structure Alignment
III. Biological Data Mining
IV. Biological Data Management

Part I presents a primer on data mining for bioinformatics. Part II presents algorithms for sequence and structure alignment, which are crucial to effective biological data mining and information retrieval. Part III consists of chapters dedicated to biological data mining, with topics ranging from genome modeling and gene mapping to protein and chemical mining. Part IV addresses closely related subjects, focusing on querying and indexing methods for biological data. Efficient indexing techniques can accelerate a mining process, thereby enhancing its overall performance. Table 1.1 summarizes the main theme of each chapter and the category it belongs to.

1.2.1 Part I: Basics

In chapter 2, Peter Bajcsy, Jiawei Han, Lei Liu, and Jiong Yang review data mining methods for biological data analysis. The authors first present methods for data cleaning, data preprocessing, and data integration. Next, they show the applicability of data mining tools to the analysis of sequence, genome, structure, pathway, and microarray gene expression data. They then present techniques for the discovery of frequent sequence and structure patterns. The authors also review methods for classification and clustering in the context of microarrays and sequences and present approaches for the computational modeling of biological networks. Finally, they highlight visual data mining methods and conclude with a discussion of new research issues such as text mining and systems biology.

Table 1.1. Main theme addressed in each chapter.

Part I. Overview
  Chapter 1  Introduction
  Chapter 2  Survey
Part II. Sequence and Structure Alignment
