This book addresses the difficulties that wet lab researchers experience with the statistical analysis of molecular biology data. The authors explain how to use R and Bioconductor to analyze experimental data in molecular biology. The content is based on two university courses for bioinformatics and experimental biology students (Biological Data Analysis with R and High-throughput Data Analysis with R). The material is divided into chapters according to the experimental methods used in the laboratory.
Key features include:
* Broad appeal: the authors target their material to researchers at several levels, ensuring that the basics are always covered.
* First book to explain how to use R and Bioconductor for the analysis of several types of experimental data in the field of molecular biology.
* Focuses on R and Bioconductor, which are widely used for data analysis. One great benefit of R and Bioconductor is their vast user community, with very active discussion and an established practice of sharing code. Further, R is the platform on which new analysis approaches are implemented, so novel methods are available early to R users.
Authors
Csaba Ortutay is a bioinformatician from Finland who has taught several bioinformatics courses at different European universities (Finland, Ireland, and Hungary) for over a decade. He is also active as a researcher publishing in the field of computational immunology.
Zsuzsanna Ortutay is a molecular immunologist at the University of Tampere, Finland, frequently utilizing diverse molecular lab methods.
Contents
Foreword
Preface
Acknowledgements
About the Companion Website
1 Introduction to R statistical environment
Why R?
Installing R
Interacting with R
Graphical interfaces and integrated development environment (IDE) integration
Scripting and sourcing
The R history and the R environment file
Packages and package repositories
Comprehensive R Archive Network
Bioconductor
Working with data
Basic operations in R
Some basics of graphics in R
Getting help in R
Files for practicing
Study exercises and questions
References
Webliography
2 Simple sequence analysis
Sequence files
FASTA sequence format
GenBank flat file format
Reading sequence files into R
Obtaining sequences from remote databases
Seqinr package
Ape package
Descriptive statistics of nucleotide sequences
Descriptive statistics of proteins
Aligned sequences
Visualization of genes and transcripts in a professional way
Files for practicing
Study exercises and questions
References
Webliography
Packages
3 Annotating gene groups
Enrichment analysis: an overview
Overview of two different methods
Enrichment analysis results
Common aspects of the two different approaches
Overrepresentation analysis
Hypergeometric test using GOstats
ORA analysis using topGO
Enrichment analysis of microarray sets with topGO
Gene set enrichment analysis
GSEA with R
Files for practicing
Study exercises and questions
References
Webliography
Packages
4 Next-generation sequencing: introduction and genomic applications
High-throughput sequencing background
Experimental background
Single-end and paired-end sequencing reads
Assemble reads
How many reads? Depth of coverage
Storing data in files
FASTQ
SAM and BAM files
Variant call format files
General data analysis workflow
Data processing considerations
Quality checking and screening read sequences
Quality checking for one file
Quality inspection for multiple files in a project
Quality filtering of FASTQ files
Handling alignment files and genomic variants
Alignment and variation visualization
Simple handling of VCF files
Genomic applications: low- and medium-depth sequencing
Aneuploidity sequencing and copy number variation identification
SNP identification and validation
Exome sequencing
Genomic region resequencing
Full genome and metagenome sequencing
Files for practicing
Study exercises and questions
References
Webliography
Packages
5 Quantitative transcriptomics: qRT-PCR
Transcriptome
Polymerase chain reaction
Standards for qPCR
R packages
Understanding delta Ct
Calculation of delta Ct
Requirements for real delta Ct calculations
Absolute quantification
Value prediction, the professional way
Relative quantification using the ddCt method
Comparison of two conditions
...