The Resource Beginning data science in R : data analysis, visualization, and modelling for the data scientist, Thomas Mailund
 Summary
 Discover best practices for data analysis and software development in R and start on the path to becoming a fully-fledged data scientist. This book teaches you techniques for both data manipulation and visualization and shows you the best way to develop new software packages for R. Data Science in R details how data science is a combination of statistics, computational science, and machine learning. You'll see how to efficiently structure and mine data to extract useful patterns and build mathematical models. This requires computational methods and programming, and R is an ideal programming language for this. This book is based on a number of lecture notes for classes the author has taught on data science and statistical programming using the R programming language. Modern data analysis requires computational skills and usually a minimum of programming. You will: perform data science and analytics using statistics and the R programming language; visualize and explore data, including working with large data sets found in big data; build an R package; test and check your code; practice version control; profile and optimize your code.
 Language
 eng
 Extent
 1 online resource
 Note
 Includes index
 Contents

 At a Glance; Contents; About the Author; About the Technical Reviewer; Acknowledgments; Introduction; Chapter 1: Introduction to R Programming; Basic Interaction with R; Using R as a Calculator; Simple Expressions; Assignments; Actually, All of the Above Are Vectors of Values ...; Indexing Vectors; Vectorized Expressions; Comments; Functions; Getting Documentation for Functions; Writing Your Own Functions; Vectorized Expressions and Functions; A Quick Look at Control Structures; Factors; Data Frames; Dealing with Missing Values; Using R Packages;
 Data Pipelines (or Pointless Programming); Writing Pipelines of Function Calls; Writing Functions that Work with Pipelines; The magical "." argument; Defining Functions Using .; Anonymous Functions; Other Pipeline Operations; Coding and Naming Conventions; Exercises; Mean of Positive Values; Root Mean Square Error; Chapter 2: Reproducible Analysis; Literate Programming and Integration of Workflow and Documentation; Creating an R Markdown/knitr Document in RStudio; The YAML Language; The Markdown Language; Formatting Text; Cross-Referencing; Bibliographies;
 Controlling the Output (Templates/Stylesheets); Running R Code in Markdown Documents; Using Chunks when Analyzing Data (Without Compiling Documents); Caching Results; Displaying Data; Exercises; Create an R Markdown Document; Produce Different Output; Add Caching; Chapter 3: Data Manipulation; Data Already in R; Quickly Reviewing Data; Reading Data; Examples of Reading and Formatting Datasets; Breast Cancer Dataset; Boston Housing Dataset; The readr Package; Manipulating Data with dplyr; Some Useful dplyr Functions; select(): Pick Selected Columns and Get Rid of the Rest;
 mutate(): Add Computed Values to Your Data Frame; transmute(): Add Computed Values to Your Data Frame and Get Rid of All Other Columns; arrange(): Reorder Your Data Frame by Sorting Columns; filter(): Pick Selected Rows and Get Rid of the Rest; group_by(): Split Your Data Into Subtables Based on Column Values; summarise/summarize(): Calculate Summary Statistics; Breast Cancer Data Manipulation; Tidying Data with tidyr; Exercises; Importing Data; Using dplyr; Using tidyr; Chapter 4: Visualizing Data; Basic Graphics; The Grammar of Graphics and the ggplot2 Package; Using qplot(); Using Geometries;
 Facets; Scaling; Themes and Other Graphics Transformations; Figures with Multiple Plots; Exercises; Chapter 5: Working with Large Datasets; Subsample Your Data Before You Analyze the Full Dataset; Running Out of Memory During Analysis; Too Large to Plot; Too Slow to Analyze; Too Large to Load; Exercises; Subsampling; Hex and 2D Density Plots; Chapter 6: Supervised Learning; Machine Learning; Supervised Learning; Regression versus Classification; Inference versus Prediction; Specifying Models; Linear Regression; Logistic Regression (Classification, Really); Model Matrices and Formula
 Isbn
 9781484226711
 Label
 Beginning data science in R : data analysis, visualization, and modelling for the data scientist
 Title
 Beginning data science in R
 Title remainder
 data analysis, visualization, and modelling for the data scientist
 Statement of responsibility
 Thomas Mailund
 Cataloging source
 N$T
 Dewey number
 001.42
 Index
 index present
 LC call number
 Q180.55.Q36
 Literary form
 non fiction
 Nature of contents
 dictionaries
 Antecedent source
 unknown
 http://library.link/vocab/branchCode

 net
 Carrier category
 online resource
 Carrier category code
 cr
 Carrier MARC source
 rdacarrier
 Color
 multicolored
 Content category
 text
 Content type code
 txt
 Content type MARC source
 rdacontent
 Control code
 ocn975486855
 Dimensions
 unknown
 File format
 unknown
 Form of item
 online
 Media category
 computer
 Media MARC source
 rdamedia
 Media type code
 c
 Other control number
 10.1007/9781484226711
 http://library.link/vocab/ext/overdrive/overdriveId
 cl0500000849
 Quality assurance targets
 unknown
 http://library.link/vocab/recordID
 .b37442557
 Sound
 unknown sound
 Specific material designation
 remote
 System control number

 (OCoLC)975486855
 safari1484226712