End Of Course Reflections

I am wrapping up ST558 at NCSU, a course that covers advanced R programming topics such as parallelization, Shiny, and containerization. I started this blog for the course, and this reflection post is my final assignment.

  • What has changed about what I think a data scientist is and does?
    • About a month ago, I wrote a blog post describing my views on data science, and those views have not changed. Data science, to me, is applied statistics in industry. The job has non-statistical duties (data collection and wrangling, possibly some software development), but its core is making data-based inferences and predictions, and that is the domain of statistics.
  • What are my current thoughts in terms of using R for data science and will I continue to use R going forward?
    • I used to think of R mainly as a language for prototyping analyses and models, but I will be using it for much more going forward. I still think R’s speed is limiting and that other languages are better for production code, but R deserves more credit than I gave it previously. Between R Markdown, Shiny, and ggplot2, R may be the best tool for communicating data analyses.
  • What things will I do differently in practice after this course?
    • I started programming in 2016 with MATLAB and have been using Python for just as long, so I was already used to good programming practices, functional programming, and so on. I will change some practices, though. Going forward, I will use automated R Markdown reporting and Shiny to scale and share analyses. I have already started using R’s optimized looping functions, vectorization, and parallelization to speed up operations that are amenable to those techniques.
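As a minimal sketch of that last point, the snippet below computes the same result four ways: an explicit loop, an apply-family call, a vectorized call, and a parallel call. The data, variable names, and worker count are hypothetical, and reading "optimized looping functions" as the apply family is an assumption.

```r
library(parallel)  # ships with base R

# Hypothetical input: a plain numeric vector.
x <- seq_len(10000)

# 1. Explicit loop, writing into a preallocated result vector.
loop_result <- numeric(length(x))
for (i in seq_along(x)) {
  loop_result[i] <- sqrt(x[i])
}

# 2. Apply-family call: vapply() also checks that each result
#    is a single numeric value.
apply_result <- vapply(x, sqrt, numeric(1))

# 3. Vectorized call: sqrt() works on the whole vector at once.
vector_result <- sqrt(x)

# 4. Parallel call: split the work across two local worker processes
#    (the worker count is an arbitrary choice for illustration).
cl <- makeCluster(2)
parallel_result <- unlist(parLapply(cl, x, sqrt))
stopCluster(cl)
```

For an operation this cheap, the vectorized call is likely the fastest and the parallel version may well be slower because of the cost of starting workers and moving data. Parallelization tends to pay off only when each unit of work is expensive.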
Written on August 3, 2021