I’m beginning to do some scenario planning on what the data and statistical services offered by the library will look like in 10-15 years. As part of that activity, I’m compiling a list of articles, websites, and presentations that will help inform that perspective.
Many of these come from the article “Teaching the next generation of statistics students to ‘think with data’: special issue on statistics and the undergraduate curriculum” by Nicholas Horton and Johanna Hardin. That article has a nice section of key articles on statistics in the undergraduate curriculum, from which I’ve made some selections below.
Setting the stage for data science: integration of data management skills in introductory and second courses in statistics. Horton, Baumer, Wickham. 2015. (pdf)
Identifies 5 key elements that deserve greater emphasis in the undergrad curriculum:
- “Thinking creatively, but constructively, about data”…data cleaning, data storage
- working with data sets of varying sizes and understanding scalability issues…querying databases
- command-line skills. The authors mention R and Python; I would also include Unix. Command-driven environments “provide freedom from the un-reproducible point-and-click application paradigm”.
- “Experience wrestling with large, messy, complex, challenging data sets…these data are more similar to what analysts actually see in the wild.”
- “An ethos of reproducibility”
The article goes on to illustrate examples of using these 5 elements in coursework.
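To make a few of those elements concrete, here is a minimal sketch of my own (it is not taken from the article) of a scripted workflow in Python: it queries a small database, cleans messy values, and produces the same summary on every run. The table name `visits`, the column names, and the sample records are all hypothetical and made up purely for illustration.

```python
import sqlite3
from statistics import mean

# Hypothetical messy records (invented for illustration): inconsistent
# casing, stray whitespace, and one missing value.
RAW_ROWS = [
    ("  Biology ", "42"),
    ("biology", "38"),
    ("CHEMISTRY", ""),
    ("Chemistry ", "51"),
]


def load(conn, rows):
    """Create the hypothetical 'visits' table and insert the raw rows."""
    conn.execute("CREATE TABLE visits (dept TEXT, n_visits TEXT)")
    conn.executemany("INSERT INTO visits VALUES (?, ?)", rows)


def clean(rows):
    """Normalize department names and drop rows with a missing count."""
    cleaned = []
    for dept, n_visits in rows:
        if n_visits.strip() == "":
            continue  # skip records with no recorded count
        cleaned.append((dept.strip().lower(), int(n_visits)))
    return cleaned


def summarize(cleaned):
    """Return the mean count per department, keyed in sorted order."""
    by_dept = {}
    for dept, n_visits in cleaned:
        by_dept.setdefault(dept, []).append(n_visits)
    return {dept: mean(counts) for dept, counts in sorted(by_dept.items())}


def run():
    """Query the database, then clean and summarize the result."""
    conn = sqlite3.connect(":memory:")
    load(conn, RAW_ROWS)
    rows = conn.execute(
        "SELECT dept, n_visits FROM visits ORDER BY rowid"
    ).fetchall()
    conn.close()
    return summarize(clean(rows))


if __name__ == "__main__":
    print(run())
```

Because every step lives in the script rather than in a sequence of mouse clicks, anyone can rerun it and inspect exactly how the raw records became the summary, which is the point of the reproducibility ethos.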
Tidy Data – slides from a presentation by Wickham
Implications of the Data Revolution for Statistics Education. 2015. (pdf) Calls for more emphasis on big data, data visualization, and developing an “aesthetic for data handling and modeling based on solving practical problems”.