Summary
See below for information on the mini-projects from the first week.
Submitting your mini-project solution
If you participated in the mini-projects in the first week and have a solution you’d like to share (maybe it’s coded well, maybe it’s found something new, or maybe nobody has submitted a solution for that problem yet), we encourage you to submit it as a pull request to the master branch of the workshop-content repository.
Day 1: Regression
On the first day, Lee covered several regression exercises related to Isabell’s material in the morning. For this notebook, we used the red and white wine quality data sets. The Jupyter notebook is available here.
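As a rough illustration of the kind of exercise involved, here is a minimal sketch of an ordinary least-squares fit. It is not taken from the notebook: the data are synthetic stand-ins rather than the wine quality data sets, and the function name `fit_linear_regression` and all parameter values are assumptions made for this example.

```python
import numpy as np


def fit_linear_regression(X, y):
    """Ordinary least squares with an intercept; returns (intercept, coefficients)."""
    # Prepend a column of ones so the intercept is estimated with the slopes.
    design = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[0], beta[1:]


# Synthetic stand-in for a table of numeric features and a target score
# (hypothetical values, not the wine data).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_intercept = 5.0
true_coef = np.array([0.8, -1.2, 0.3])
y = true_intercept + X @ true_coef  # noise-free, so the fit recovers the parameters

intercept, coef = fit_linear_regression(X, y)
print("intercept:", intercept)
print("coefficients:", coef)
```

With real data the target would contain noise, so the recovered coefficients would only approximate any underlying relationship, and a held-out test set would be needed to judge the fit.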
Day 2: Matrix Completion
On Tuesday afternoon, Aaron took us through a notebook on matrix completion with theoretical background and exercises.
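For readers new to the topic, the following is a minimal sketch of one simple approach to matrix completion: repeatedly project onto rank-r matrices with a truncated SVD, then restore the observed entries. This is an illustrative technique chosen for brevity, not necessarily the method used in the notebook. The function name `complete_low_rank`, the matrix sizes, the rank, and the 70% observation rate are all assumptions.

```python
import numpy as np


def complete_low_rank(M_obs, mask, rank, n_iter=200):
    """Fill unobserved entries by alternating a rank-truncated SVD
    with re-imposing the observed values."""
    X = np.where(mask, M_obs, 0.0)  # start with zeros where nothing was observed
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        # Best rank-`rank` approximation of the current estimate.
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Keep observed entries fixed; take the low-rank guess elsewhere.
        X = np.where(mask, M_obs, low_rank)
    return X


def rel_error(A, truth):
    """Relative Frobenius-norm error of A against the ground truth."""
    return np.linalg.norm(A - truth) / np.linalg.norm(truth)


# Synthetic rank-2 matrix with roughly 70% of its entries observed.
rng = np.random.default_rng(0)
truth = rng.normal(size=(20, 2)) @ rng.normal(size=(2, 20))
mask = rng.random(truth.shape) < 0.7
observed = np.where(mask, truth, 0.0)

completed = complete_low_rank(observed, mask, rank=2)
print("zero-filled error:", rel_error(observed, truth))
print("completed error:  ", rel_error(completed, truth))
```

The sketch assumes the target rank is known in advance. In practice the rank is usually chosen by cross-validation or replaced by a nuclear-norm penalty.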
Day 3: Neural networks
On Wednesday afternoon, Aaron took us through a sequence of notebooks on neural networks, which built on the morning’s material covered by Isabell. The afternoon featured keras and tensorflow, convolutional networks, and the most elementary aspects of stream processing.
Day 4: Data wrangling
On Thursday afternoon, Roger Donaldson took us through his workflow in bash and GUI-based spreadsheet programs to understand new data.
Day 5: pyspark on AWS
On Friday afternoon, Aaron took us through a few examples in pyspark related to the morning’s material on Software Tools for Data Science. Answers for the notebook are available here; the notebook with unanswered exercises is available here. A quick example of using tensorflow on a GPU was also showcased that afternoon.