Week 1: Mini-project Information

Summary

See below for information on the mini-projects from the first week.

Submitting your mini-project solution

If you participated in the first week’s mini-projects and have a solution to a problem that you’d like to share (maybe it’s coded well, maybe it found something new, or maybe nobody has submitted a solution for that problem yet), then we encourage you to submit that solution as a pull request to the master branch of the workshop-content repository.

Day 1: Regression

On the first day, Lee covered several regression exercises related to the material Isabell presented that morning. For this notebook, we used the red and white wine quality data sets. The Jupyter notebook is available here.

Day 2: Matrix Completion

On Tuesday afternoon, Aaron took us through a notebook on matrix completion with theoretical background and exercises.

Day 3: Neural networks

On Wednesday afternoon, Aaron took us through a sequence of notebooks on neural networks, which built on the material Isabell covered that morning. The afternoon featured Keras and TensorFlow, convolutional networks, and the most elementary aspects of stream processing.

Day 4: Data wrangling

On Thursday afternoon, Roger Donaldson took us through his workflow in bash and GUI-based spreadsheet programs to understand new data.

Day 5: pyspark on AWS

On Friday afternoon, Aaron took us through a few examples in pyspark related to the morning’s material on Software Tools for Data Science. Answers for the notebook are available here; the notebook with unanswered exercises is available here. A quick example of using TensorFlow on a GPU was also shown that afternoon.