All about my experience at the Metis Data Science Bootcamp
The last few weeks of the program are heavily devoted to final project work, review, and preparing materials for the job hunt.
For my final project I decided to build a Quantified Self data aggregation platform. The portion I completed for career day was, in reality, more of a data engineering project with a sprinkle of modeling. I wanted to get some experience using cloud infrastructure and an ETL process manager. The project pulled data from a variety of sources, including Fitbit, Moves, Mint, Chrome, and social media sites, along with contextual data such as the weather and sunrise/sunset times. The goal of the system was to automate pulling this data, aggregate it, and prepare it for analysis in near real-time.
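To make the aggregation step a bit more concrete, here is a minimal sketch in Python of merging daily records from several sources into one row per day. It is not the project's actual code: the function name `aggregate_by_day`, the field names, and the sample values are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from datetime import date


def aggregate_by_day(sources):
    """Merge per-source daily records into one row per date.

    `sources` maps a source name to a list of (date, fields) records.
    Each field name is prefixed with its source name to avoid collisions.
    """
    merged = defaultdict(dict)
    for source_name, records in sources.items():
        for day, fields in records:
            for key, value in fields.items():
                merged[day][f"{source_name}_{key}"] = value
    # Sort by date so downstream analysis sees an ordered time series.
    return [{"date": day, **merged[day]} for day in sorted(merged)]


# Hypothetical sample data, not real measurements.
sample_sources = {
    "fitbit": [
        (date(2016, 9, 1), {"steps": 8421}),
        (date(2016, 9, 2), {"steps": 10250}),
    ],
    "weather": [
        (date(2016, 9, 1), {"high_f": 78}),
    ],
}

rows = aggregate_by_day(sample_sources)
```

Days that are missing a source simply lack that source's columns, which leaves the decision about how to fill gaps to the analysis stage.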
The architecture of the system included a Postgres data store on the backend, which holds all of the raw data. On the front end I built a Rails app that manages user information and hosts the UI. For a process manager I decided to use Airflow. Airflow is a tool created by the team at Airbnb. Just this past year it was picked up as a project by Apache and is now receiving more attention. The tool, though essentially a Flask app (which I've become quite familiar with), was a challenge to work with as the documentation is currently lacking. Still, I found its approach to be useful for the tasks I had to manage. Here is a brief 'How To' I put together to document how I set up Airflow for future reference.
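Separately from that write-up, here is a small sketch of the core idea a process manager like Airflow handles: running tasks in an order that respects their dependencies. This is plain Python using the standard library's `graphlib` (Python 3.9+), not Airflow's API and not my pipeline code. The task names and the `run_pipeline` helper are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_pipeline(tasks, dependencies):
    """Run each task exactly once, after every task it depends on.

    tasks: maps a task name to a zero-argument callable.
    dependencies: maps a task name to the set of names that must run first.
    Returns the order in which the tasks were executed.
    """
    order = list(TopologicalSorter(dependencies).static_order())
    for name in order:
        tasks[name]()
    return order


log = []  # records which tasks ran, in order

# Hypothetical tasks; real ones would call APIs and write to Postgres.
tasks = {
    "pull_fitbit": lambda: log.append("pull_fitbit"),
    "pull_weather": lambda: log.append("pull_weather"),
    "aggregate": lambda: log.append("aggregate"),
    "prepare_for_analysis": lambda: log.append("prepare_for_analysis"),
}

dependencies = {
    "pull_fitbit": set(),
    "pull_weather": set(),
    "aggregate": {"pull_fitbit", "pull_weather"},
    "prepare_for_analysis": {"aggregate"},
}

order = run_pipeline(tasks, dependencies)
```

A real scheduler adds a lot on top of this ordering, such as scheduling runs, retries, and a UI for monitoring, which is what made a dedicated tool worth the setup effort.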
I built my presentation for Metis Career Day in the form of a marketing website. This can be found here.