
Project: Data Analysis of Wine Quality in Python

The Project

The project is part of the Udacity Data Analysis Nanodegree. The section of the course is a Case Study on wine quality, using the UCI Wine Quality Data Set: https://archive.ics.uci.edu/ml/datasets/Wine+Quality

The Case Study introduces several new concepts that we can apply to the data set to analyse its attributes and work out which properties of a wine correspond to a high quality rating.

I downloaded the data from the link above and imported it into Python, using a Jupyter Notebook to build the required report. A notebook lets us document and code in the same place, which is great for presenting findings and visualisations from the data.
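As a rough sketch of the import step: the UCI wine quality files are semicolon-delimited, so `read_csv` needs `sep=";"`. The snippet below uses a tiny inline sample instead of the real download so that it runs on its own. The variable names (`sample_csv`, `red_df`) and the three columns shown are illustrative choices, not taken from the notebook.

```python
import io

import pandas as pd

# Tiny stand-in for a downloaded file: same ';' delimiter and quoted headers,
# but only three columns and two made-up rows.
sample_csv = io.StringIO(
    '"fixed acidity";"pH";"quality"\n'
    "7.4;3.51;5\n"
    "7.8;3.20;6\n"
)

# sep=";" is the important part. With the default "," every row would be
# read into a single column.
red_df = pd.read_csv(sample_csv, sep=";")

print(red_df.shape)   # rows and columns
print(red_df.dtypes)  # column types inferred by pandas
```

With the real data, the `io.StringIO` object would be replaced by the path to the downloaded CSV file.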

I structured the project along the lines of the CRISP-DM method. That is, I: i. stated the objectives; ii. decided what questions to ask of the data; iii. carried out tasks to understand the data; iv. performed data wrangling and exploratory data analysis; and finally drew conclusions and answered the questions posed.

The PDF report written to communicate my project and findings can also be found here

What We Learned

  • Using hist() and plot() to build histograms
  • Using plotting.scatter_matrix and plot() to build scatter plots
  • Changing the figsize of a chart to make it more readable, and adding a ‘;’ to the end of the line to suppress unwanted text output
  • Appending data frames together in Pandas
  • Renaming data frame columns in Pandas
  • Using groupby and query in Pandas to aggregate and group selections of data
  • Creating bar charts in matplotlib and using Seaborn for better formatting
  • Adding appropriate labels, titles and colours
  • Engineering proportions in the data so that data sets can be compared more easily
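To make the list above more concrete, here is a minimal sketch of those techniques on a handful of made-up rows. It is not the code from the notebook: the data values, the `color` column, the renamed `fixed_acidity` column and all variable names are assumptions for illustration. `pd.concat` is used for the appending step because `DataFrame.append` has been removed from recent pandas versions, and Seaborn styling is left out to keep dependencies small.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows standing in for the red and white wine files.
red_df = pd.DataFrame({"fixed acidity": [7.4, 7.8, 11.2],
                       "pH": [3.51, 3.20, 3.16],
                       "quality": [5, 5, 6]})
white_df = pd.DataFrame({"fixed acidity": [7.0, 6.3, 8.1, 7.2],
                         "pH": [3.00, 3.30, 3.26, 3.19],
                         "quality": [6, 6, 7, 6]})

# Appending: label each frame, then stack them into one.
red_df["color"] = "red"
white_df["color"] = "white"
wine_df = pd.concat([red_df, white_df], ignore_index=True)

# Renaming: remove the space so the column is easier to reference.
wine_df = wine_df.rename(columns={"fixed acidity": "fixed_acidity"})

# groupby aggregates per group; query selects rows by a condition.
mean_quality = wine_df.groupby("color")["quality"].mean()
low_ph = wine_df.query("pH < 3.2")

# Proportions: count of each rating divided by that colour's total,
# so colours with different sample counts become comparable.
counts = wine_df.groupby(["color", "quality"]).size()
totals = wine_df.groupby("color").size()
proportions = counts.div(totals, level="color")

# Histogram with a larger figsize; in a notebook the trailing ';'
# suppresses the text repr that would otherwise be echoed.
wine_df["pH"].plot(kind="hist", figsize=(8, 6), title="pH distribution");

# Scatter matrix of the numeric columns.
pd.plotting.scatter_matrix(wine_df[["fixed_acidity", "pH", "quality"]],
                           figsize=(8, 8));

# Bar chart of the proportions with labels and a title.
fig, ax = plt.subplots(figsize=(8, 6))
proportions.unstack("color").fillna(0).plot(kind="bar", ax=ax)
ax.set_xlabel("Quality rating")
ax.set_ylabel("Proportion of samples")
ax.set_title("Proportion of wines by quality rating and colour")
plt.close("all")
```

In a Jupyter Notebook the `matplotlib.use("Agg")` and `plt.close("all")` lines can be dropped so that the charts render inline.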

The Code and the Report

  • GitHub repository for the data, SQL, PDF report and Jupyter Notebook
  • The PDF report can also be found here
