Plotting Homework

60 points

Edit the Jupyter/IPython notebook file Plotting_Homework.ipynb to perform the requested tasks, showing all required code. When finished, download the notebook as an HTML file and submit the following file to Sakai:

Plotting_Homework.html

The instructions for log.txt are the same as before: every student must separately submit a personal file named log.txt. This must be a plain text file with exactly this name. Your homework grade will not be recorded without this file! Include in the file:

  1. Roughly how long you worked on this assignment beyond class time.
  2. Briefly, how it went for you, for instance what was the hardest part to get ...
  3. Who your partner was if you had one, or "No partner".
  4. If you had a partner, give an indication of how things went with your partner. Was working together a good thing?
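
As a rough illustration of the four items above, the sketch below writes a log.txt from Python. Every answer shown is placeholder text, not a real response, and any plain-text editor works just as well.

```python
from pathlib import Path

# Placeholder answers only; replace each one with your own response.
entries = [
    "1. Time spent beyond class: <your estimate>",
    "2. How it went: <your summary>",
    "3. Partner: No partner",
    "4. Partnership notes: <your comments, if you had a partner>",
]

# The file must be plain text and named exactly log.txt.
log_path = Path("log.txt")
log_path.write_text("\n".join(entries) + "\n", encoding="utf-8")
```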

If you worked with a consistent partner through the whole assignment, only one of you needs to submit the HTML file, but make sure each partner has their own copy! Each student should separately submit an independently written log.txt file.

Problem 1. (6 pts)

The ggplot module includes a small DataFrame called mpg. Print out the first five rows of this DataFrame and the summary statistics describing it.

In [ ]:
 

Problem 2. (20 pts)

A. Make a scatterplot comparing the displ variable (x-axis) to the hwy variable (y-axis). Label the x-axis "Engine displacement (liters)" and the y-axis "Highway miles per gallon".

In [ ]:
 

B. Now make a new plot like the previous with the points colored according to the drv variable.

In [ ]:
 

C. Now split up the plot above into mini-plots, one for each value of the class variable.

In [ ]:
 

D. Describe 3 trends you see in the data. For example, how does highway mpg compare between the compact and suv classes? Or what drive type (drv) do pickups tend to have?

In [ ]:
 

Problem 3. (6 pts)

Make a boxplot of the cty variable according to class. Which class has the highest median city mpg?

In [ ]:
 

Problem 4. (8 pts)

Using only the cars manufactured by Chevrolet, plot hwy (y-axis) vs. year (x-axis). Color the points according to model and make them visibly larger than the default setting.

Which model and year has the highest highway mpg?

In [ ]:
 

Problem 5. (20 pts)

A. Make a multiple-histogram plot of hwy faceted by trans. Set the binwidth to 2.

In [ ]:
 

B. Edit the plot you just made so that it only includes the 4 values of trans (transmission types) with the most cars. Also use scale_color_brewer() or scale_color_manual() to change the color of each histogram from the default settings.
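
One possible way to pick out the most common categories before plotting is to count them with pandas and filter on the result. The sketch below uses a small made-up DataFrame (the name toy and all of its values are hypothetical, not the mpg data) and keeps the 2 most frequent trans values; the same pattern should carry over to 4.

```python
import pandas as pd

# Hypothetical toy data standing in for the real DataFrame.
toy = pd.DataFrame({
    "trans": ["auto(l4)", "manual(m5)", "auto(l4)", "auto(l5)",
              "manual(m5)", "auto(l4)", "manual(m6)"],
    "hwy": [26, 29, 25, 27, 31, 24, 28],
})

# value_counts() orders the categories from most to least frequent.
top_trans = toy["trans"].value_counts().head(2).index

# Keep only the rows whose trans value is among the most frequent ones.
subset = toy[toy["trans"].isin(top_trans)]
```

The filtered DataFrame can then be passed to the plotting call in place of the full one.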

In [ ]:
 

C. Now make a density plot rather than a histogram, using the same data as in part B. Also, change the colors to a new palette you haven't yet used in this assignment.

In [ ]:
 

D. Based on these data, do automatic or manual transmissions tend to get better highway gas mileage? Explain your reasoning.

In [ ]: