Edit the Jupyter/IPython notebook file Plotting_Homework.ipynb to perform the requested tasks, showing all the code required. When finished, download the notebook as an HTML file and submit the following file to Sakai:
Plotting_Homework.html
Same instructions as before for log.txt: Every individual should separately submit a personal file log.txt. This should be a plain text file with exactly this name. Your homework grade will not be recorded without this file! Include in the file:
If you worked with a consistent partner through the whole assignment, only one of you needs to submit the HTML file, but make sure each partner has their own copy! Each student should separately submit an independently written log.txt file.
Within the ggplot module, a small DataFrame called mpg is included. Print out the first five rows of this DataFrame and the summary statistics describing the DataFrame.
A. Make a scatterplot comparing the displ variable (x-axis) to the hwy variable (y-axis). Label the x-axis "Engine displacement (liters)" and the y-axis "Highway miles per gallon".
B. Now make a new plot like the previous one, with the points colored according to the drv variable.
C. Now split up the plot above into mini-plots, one for each value of the class variable.
D. Describe 3 trends you see in the data. For example, how does highway mpg compare between the compact and suv classes? Or what type of wheel drive do pickups tend to have?
Make a boxplot of the cty variable according to class. Which class has the highest median city mpg?
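One way to double-check what the boxplot shows is to compute the median of each group directly. The sketch below uses only the standard library and a handful of made-up (class, cty) pairs; the numbers are invented for illustration and are not taken from mpg, so the result says nothing about the real answer. The names toy_rows, cty_by_class, medians and best_class are hypothetical.

```python
from collections import defaultdict
from statistics import median

# Made-up (class, cty) pairs for illustration only; NOT values from mpg.
toy_rows = [
    ("compact", 21), ("compact", 19), ("compact", 24),
    ("suv", 14), ("suv", 13), ("suv", 15),
    ("pickup", 12), ("pickup", 14),
]

# Collect the cty values belonging to each class.
cty_by_class = defaultdict(list)
for car_class, cty in toy_rows:
    cty_by_class[car_class].append(cty)

# Median per class, then the class whose median is largest.
medians = {car_class: median(values) for car_class, values in cty_by_class.items()}
best_class = max(medians, key=medians.get)

print(medians)
print(best_class)
```

The median per group corresponds to the center line that a boxplot draws for each box, so the two should agree when run on the same data.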
Using only the cars manufactured by Chevrolet, plot hwy (y-axis) vs. year (x-axis). Color the points according to model and make them visibly larger than the default setting.
Which model and year has the highest highway mpg?
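The filtering and lookup steps behind this question can be sketched without any plotting library. The rows below are invented placeholders, not mpg data, and the model names (model_a, model_b, model_c) and the maker other_maker are hypothetical. The sketch also assumes the manufacturer is stored as a lowercase string; check how it is actually spelled in your DataFrame.

```python
# Made-up rows for illustration only; NOT values from mpg.
toy_cars = [
    {"manufacturer": "chevrolet", "model": "model_a", "year": 1999, "hwy": 23},
    {"manufacturer": "chevrolet", "model": "model_b", "year": 2008, "hwy": 29},
    {"manufacturer": "chevrolet", "model": "model_a", "year": 2008, "hwy": 25},
    {"manufacturer": "other_maker", "model": "model_c", "year": 2008, "hwy": 35},
]

# Keep only one manufacturer (assumed lowercase spelling).
chevy = [car for car in toy_cars if car["manufacturer"] == "chevrolet"]

# Row with the largest hwy value among the filtered cars.
best = max(chevy, key=lambda car: car["hwy"])

print(len(chevy))
print(best["model"], best["year"], best["hwy"])
```

Filtering first matters here: without it, the other maker's row would win the comparison even though it should not be part of the plot.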
A. Make a multiple-histogram plot of hwy faceted by trans. Set the binwidth to 2.
B. Edit the plot you just made so that it only includes the 4 trans values with the most cars. Also use scale_color_brewer() or scale_color_manual() to change the color of each histogram from the default settings.
C. Now make a density plot rather than a histogram using the same data from part B. Also, change the colors to a new palette you haven't yet used in this assignment.
D. Based on this data, does automatic or manual transmission tend to get better highway gas mileage? Explain your reasoning.
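Parts B and D both depend on data preparation rather than plotting: keeping the 4 most common trans values, then grouping those values into automatic and manual. A minimal standard-library sketch of both steps follows. The rows are invented and the outcome is meaningless for the real dataset. The sketch assumes trans labels look like "auto(l4)" or "manual(m5)", with the transmission type before the parenthesis; verify this against your own data. The helper transmission_type is hypothetical.

```python
from collections import Counter, defaultdict
from statistics import mean

# Made-up (trans, hwy) pairs for illustration only; NOT values from mpg.
toy_rows = [
    ("auto(l4)", 20), ("auto(l4)", 22), ("auto(l4)", 18), ("auto(l4)", 24),
    ("manual(m5)", 26), ("manual(m5)", 28), ("manual(m5)", 30),
    ("auto(l5)", 19), ("auto(l5)", 21),
    ("manual(m6)", 25), ("manual(m6)", 27),
    ("auto(s6)", 23),
]

# Part B idea: count cars per trans value and keep the 4 most common.
counts = Counter(trans for trans, _ in toy_rows)
top4 = [label for label, _ in counts.most_common(4)]
kept = [(trans, hwy) for trans, hwy in toy_rows if trans in top4]


def transmission_type(label):
    """Return the text before '(' (assumed label format, e.g. 'auto(l4)' -> 'auto')."""
    return label.split("(")[0]


# Part D idea: compare hwy between the two transmission types.
hwy_by_type = defaultdict(list)
for trans, hwy in kept:
    hwy_by_type[transmission_type(trans)].append(hwy)

means = {kind: mean(values) for kind, values in hwy_by_type.items()}

print(sorted(top4))
print(means)
```

With a pandas DataFrame, the counting step can be done with the value_counts() method of a column instead of Counter. A comparison of means is only one possible line of reasoning for part D; the shape of the histograms or density curves is equally valid evidence.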