{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Graphical Data Analysis\n", "\n", "When you get a set of measurements, ask yourself:\n", "- What do you want to learn from this data?\n", "- What is your hypothesis, and what would it look like if the data supports or does not support your hypothesis?\n", "- **Plot your data!** (and always label your plots clearly)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Reporting of Numbers\n", "\n", "- Keep track of **units**, and always report units with your numbers!\n", "  - Make sure to check metadata about how the measurements were made\n", "- Significant figures\n", "  - From our snow depth example last week:\n", "    - Should I report a snow depth value of 20.3521 cm?\n", "    - Should I report a snow depth value of 2035 mm?\n", "    - Should I report a snow depth value of 20.0000 cm?\n", "  - Consider the certainty with which you know a value. Don't report more precision than that.\n", "  - Note on rounding errors: let the computer keep full precision for intermediate calculations, and round to significant figures only for the final result that you report\n", " \n", "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To start, we will import some Python packages:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# numpy has a lot of math and statistics functions we'll need to use\n", "import numpy as np\n", "\n", "# pandas gives us a way to work with and plot tabular datasets easily (called \"dataframes\")\n", "import pandas as pd\n", "\n", "# we'll use matplotlib for plotting here (it works behind the scenes in pandas)\n", "import matplotlib.pyplot as plt\n", "\n", "# tell jupyter to make our plots \"inline\" in the notebook\n", "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Why are you plotting?\n", "\n", "**You have an application in mind with your 
data.** This application should inform your choice of analysis technique and what you want to plot and visualize." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Open our [file](https://mountain-hydrology-research-group.github.io/data-analysis/_downloads/55754612f1cf8f2d340fa84ba0f399b4/my_data.csv) using the pandas [read_csv function](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html)." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "# Use the pandas.read_csv() function to open this file.\n", "# This stores the data in a \"DataFrame\"\n", "my_data = pd.read_csv('my_data.csv')" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
|   | time | tair_max | tair_min | cumulative_precip |
|---|---|---|---|---|
| 0 | 1920-12-31 | 20.455167 | -9.901765 | 102.502512 |
| 1 | 1921-12-31 | 20.119887 | -10.364254 | 97.108113 |
| 2 | 1922-12-31 | 19.872675 | -10.313181 | 97.166797 |
| 3 | 1923-12-31 | 20.449070 | -11.359639 | 97.902843 |
| 4 | 1924-12-31 | 20.449110 | -10.046539 | 99.329978 |