{ "cells": [ { "cell_type": "markdown", "id": "bc6f073d-3176-4aac-9e59-171eafb0733b", "metadata": {}, "source": [ "# Lab 3-2: Rank-Sum Test Example\n", "---" ] }, { "cell_type": "markdown", "id": "02ac0f40-5029-49f4-a033-5a3e527df715", "metadata": {}, "source": [ "Note that the [scipy.stats.ranksums](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ranksums.html) function should only be used to compare two samples drawn from continuous distributions, where tied values are not expected. It does not handle ties between measurements, nor does it apply a continuity correction. For a rank-sum test that handles ties (as commonly occur in empirical data recorded at finite precision) and can apply a continuity correction, use [scipy.stats.mannwhitneyu](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.mannwhitneyu.html)." ] }, { "cell_type": "code", "execution_count": 1, "id": "e053f7b5-85f6-48ba-9c2d-0c754c210cb2", "metadata": {}, "outputs": [], "source": [ "# import libraries we'll need\n", "import pandas as pd\n", "import numpy as np\n", "import scipy.stats as stats\n", "import matplotlib.pyplot as plt\n", "%matplotlib inline" ] }, { "cell_type": "code", "execution_count": 2, "id": "ddc5986a-03e3-43a9-89bf-a278188319e5", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/opt/conda/lib/python3.10/site-packages/openpyxl/worksheet/_read_only.py:79: UserWarning: Unknown extension is not supported and will be removed\n", " for idx, row in parser.parse():\n" ] }, { "data": { "text/html": [ "
\n", " | date of peak | \n", "water year | \n", "peak value (cfs) | \n", "gage_ht (feet) | \n", "
---|---|---|---|---|
0 | \n", "1928-10-09 | \n", "1929 | \n", "18800 | \n", "10.55 | \n", "
1 | \n", "1930-02-05 | \n", "1930 | \n", "15800 | \n", "10.44 | \n", "
2 | \n", "1931-01-28 | \n", "1931 | \n", "35100 | \n", "14.08 | \n", "