Lab 7-2: Markov Chains - ENSO Phases
Download the data file for this lab, ENSO_to2024.csv, which contains a record of the El Niño Southern Oscillation (ENSO) phase from 1900-2024.
You can read more about ENSO here, and here.
Import the Python packages you'll need for this lab:
import pandas as pd
import numpy as np
import scipy.stats as stats
from scipy import sparse
import matplotlib.pyplot as plt
%matplotlib inline
Load the data file
df = pd.read_csv('../data/ENSO_to2022.csv', comment='#')
df.head(3)
| | Water Year | ENSO Phase | Unnamed: 2 |
|---|---|---|---|
| 0 | 1900 | 1 | NaN |
| 1 | 1901 | 2 | NaN |
| 2 | 1902 | 2 | NaN |
A. Using the time series of the phase of the El Niño Southern Oscillation (ENSO) from 1900-2022, create a lag-1 Markov model of the ENSO phase.
Observed Phases of ENSO:
1: warm (El Niño)
2: neutral (ENSO neutral)
3: cool (La Niña)
Count transitions between each of the three ENSO phases using scipy.sparse.csr_matrix() and then scipy.sparse.csr_matrix.todense().
# count the transitions from each state to the next
# convert transition counts to matrix form
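One way to fill in the two steps above is sketched here. The short `phases` array is a made-up stand-in for the 'ENSO Phase' column of `df`, and the variable names are illustrative assumptions. The sketch relies on `csr_matrix` summing duplicate (row, column) entries, so passing a 1 for every observed transition yields the transition counts.

```python
import numpy as np
from scipy import sparse

# Hypothetical short phase sequence (assumption); in the lab this would be
# something like df['ENSO Phase'].values
phases = np.array([1, 2, 2, 3, 3, 1, 2, 3, 2, 2])

# Pair each year's state with the following year's state.
# Phases are labeled 1-3, so subtract 1 to get 0-based matrix indices.
from_state = phases[:-1] - 1
to_state = phases[1:] - 1

# Duplicate (row, col) pairs are summed, which counts each transition type
ones = np.ones(len(from_state), dtype=int)
counts_sparse = sparse.csr_matrix((ones, (from_state, to_state)), shape=(3, 3))

# Convert the sparse counts to a dense 3x3 array
transition_counts = np.asarray(counts_sparse.todense())
print(transition_counts)
```

Row i of `transition_counts` holds the number of times phase i+1 was followed by each of the three phases.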
Normalize the transition matrix to get probabilities. This will create our lag-1 Markov Model.
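As a minimal sketch of the normalization, each row of counts can be divided by its row total. The counts below are invented for illustration and are not the ENSO results.

```python
import numpy as np

# Hypothetical transition counts (assumption, not the real ENSO counts);
# rows are the "from" phase, columns are the "to" phase
transition_counts = np.array([[10, 15, 5],
                              [12, 30, 18],
                              [6, 14, 10]], dtype=float)

# Total number of transitions out of each phase, kept as a column vector
# so the division broadcasts across each row
row_totals = transition_counts.sum(axis=1, keepdims=True)

# Each row now gives P(next phase | current phase)
transition_probs = transition_counts / row_totals
print(transition_probs)
```

If a phase never occurred as a "from" state, its row total would be zero and the division would fail, so that case would need separate handling.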
Compute cumulative sums along each row and check that the last value in every row equals 1. (We will use this CDF matrix below in a simulation of ENSO phases.)
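A possible way to build and check the cumulative matrix is shown here, using an invented probability matrix in place of the one computed from the data.

```python
import numpy as np

# Hypothetical transition probabilities (assumption); each row sums to 1
transition_probs = np.array([[0.3, 0.5, 0.2],
                             [0.2, 0.5, 0.3],
                             [0.2, 0.4, 0.4]])

# Cumulative sum across the columns of each row gives a CDF per "from" phase
transition_cdf = np.cumsum(transition_probs, axis=1)

# The last column should be 1 (within floating-point tolerance)
rows_ok = np.allclose(transition_cdf[:, -1], 1.0)
print(transition_cdf)
print("rows end at 1:", rows_ok)
```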
B. Using this Markov model and a random number generator, simulate 5,000 years of ENSO data.
# pick the number of years we want to simulate (5000)
# draw a uniform random number for each of the 5000 years
# start off in state 2, neutral
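The simulation steps above might look like the following sketch. The probability matrix, the seed, and the variable names are assumptions made for illustration. For each year, the uniform random number is compared against the CDF row of the previous year's phase, and the first column whose cumulative probability exceeds the draw becomes the new phase.

```python
import numpy as np

# Hypothetical transition probabilities (assumption, not the fitted model)
transition_probs = np.array([[0.3, 0.5, 0.2],
                             [0.2, 0.5, 0.3],
                             [0.2, 0.4, 0.4]])
transition_cdf = np.cumsum(transition_probs, axis=1)

# Number of years to simulate
n_years = 5000

# One uniform random number in [0, 1) per year (seed fixed for repeatability)
rng = np.random.default_rng(seed=0)
u = rng.uniform(size=n_years)

# Start off in state 2, neutral
simulated = np.zeros(n_years, dtype=int)
simulated[0] = 2

for t in range(1, n_years):
    # CDF row for last year's phase (phases are 1-3, rows are 0-2)
    cdf_row = transition_cdf[simulated[t - 1] - 1]
    # Index of the first CDF value greater than the random draw
    idx = np.searchsorted(cdf_row, u[t], side='right')
    # Guard against round-off leaving the last CDF value slightly below 1
    simulated[t] = min(idx, 2) + 1
```

Removing the fixed seed gives a different 5,000-year sequence on each run.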
C. Using this randomly generated data, answer the following questions.
According to the model, what is the probability that three warm ENSO years would occur in a row?
What is the large-sample probability that three cool ENSO years would happen in a row?
(Try regenerating the random numbers several times to increase the sample size if the condition never happens.)
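One way to estimate these probabilities from a simulated series is to count the overlapping three-year windows in which every year is in the phase of interest, then divide by the number of windows. The helper `prob_three_in_a_row` and the short `example` series are hypothetical and used only to show the counting.

```python
import numpy as np

def prob_three_in_a_row(series, state):
    """Fraction of overlapping 3-year windows in which every year equals `state`."""
    series = np.asarray(series)
    match = series == state
    # A window starting at year t counts if years t, t+1, and t+2 all match
    triples = match[:-2] & match[1:-1] & match[2:]
    return triples.sum() / len(triples)

# Hypothetical 10-year series (assumption) with 8 overlapping windows
example = np.array([1, 1, 1, 2, 3, 3, 3, 3, 2, 1])

p_warm = prob_three_in_a_row(example, 1)  # 1 of 8 windows
p_cool = prob_three_in_a_row(example, 3)  # 2 of 8 windows
print(p_warm, p_cool)
```

Applied to the 5,000-year simulation, the same function would be called with the simulated array and phase 1 (warm) or 3 (cool). Note that this definition counts overlapping windows, so a run of four cool years contributes two windows.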