This page demonstrates Python tips and tricks that I use in my everyday programming as an atmospheric science graduate student.
-Brian Blaylock

Tuesday, April 7, 2015

Bootstrapping in Python

Python code for regression significance testing

#------------------------------------------
#Bootstrap test for significance
#------------------------------------------
import numpy as np
import matplotlib.pyplot as plt

x = compositeL # some numpy vector
y = compositeF # another numpy vector

# we will fill this vector with the regression coefficients as we find them
b1_vector = np.zeros(5000)

#bootstrapping
for i in range(5000):
    #draw len(x) random indices with replacement
    index = np.random.randint(0,len(x),len(x))
    #pull out the (x,y) pair for each index
    sample_x = x[index]
    sample_y = y[index]
    #fit a line to the resampled pairs
    b_values = np.polyfit(sample_x,sample_y,1)
    #b1 (the slope) is the first element returned by polyfit
    b1 = b_values[0]
    #store the slope in b1_vector
    b1_vector[i] = b1

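#histogram of the bootstrapped regression coefficients (50 bins)
plt.figure(6)
plt.hist(b1_vector,50)
plt.xlabel('regression coefficient b1')
plt.ylabel('count')
plt.show()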


In this example there is 99.96% confidence that the regression coefficient is positive.
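
The code above does not show how a confidence figure like this is computed. One common way is to take the fraction of bootstrapped slopes that are greater than zero. The sketch below assumes that approach. It uses made-up data in place of compositeL and compositeF, and the names rng, n_boot, and fraction_positive are my own for illustration, so the number it prints will not match the one above.

```python
import numpy as np

# Hypothetical stand-in data; replace with your own vectors
# (e.g. compositeL and compositeF from the code above).
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 0.5 * x + rng.normal(size=200)

n_boot = 5000
b1_vector = np.zeros(n_boot)

for i in range(n_boot):
    # resample (x, y) pairs with replacement
    index = rng.integers(0, len(x), len(x))
    # the slope is the first coefficient returned by polyfit
    b1_vector[i] = np.polyfit(x[index], y[index], 1)[0]

# fraction of bootstrapped slopes that are positive
fraction_positive = np.mean(b1_vector > 0)
print("Percent of bootstrap slopes > 0: %.2f%%" % (100 * fraction_positive))
```

With 5000 resamples the smallest step in this percentage is 0.02%, so more resamples are needed if you want a finer estimate.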