Reduce API calls and process time- or How I learned to pickle my quandl

I've written about Quandl before and the wealth of information they provide in an easy-to-process way. However, their generosity does have limits. While it's difficult to brush up against their daily call limits, I might as well do what I can to be respectful of their service and to improve my own efficiency.

Introducing another Python package to discuss: pickle. In short, pickle allows Python objects to be converted to byte streams and stored, then loaded again later. The concept behind this is serialization. It also allows objects to be transmitted over sockets and reproduced accurately, although the pickle protocol version matters: the reader must support the protocol the writer used. Pickle does not provide any form of security. Every pickle you load is assumed to be in working order and not tampered with, so only load pickles from sources you trust.
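As a minimal sketch of that round trip, the snippet below serializes an object to bytes and rebuilds it. The `rates` dictionary is a made-up stand-in for any picklable object, not data from Quandl.

```python
import pickle

# Hypothetical sample object standing in for any picklable Python value.
rates = {"series": "example", "values": [1.25, 1.5, 1.75]}

# Serialize to bytes using the newest protocol this interpreter supports.
payload = pickle.dumps(rates, protocol=pickle.HIGHEST_PROTOCOL)

# Deserialize. Only do this with bytes from a source you trust.
restored = pickle.loads(payload)

print(type(payload).__name__)  # bytes
print(restored == rates)       # True: same contents
print(restored is rates)       # False: a new, equal object
```

The restored object is an equal copy rather than the original, which is what makes storing it on disk and reloading it in a later run possible.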

Using pickle allows me to store my pandas DataFrames and reopen them from their binary files rather than make additional calls to Quandl. The most immediate benefit is speed. This program from a previous article can now run entirely offline.

Full Program here

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Effective Federal Funds Rate over time with economic recessions
@author: wolf
"""
import pickle
import quandl
import matplotlib.pyplot as plt
from matplotlib import style

style.use('ggplot')

quandl.ApiConfig.api_key = 'your-api-key-here'

#-----#Uncomment following block to pull and create updated dataframes

#Effective Federal Funds Rate
df2 = quandl.get("FED/RIFSPFF_N_WW", collapse="monthly", start_date="1955-01-01")
with open("EffFundsRate.pickle", 'wb') as output:
    pickle.dump(df2, output, pickle.HIGHEST_PROTOCOL)

#Recession, 1=recession
df4 = quandl.get("FRED/USRECP", start_date="1955-01-01")
with open("FREDrecession.pickle", 'wb') as output:
    pickle.dump(df4, output, pickle.HIGHEST_PROTOCOL)

#------#Uncomment following block to load stored dataframes

with open('EffFundsRate.pickle', 'rb') as pickle_in:
    df2 = pickle.load(pickle_in)

with open('FREDrecession.pickle', 'rb') as pickle_in:
    df4 = pickle.load(pickle_in)

#------#

df2['Value'].plot()

maxY = df2['Value'].max()
y2 = df4['Value']
plt.fill_between(df4.index, 0, maxY, where=y2 == 1, facecolor='blue', alpha=0.2)

plt.xlabel('Year')
plt.title('Federal Funds effective rate with recession times shaded')
plt.show()

Note: I swapped my unique API key for a placeholder. You can run this code by deleting that line, which accesses Quandl as an unregistered user, but unregistered users have lower call limits.

Key points

For now, choosing the source is a manual matter of commenting out one block or the other. Running with both blocks uncommented works, but it is a complete reset: it pulls fresh data and then overwrites the old pickle files.

To run a fresh pull from Quandl and write new pickle files, the first block is run:

#-----#Uncomment following block to pull and create updated dataframes

#Effective Federal Funds Rate
df2 = quandl.get("FED/RIFSPFF_N_WW", collapse="monthly", start_date="1955-01-01")
with open("EffFundsRate.pickle", 'wb') as output:
    pickle.dump(df2, output, pickle.HIGHEST_PROTOCOL)

#Recession, 1=recession
df4 = quandl.get("FRED/USRECP", start_date="1955-01-01")
with open("FREDrecession.pickle", 'wb') as output:
    pickle.dump(df4, output, pickle.HIGHEST_PROTOCOL)

Once this has been performed, it doesn't need to be rerun until the source data has updated. (In this case I am pulling weekly data, updated every Wednesday, and collapsing it by month.)

After commenting out the previous block, I can now uncomment the following:
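One way to judge whether a rerun is due is to look at the age of the pickle file itself. The sketch below is only an illustration: the `is_stale` helper and its seven-day default are assumptions chosen to match a weekly source, not part of the original program.

```python
import os
import time

def is_stale(path, max_age_days=7, now=None):
    """Return True if the file is missing or older than max_age_days.

    Hypothetical helper; the 7-day default is an assumption that suits
    a source updated once a week.
    """
    if not os.path.exists(path):
        return True
    now = time.time() if now is None else now
    age_seconds = now - os.path.getmtime(path)
    return age_seconds > max_age_days * 24 * 60 * 60

# Example: report whether the stored funds-rate pickle needs a fresh pull.
print(is_stale("EffFundsRate.pickle"))
```

A file's modification time only says when the pickle was written, not whether the source has actually changed, so this is a rough guide rather than a guarantee of fresh data.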

#------#Uncomment following block to load stored dataframes

with open('EffFundsRate.pickle', 'rb') as pickle_in:
    df2 = pickle.load(pickle_in)

with open('FREDrecession.pickle', 'rb') as pickle_in:
    df4 = pickle.load(pickle_in)

#------#

This block is now all I need to run to load my stored pickle data back into Python objects, exactly as originally stored.

Now I can enjoy fast 'free' data that is stored offline.
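The pull block and the load block could also be folded into a single helper, so that nothing needs commenting in or out. This is a sketch under assumptions: `load_or_fetch` and its `refresh` flag are hypothetical names, and `fetch` is any zero-argument function that returns the object to store.

```python
import os
import pickle

def load_or_fetch(path, fetch, refresh=False):
    """Load a pickled object from path, or call fetch() and store the result.

    Hypothetical helper: fetch is only called when the file is missing
    or when refresh=True.
    """
    if not refresh and os.path.exists(path):
        with open(path, 'rb') as pickle_in:
            return pickle.load(pickle_in)
    data = fetch()
    with open(path, 'wb') as output:
        pickle.dump(data, output, pickle.HIGHEST_PROTOCOL)
    return data

# Possible use with the program above (needs the quandl package, so it is
# left commented out here):
# df2 = load_or_fetch("EffFundsRate.pickle",
#                     lambda: quandl.get("FED/RIFSPFF_N_WW",
#                                        collapse="monthly",
#                                        start_date="1955-01-01"))
```

The first run pays for the API call and writes the file. Later runs read from disk unless `refresh=True` is passed.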

Improvements: The use cases I've seen all used this commenting/uncommenting strategy, but that is a choice only a programmer can make. To make this more user friendly, I could add a prompt at the start of the program, when run from a command line or window, asking, "Would you like to refresh to latest source data?" Otherwise you risk using outdated data when Quandl was chosen for its up-to-the-minute accuracy.
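A rough sketch of such a prompt follows. The function names are hypothetical, and treating anything other than "y" or "yes" as a "no" is an assumption made so that the offline data is the default.

```python
def wants_refresh(answer):
    """Interpret a reply; anything other than y/yes counts as no."""
    return answer.strip().lower() in ("y", "yes")

def ask_refresh(prompt_fn=input):
    """Ask the question and return True if the user wants fresh data.

    prompt_fn defaults to the built-in input() and can be swapped out
    for testing.
    """
    reply = prompt_fn("Would you like to refresh to latest source data? [y/N] ")
    return wants_refresh(reply)

# At the top of the program one might write:
# if ask_refresh():
#     ...run the block that pulls from Quandl and rewrites the pickles...
# else:
#     ...run the block that loads the stored pickles...
```

Defaulting to "no" keeps an accidental Enter keypress from spending an API call.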

Article: "Reduce API calls and process time- or How I learned to pickle my quandl" by Wolf, in Software
