In this example, we create an XGBoost model that forecasts total monthly vehicle sales in the USA.
Our data comes from Federal Reserve Economic Data (FRED), and
we retrain the model at the end of each month. Lastly, we backtest the model
and visualize its predictions and rolling error from 2010 to the present day.
Source
We use FRED's API to fetch historic federal funds rate, unemployment rate, CPI, and vehicle sales data.
You can obtain an API key by signing up for a free account
on the FRED website. Then, we combine this data into a single OracleDataFrame,
which lets us fetch historical data on the fly during the backtest.
Define a function to fetch and prepare FRED data for a given series_id.
import datetime
from io import BytesIO

import pandas as pd
import requests
from anterior.source import OracleDataFrame

_api_key = ...  # Your API key here


def fetch_and_prepare_data(sid):
    url = f"https://api.stlouisfed.org/fred/series/observations?file_type=json" \
          f"&api_key={_api_key}&series_id={sid}"
    df = pd.read_json(BytesIO(bytes(requests.get(url).text, 'utf-8')), typ='series')["observations"]
    df = pd.DataFrame(df)
    df['date'] = pd.to_datetime(df['date'])
    df.set_index('date', inplace=True)
    df[sid] = pd.to_numeric(df['value'], errors='coerce')
    return df[[sid]]


series_ids = ["FEDFUNDS", "UNRATE", "CPIAUCSL", "TOTALSA"]
df = pd.concat([fetch_and_prepare_data(sid) for sid in series_ids], axis=1, join='inner')
df['month'], df['year'] = df.index.month, df.index.year
data = OracleDataFrame(df)
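To see what the parsing and joining steps do without calling the API, here is a minimal sketch that applies the same pandas calls to two small hand-written payloads. The series ids `SERIES_A` and `SERIES_B`, the helper `prepare`, and all values are hypothetical and for illustration only. We also assume that a missing observation arrives as the string ".", which `errors='coerce'` turns into NaN.

```python
import pandas as pd

# Hand-written stand-ins for two API responses; ids and values are illustrative only.
payloads = {
    "SERIES_A": {"observations": [
        {"date": "2020-01-01", "value": "1.5"},
        {"date": "2020-02-01", "value": "."},   # assumed placeholder for a missing value
        {"date": "2020-03-01", "value": "0.7"},
    ]},
    "SERIES_B": {"observations": [
        {"date": "2020-02-01", "value": "3.5"},
        {"date": "2020-03-01", "value": "4.4"},
        {"date": "2020-04-01", "value": "14.7"},
    ]},
}


def prepare(sid, payload):
    """Turn one payload into a single-column frame indexed by date."""
    df = pd.DataFrame(payload["observations"])
    df["date"] = pd.to_datetime(df["date"])
    df = df.set_index("date")
    # Non-numeric strings such as "." become NaN instead of raising.
    df[sid] = pd.to_numeric(df["value"], errors="coerce")
    return df[[sid]]


frames = [prepare(sid, payload) for sid, payload in payloads.items()]
# join='inner' keeps only the dates present in every series.
combined = pd.concat(frames, axis=1, join="inner")
combined["month"], combined["year"] = combined.index.month, combined.index.year
print(combined)
```

Only the two dates shared by both series survive the inner join. The coerced NaN in `SERIES_A` is still present afterwards, so any rows with missing values would need to be handled separately before training.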
Warp
First, we declare our XGBoost model and initial variables. Then, we define a function to train the model with the latest data and log the model's performance, predictions and actual targets.
Lastly, using anterior's BackTester, we schedule the training function to run every month and backtest the model's
predictions and rolling error from 2010 to the present day.
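Independent of anterior's API, the retrain-then-predict pattern described here can be sketched in plain Python. This is only an illustration of walk-forward backtesting under simplifying assumptions: it does not use anterior's BackTester or XGBoost, the "model" is a stand-in that predicts the mean of the last few observations, the names `fit_last_k_mean` and `walk_forward` are hypothetical, and the series is synthetic.

```python
from statistics import mean


def fit_last_k_mean(history, k=3):
    """Stand-in for model training: 'learns' the mean of the last k observations."""
    level = mean(history[-k:])
    return lambda: level


def walk_forward(series, start, k=3, error_window=3):
    """At each step, retrain on past data only, predict the next value, log the error."""
    predictions, actuals, rolling_mae = [], [], []
    for t in range(start, len(series)):
        model = fit_last_k_mean(series[:t], k)  # series[:t] excludes the target at t
        predictions.append(model())
        actuals.append(series[t])
        # Mean absolute error over the most recent `error_window` predictions.
        recent = [abs(p - a) for p, a in
                  zip(predictions[-error_window:], actuals[-error_window:])]
        rolling_mae.append(mean(recent))
    return predictions, actuals, rolling_mae


monthly_sales = [10.0, 12.0, 11.0, 13.0, 15.0, 14.0]  # synthetic values
preds, actuals, errors = walk_forward(monthly_sales, start=3)
print(preds)   # [11.0, 12.0, 13.0]
print(errors)  # [2.0, 2.5, 2.0]
```

The key property is that each prediction is made by a model trained only on data available before the predicted month, which is what keeps the backtest free of look-ahead bias.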