Created 2 years ago
Sections: ● Top ● The Data ● Feature Engineering ● Investigating Correlation ● Lag Features ● Splitting ● The Model ● Results with Traditional Split ● Using Cross-Validation ● Making Future Predictions
Evan Marie online:
EvanMarie@Proton.me | LinkedIn | GitHub | Hugging Face | Mastodon |
Jovian.ai | TikTok | CodeWars | Discord ⇨ ✨ EvanMarie ✨#6114
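# helpers is the author's local module; import_all() presumably loads pandas (pd) and other shared libraries
from helpers import *
import_all()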
from xgboost import XGBRegressor
%matplotlib inline
import seaborn as sns
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import TimeSeriesSplit
The Data
- This data is an excerpt from a Kaggle-maintained, regularly updated dataset collection
- The dataset reflects the energy consumption as reported by the National Grid ESO, Great Britain's electricity system operator
- Consumption is recorded twice an hour, at half-hour intervals
- The data covers January 1, 2009 to December 31, 2022
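As a rough sanity check on the points above, the sketch below builds the index one would expect if there were exactly one reading every 30 minutes across the stated date range. This is an idealized assumption, not a description of the real file: the actual record count may differ because of gaps, duplicates, or clock-change days. The names `expected_index`, `readings_per_day`, and `n_days` are illustrative.

```python
import pandas as pd

# Assumption: one reading per half hour, with no gaps, across the stated range.
expected_index = pd.date_range(
    start="2009-01-01 00:00",
    end="2022-12-31 23:30",
    freq="30min",
)

readings_per_day = 48  # two readings per hour * 24 hours
n_days = expected_index.normalize().nunique()  # distinct calendar days

print(f"Days covered: {n_days}")
print(f"Expected half-hourly readings: {len(expected_index)}")
```

Comparing `len(expected_index)` with the length of the loaded data is a quick way to spot missing or duplicated timestamps before building features.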
Importing Data
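# Load the CSV and parse the settlement date column as datetimes
data = pd.read_csv('uk_power_consumption.csv', parse_dates=['settlement_date'])
# Keep only the timestamp, the demand column (tsd), and the holiday flag
data = data[['settlement_date', 'tsd', 'is_holiday']]
data.columns = ['datetime', 'consumption', 'holiday']
# Use the timestamp as the index for time series work
data = data.set_index('datetime', drop=True)
# head_tail_horz comes from the helpers module; it displays the first and last rows
head_tail_horz(data, 5, "UK Power Consumption Data", intraday=True)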
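The same select, rename, and index steps can be tried without the real file. The sketch below is only an illustration: the three rows are made up (they are not values from the dataset), `extra_column` is a hypothetical stand-in for the columns that get dropped, and the checks at the end are suggestions rather than part of the original notebook.

```python
import io
import pandas as pd

# Made-up rows for illustration only; these are NOT values from the real dataset.
# "extra_column" is a hypothetical placeholder for columns that are dropped.
sample_csv = io.StringIO(
    "settlement_date,extra_column,tsd,is_holiday\n"
    "2009-01-01 00:00:00,1,38000,1\n"
    "2009-01-01 00:30:00,2,38500,1\n"
    "2009-01-01 01:00:00,3,38200,1\n"
)

sample = pd.read_csv(sample_csv, parse_dates=["settlement_date"])
sample = sample[["settlement_date", "tsd", "is_holiday"]]
sample.columns = ["datetime", "consumption", "holiday"]
sample = sample.set_index("datetime")

# Checks that may be worth running on the real frame as well
is_sorted = sample.index.is_monotonic_increasing
has_duplicates = sample.index.has_duplicates
gaps = sample.index.to_series().diff().dropna()
half_hourly = bool((gaps == pd.Timedelta("30min")).all())

print(sample)
print(f"sorted: {is_sorted}, duplicates: {has_duplicates}, half-hourly: {half_hourly}")
```

On real data, a `False` for `half_hourly` would suggest gaps or irregular spacing worth looking at before creating lag features.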