Wind.dataset.Dataset
class Wind.dataset.Dataset(df: pd.DataFrame) [source]
Methods
fill_nan(self, fields: list) [source]
Fill missing values (NaN) in the specified fields/columns of the DataFrame.
Parameters: | fields (list) :
List of fields/columns in which to fill missing values. |
---|---|
Returns: | None :
|
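The fill strategy is not stated above, so the following is only a sketch of what such a call might amount to in plain pandas, assuming a forward fill restricted to the listed columns. The frame and the column name `wind_speed` are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical frame; the column names are illustrative only.
df = pd.DataFrame({
    "wind_speed": [5.0, np.nan, 7.0],
    "active_power_total": [100.0, 120.0, np.nan],
})

fields = ["wind_speed"]

# Assumed strategy: forward fill, applied only to the listed columns.
df[fields] = df[fields].ffill()
```

Columns not named in `fields` keep their NaN values in this sketch.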
drop_nan(self, fields: list) [source]
Drop rows with missing values (NaN) in the specified fields/columns of the DataFrame.
Parameters: | fields (list) :
List of fields/columns in which to look for NaN values when dropping rows. |
---|---|
Returns: | None :
|
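As a rough plain-pandas equivalent of the behaviour described, the sketch below drops rows that have NaN in the listed columns using `DataFrame.dropna(subset=...)`. The frame and column names are hypothetical, and whether the method itself uses this call is an assumption.

```python
import numpy as np
import pandas as pd

# Hypothetical frame; the column names are illustrative only.
df = pd.DataFrame({
    "wind_speed": [5.0, np.nan, 7.0],
    "active_power_total": [100.0, 120.0, np.nan],
})

fields = ["wind_speed"]

# Only NaN values in the listed columns cause a row to be removed.
cleaned = df.dropna(subset=fields)
```

The last row survives here even though `active_power_total` is NaN, because that column is not in `fields`.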
sample(self, n: int) [source]
Sample every nth row from the DataFrame.
Parameters: | n (int) :
Sampling interval. |
---|---|
Returns: | None :
|
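Taking every nth row can be sketched with positional slicing in pandas. This assumes sampling starts at the first row, which the description above does not confirm. The column name is hypothetical.

```python
import pandas as pd

# Hypothetical frame with ten consecutive readings.
df = pd.DataFrame({"wind_speed": range(10)})

n = 3  # sampling interval

# Keep rows 0, n, 2n, ... (assumed to start at the first row).
sampled = df.iloc[::n]
```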
apply_rolling_window(self, df: pd.DataFrame, data: str, roll_time: int, window_function: callable) [source]
Apply a rolling window function to the specified data column in the DataFrame.
Parameters: | df (pd.DataFrame) :
DataFrame to which the rolling window function will be applied.
data (str) :
Column name containing the data to apply the rolling window function.
roll_time (int) :
Window size for the rolling window.
window_function (callable) :
Callable function to apply as the rolling window function. |
---|---|
Returns: | None :
|
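A minimal sketch of the rolling-window idea in plain pandas, assuming `roll_time` is a number of rows and that the result is stored in a new column. The output column name `wind_speed_rolling` is hypothetical and not taken from the library.

```python
import numpy as np
import pandas as pd

# Hypothetical frame; the column name is illustrative only.
df = pd.DataFrame({"wind_speed": [1.0, 2.0, 3.0, 4.0, 5.0]})

data = "wind_speed"
roll_time = 3              # window size, assumed to be counted in rows
window_function = np.mean  # any callable that reduces an array to a scalar

# Hypothetical output column name; the first roll_time - 1 rows are NaN
# because the window is not yet full.
df[f"{data}_rolling"] = (
    df[data].rolling(window=roll_time).apply(window_function, raw=True)
)
```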
add_last_t(self, df: pd.DataFrame, data: str, step: int=2) [source]
Add lagged versions of a column to the DataFrame.
Parameters: | df (pd.DataFrame) :
DataFrame to which the lagged columns will be added.
data (str) :
Column name to create lagged versions of.
step (int, optional) :
Number of lagged steps to add. Defaults to 2. |
---|---|
Returns: | None :
|
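Lagged features are commonly built with `Series.shift`. The sketch below assumes that `step=2` means adding lags 1 and 2, and the generated column names are hypothetical.

```python
import pandas as pd

# Hypothetical frame; only the target column name comes from the docs above.
df = pd.DataFrame({"active_power_total": [10.0, 20.0, 30.0, 40.0]})

data = "active_power_total"
step = 2  # assumed to mean: add lags 1..step

for lag in range(1, step + 1):
    # Hypothetical naming scheme for the lagged columns.
    df[f"{data}_t-{lag}"] = df[data].shift(lag)
```

The first `lag` rows of each new column are NaN, since no earlier value exists for them.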
add_seasonal_feat(self, df: pd.DataFrame, time_col) [source]
Add seasonal features based on a time column.
Parameters: | df (pd.DataFrame) :
DataFrame to which the seasonal features will be added.
time_col :
Time column to extract seasonal features from. |
---|---|
Returns: | None :
|
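The description does not say which seasonal features are produced. One common choice, shown here purely as an assumption, is a cyclical sine/cosine encoding of the hour of day plus the calendar month. All column names in this sketch are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one timestamp every six hours.
df = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2021-01-01 00:00",
        "2021-01-01 06:00",
        "2021-01-01 12:00",
        "2021-01-01 18:00",
    ])
})

time_col = "timestamp"
ts = pd.to_datetime(df[time_col])

# Assumed features: the hour mapped onto a circle so 23:00 and 00:00 are close.
df["hour_sin"] = np.sin(2 * np.pi * ts.dt.hour / 24)
df["hour_cos"] = np.cos(2 * np.pi * ts.dt.hour / 24)
df["month"] = ts.dt.month
```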
create_dataset(self, df: pd.DataFrame, window_size: int, prediction_horizon: int, test_split: float=0.2, val_split: float=0.2, univariate: bool=False, target_col: str='active_power_total', shuffle: bool=False) [source]
Create a dataset for training and evaluation.
Parameters: | df (pd.DataFrame) :
Input DataFrame containing the data.
window_size (int) :
Size of the input window.
prediction_horizon (int) :
Number of steps to predict into the future.
test_split (float, optional) :
Ratio of test data split. Defaults to 0.2.
val_split (float, optional) :
Ratio of validation data split. Defaults to 0.2.
univariate (bool, optional) :
Flag indicating if the data is univariate. Defaults to False.
target_col (str, optional) :
Name of the target column. Defaults to "active_power_total".
shuffle (bool, optional) :
Flag indicating whether to shuffle the data. Defaults to False. |
---|---|
Returns: | tuple : Tuple containing train, validation, and test data and labels, as well as feature names.
|
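To illustrate how `window_size`, `prediction_horizon`, and the split ratios interact, here is a minimal univariate sketch using NumPy. The helper `make_windows` is hypothetical, the split is assumed to be chronological with both ratios taken as fractions of all windows, and shuffling and feature names are left out. The actual method may differ on each of these points.

```python
import numpy as np


def make_windows(series, window_size, prediction_horizon):
    """Hypothetical helper: pair each input window with the values that follow it."""
    X, y = [], []
    last_start = len(series) - window_size - prediction_horizon
    for start in range(last_start + 1):
        end = start + window_size
        X.append(series[start:end])                   # input window
        y.append(series[end:end + prediction_horizon])  # future targets
    return np.array(X), np.array(y)


series = np.arange(20, dtype=float)  # stand-in for the target column
X, y = make_windows(series, window_size=4, prediction_horizon=2)

# Assumed chronological split: oldest windows train, newest windows test.
test_split, val_split = 0.2, 0.2
n = len(X)
n_test = int(n * test_split)
n_val = int(n * val_split)
n_train = n - n_val - n_test

X_train, y_train = X[:n_train], y[:n_train]
X_val, y_val = X[n_train:n_train + n_val], y[n_train:n_train + n_val]
X_test, y_test = X[n_train + n_val:], y[n_train + n_val:]
```

With 20 points, a window of 4, and a horizon of 2, this yields 15 windows, split 9/3/3 between train, validation, and test.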