M5 Accuracy - Preprocessed Datasets

老下头

pca, lstm

￥15

Sold: 0

1.06GB

Data ID: D17173861802545145

Published: 2024/06/03

About the data

Many people (me included) were running into memory issues when working with the data from the M5 Forecasting - Accuracy competition, so I decided to preprocess it a little to try to ease some of those issues.

So, in the end, I've developed three datasets:

processed_df: I took the merged data from Ryuhei's kernel and added a date_id column. That's all. It is not meant to be loaded in Kaggle kernels, as it needs more memory than the kernels can handle. I've uploaded it here for anyone who wants to use it in their own projects;
lstm_df: I reshaped processed_df to keep only the date_id, id and value columns. I intend to use this dataset with an LSTM, hence the name;
dimred_df: After applying PCA to processed_df, I found that the first two principal components alone retain 99.9% of the explained variance. This dataset therefore contains only the date_id, id and value columns plus the two principal component columns.
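
The three steps above can be sketched roughly as follows. This is only an illustration on a toy frame, not the code used to build the uploaded files: the `date` column, the feature columns passed to PCA (`sell_price`, `wday`, `month`), the definition of `date_id` as a sorted integer code, and the helper names are all assumptions, and PCA is done here with a plain NumPy SVD rather than whatever library was actually used.

```python
import numpy as np
import pandas as pd


def add_date_id(df, date_col="date"):
    """Add an integer date_id: 0 for the earliest date, 1 for the next, etc.
    (assumed definition; the listing does not say how date_id was built)."""
    out = df.copy()
    # sort=True assigns codes in sorted order of the unique dates
    out["date_id"] = pd.factorize(out[date_col], sort=True)[0]
    return out


def to_lstm_df(df):
    """Keep only the three columns the description lists for lstm_df."""
    return df[["date_id", "id", "value"]].copy()


def to_dimred_df(df, feature_cols, n_components=2):
    """Project feature_cols onto their first principal components.
    Returns the reduced frame and the fraction of variance retained."""
    X = df[feature_cols].to_numpy(dtype=float)
    Xc = X - X.mean(axis=0)  # PCA requires centred data
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = S**2 / np.sum(S**2)  # explained variance ratio per component
    scores = Xc @ Vt[:n_components].T
    out = df[["date_id", "id", "value"]].copy()
    for k in range(n_components):
        out[f"pc{k + 1}"] = scores[:, k]
    return out, float(explained[:n_components].sum())


# Toy stand-in for the merged competition data (hypothetical values).
toy = pd.DataFrame({
    "id": ["A", "A", "A", "B", "B", "B"],
    "date": pd.to_datetime(["2016-01-01", "2016-01-02", "2016-01-03"] * 2),
    "value": [3, 0, 1, 2, 5, 4],
    "sell_price": [1.0, 1.0, 1.2, 2.0, 2.0, 2.1],
    "wday": [1, 2, 3, 1, 2, 3],
    "month": [1, 1, 1, 1, 1, 1],
})

processed_df = add_date_id(toy)
lstm_df = to_lstm_df(processed_df)
dimred_df, retained = to_dimred_df(processed_df, ["sell_price", "wday", "month"])
print(f"variance retained by 2 components: {retained:.3f}")
```

The features are not standardized in this sketch. When columns have very different scales, the first few components tend to be dominated by the largest-variance columns, so it may be worth checking the retained-variance figure again after scaling.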

Acknowledgements

Thanks to Ryuhei F. for the amazing kernel from which I developed these datasets.
