Scholarship & Research

Permanent URI for this community: https://scholarworks.montana.edu/handle/1/1

Search Results

  • Advancing Retail Predictions: Integrating Diverse Machine Learning Models for Accurate Walmart Sales Forecasting
    (Sciencedomain International, 2024-06) C., Cyril Neba; F., Gerard Shu; Nsuh, Gillian; A., Philip Amouda; F., Adrian Neba; Webnda, F.; Ikpe, Victory; Orelaja, Adeyinka; Sylla, Nabintou Anissia
    In the rapidly evolving landscape of retail analytics, the accurate prediction of sales figures holds paramount importance for informed decision-making and operational optimization. Leveraging diverse machine learning methodologies, this study aims to enhance the precision of Walmart sales forecasting, utilizing a comprehensive dataset sourced from Kaggle. Exploratory data analysis reveals intricate patterns and temporal dependencies within the data, prompting the adoption of advanced predictive modeling techniques. Through the implementation of linear regression, ensemble methods such as Random Forest, Gradient Boosting Machines (GBM), eXtreme Gradient Boosting (XGBoost), and Light Gradient Boosting Machine (LightGBM), this research endeavors to identify the most effective approach for predicting Walmart sales. Comparative analysis of model performance showcases the superiority of advanced machine learning algorithms over traditional linear models. The results indicate that XGBoost emerges as the optimal predictor for sales forecasting, boasting the lowest Mean Absolute Error (MAE) of 1226.471, Root Mean Squared Error (RMSE) of 1700.981, and an exceptionally high R-squared value of 0.9999900, indicating near-perfect predictive accuracy. This model's performance significantly surpasses that of simpler models such as linear regression, which yielded an MAE of 35632.510 and an RMSE of 80153.858. Insights from bias and fairness measurements underscore the effectiveness of advanced models in mitigating bias and delivering equitable predictions across temporal segments. Our analysis revealed varying levels of bias across different models. Linear Regression, Multiple Regression, and GLM exhibited moderate bias, suggesting some systematic errors in predictions. Decision Tree showed slightly higher bias, while Random Forest demonstrated a unique scenario of negative bias, implying systematic underestimation of predictions. 
    However, models like GBM, XGBoost, and LGB displayed biases closer to zero, indicating more accurate predictions with minimal systematic errors. Notably, the XGBoost model demonstrated the lowest bias, at -7.548432 (Table 4), reflecting its superior ability to minimize prediction errors across different conditions. Additionally, fairness analysis revealed that XGBoost maintained robust performance in both holiday and non-holiday periods, with an MAE of 84273.385 for holidays and 1757.721 for non-holidays. Insights from the fairness measurements revealed that Linear Regression, Multiple Regression, and GLM showed consistent predictive performance across both subgroups. Meanwhile, Decision Tree performed similarly for holiday predictions but exhibited better accuracy for non-holiday sales, whereas Random Forest, XGBoost, GBM, and LGB models displayed lower MAE values for the non-holiday subgroup, indicating potential fairness issues in predicting holiday sales. The study also highlights the importance of model selection and the impact of advanced machine learning techniques on achieving high predictive accuracy and fairness. Ensemble methods like Random Forest and GBM also showed strong performance, with Random Forest achieving an MAE of 12238.782 and an RMSE of 19814.965, and GBM achieving an MAE of 10839.822 and an RMSE of 1700.981. This research emphasizes the significance of leveraging sophisticated analytics tools to navigate the complexities of retail operations and drive strategic decision-making. By utilizing advanced machine learning models, retailers can achieve more accurate sales forecasts, ultimately leading to better inventory management and enhanced operational efficiency. The study reaffirms the transformative potential of data-driven approaches in driving business growth and innovation in the retail sector.
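The abstract above compares models on MAE, RMSE, R-squared, signed bias, and per-subgroup MAE (holiday versus non-holiday weeks). The following is a minimal sketch of how such metrics might be computed, not the authors' code. The function names and the four-point toy series are hypothetical, and bias is assumed to mean the average of predicted minus actual, so that a negative value indicates underestimation as the abstract describes.

```python
import math

def mean_error(actual, predicted):
    """Signed bias (assumed convention): mean of predicted - actual."""
    return sum(p - a for a, p in zip(actual, predicted)) / len(actual)

def mae(actual, predicted):
    """Mean absolute error; never negative."""
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error; always at least as large as the MAE."""
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

def mae_by_group(actual, predicted, groups):
    """MAE computed separately for each subgroup label."""
    abs_errors = {}
    for a, p, g in zip(actual, predicted, groups):
        abs_errors.setdefault(g, []).append(abs(p - a))
    return {g: sum(errs) / len(errs) for g, errs in abs_errors.items()}

# Hypothetical toy data, not values from the study.
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 330.0, 360.0]
is_holiday = [False, False, True, True]

print("bias:", mean_error(actual, predicted))
print("MAE:", mae(actual, predicted))
print("RMSE:", rmse(actual, predicted))
print("R^2:", r_squared(actual, predicted))
print("MAE by subgroup:", mae_by_group(actual, predicted, is_holiday))
```

In this toy case the holiday subgroup has a larger MAE than the non-holiday subgroup, which is the kind of gap the abstract reads as a potential fairness issue. Comparing subgroup MAEs this way says nothing about why the gap exists; holiday weeks may simply have larger and more volatile sales.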
  • Greater sage‐grouse habitat selection varies across the marginal habitat of its lagging range margin
    (Wiley, 2022-07) Beers, Aidan T.; Frey, Shandra N.
    Studying wildlife–habitat relationships at the edges of their range can provide valuable insights into the environmental factors limiting wildlife distributions and most likely to drive extirpations and range shifts in response to landscape change. Yet the relative impact of those factors is likely different along the range margin, so it is important to identify the limitations to suitable habitat at both regional and local scales. Some of the most drastic impacts of large-scale landscape changes in North America have occurred and are forecasted in the sagebrush steppe ecosystems, where species unable to seek new habitat in the fragmented landscape will be vulnerable to climatic extremes, vegetation community shifts, and anthropogenic land use change. One of the species likely under major threat from landscape changes is the greater sage-grouse (Centrocercus urophasianus), a sagebrush obligate with habitat constraints that make it susceptible to habitat loss impacts as sagebrush systems contract and fragment at their southern range margin, already naturally fragmented. In this study, we evaluated factors of topography and land cover directly impacting habitat selection by sage-grouse in four study areas along their lagging range edge. We used >116,000 GPS locations from >90 grouse across four study areas in southern Utah and Nevada from 2014 to 2020 in habitat selection analyses using random forest models. Our results showed that sage-grouse exploit topography and sagebrush cover, possibly to break predator sight lines and moderate the risk posed by avian predators using tree perches, complicating the effects of tree cover and conifer encroachment into sagebrush habitat. We found similar trends across all four study areas, suggesting sage-grouse along the southern range margin face similar limitations. However, the effects were nonlinear and varied—models trained in one study area were only moderately successful at predicting selection in others. 
    The local idiosyncrasies along this southern range margin indicate a need for place-based conservation for sage-grouse and other potentially imperiled species. Incorporating new understandings of local impacts will refine regional and range-wide models and support efforts to effectively create habitat and plan for range shifts by vulnerable species in response to environmental change.
  • Evaluating the importance of wolverine habitat predictors using a machine learning method
    (Oxford University Press, 2021-12) Carroll, Kathleen A.; Hansen, Andrew J.; Inman, Robert M.; Lawrence, Rick L.
    In the conterminous United States, wolverines (Gulo gulo) occupy semi-isolated patches of subalpine habitats at naturally low densities. Determining how to model wolverine habitat, particularly across multiple scales, can contribute greatly to wolverine conservation efforts. We used the machine-learning algorithm random forest to determine how a novel analysis approach compared to the existing literature for future wolverine conservation efforts. We also determined how well a small suite of variables explained wolverine habitat use patterns at the second- and third-order selection scale by sex. We found that the importance of habitat covariates differed slightly by sex and selection scales. Snow water equivalent, distance to high-elevation talus, and latitude-adjusted elevation were the driving selective forces for wolverines across the Greater Yellowstone Ecosystem at both selection orders but performed better at the second order. Overall, our results indicate that wolverine habitat selection is, in large part, broadly explained by high-elevation structural features, and this confirms existing data. Our results suggest that for third-order analyses, additional fine-scale habitat data are necessary.