Demand Prediction with Competition Analysis

[ PwC US ] Prasang Gupta, Amitoj Singh, Shaz Hoda

Damage segmentation on a sample image

AIM

The aim of this study was to predict consumer demand using competitor data combined with other datasets that influence demand, such as mobility and demographic variables. A second goal was to correct a proprietary mobility dataset used by PwC so that it accounts for the mobility changes brought on by COVID restrictions.

DETAILS

The proprietary dataset had several flaws, including inconsistencies across months and years and noisy data that made it difficult to develop stable models. It also did not account for any socio-economic variables based on the demographics of the region. To correct this, the proprietary dataset was merged with location-based demographic variables, and feature engineering was done to derive the mobility within radii of 1 km to 5 km of the store location. The final sales value was kept as the dependent variable for the model.
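The radius-based feature engineering can be illustrated with a minimal sketch. It assumes mobility observations arrive as (latitude, longitude, visits) tuples and that the radii are the whole-kilometre steps 1 to 5 with cumulative (not ring-shaped) counts. The record layout, the function names and the use of the haversine distance are illustrative assumptions, not details taken from the study.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def mobility_by_radius(store, points, radii_km=(1, 2, 3, 4, 5)):
    """Sum visits falling within each radius of the store.

    store:  (lat, lon) of the store.
    points: iterable of (lat, lon, visits); a hypothetical record layout.
    Counts are cumulative: a point 0.5 km away contributes to every radius.
    """
    features = {f"visits_within_{r}km": 0 for r in radii_km}
    for lat, lon, visits in points:
        distance = haversine_km(store[0], store[1], lat, lon)
        for r in radii_km:
            if distance <= r:
                features[f"visits_within_{r}km"] += visits
    return features


if __name__ == "__main__":
    # Hypothetical store on the equator with three nearby observations.
    store = (0.0, 0.0)
    points = [(0.0, 0.005, 10), (0.0, 0.02, 5), (0.0, 0.1, 7)]
    print(mobility_by_radius(store, points))
```

Each resulting dictionary can be joined to the store's demographic variables as one feature row.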

Multiple model architectures, including ML and DL models, were tried for correcting the mobility data. Finally, a voting ensemble method was selected, which modelled the corrected demand with an accuracy of about 70%.
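As a rough illustration of the voting idea, the sketch below averages the predictions of several base regressors. The study does not state which base models were combined or how the votes were weighted, so the three toy models here (a mean predictor, a one-feature least-squares line and a nearest-neighbour lookup) and the unweighted average are assumptions chosen only to keep the example self-contained.

```python
from statistics import mean


def fit_mean_model(X, y):
    """Baseline: always predict the training mean."""
    avg = mean(y)
    return lambda x: avg


def fit_linear_model(X, y):
    """Ordinary least squares on the first feature only."""
    xs = [row[0] for row in X]
    mx, my = mean(xs), mean(y)
    var = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (t - my) for x, t in zip(xs, y)) / var if var else 0.0
    intercept = my - slope * mx
    return lambda x: intercept + slope * x[0]


def fit_nearest_neighbour(X, y):
    """Predict the target of the closest training row (squared Euclidean)."""
    def predict(x):
        idx = min(range(len(X)),
                  key=lambda i: sum((a - b) ** 2 for a, b in zip(X[i], x)))
        return y[idx]
    return predict


def voting_predict(models, x):
    """Unweighted voting for regression: average the base predictions."""
    return mean([model(x) for model in models])


if __name__ == "__main__":
    # Toy data: one feature, target is twice the feature.
    X = [[1.0], [2.0], [3.0]]
    y = [2.0, 4.0, 6.0]
    models = [fit(X, y) for fit in
              (fit_mean_model, fit_linear_model, fit_nearest_neighbour)]
    print(voting_predict(models, [3.0]))
```

Averaging tends to damp the errors of any single model, which is one common reason to prefer an ensemble when the input data is noisy.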

To improve the numbers further, an aerial snapshot of the region around the store was obtained from the Google Satellite API at a tuned zoom level, and a segmentation exercise was performed on the image to extract the type of region around the store. This was expected to add another dimension, the immediate demographics of the store, to the dataset. This was done by extracting the coverage area of different colors based on pixel masks:

  • Grey (signifying roads, parking lots, streets, etc.)
  • Green (signifying agriculture, parks, vegetation, etc.)
  • White (signifying roof-tops, buildings, etc.)
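The color-coverage extraction can be sketched as follows. The example takes an image as a flat sequence of (R, G, B) tuples and returns the fraction of pixels falling into each of the three masks. The threshold values and function names are illustrative assumptions; the study does not give the mask definitions that were actually used.

```python
def classify_pixel(r, g, b):
    """Assign a pixel to 'white', 'green', 'grey' or None.

    All thresholds below are illustrative guesses, not the study's values.
    """
    hi, lo = max(r, g, b), min(r, g, b)
    if lo >= 200:                        # all channels bright: roofs, buildings
        return "white"
    if g > r + 20 and g > b + 20:        # green channel dominates: vegetation
        return "green"
    if hi - lo <= 25 and 60 <= hi < 200: # low saturation, mid brightness: roads
        return "grey"
    return None


def color_coverage(pixels):
    """Return the fraction of pixels in each color mask.

    pixels: iterable of (r, g, b) tuples with 0-255 channel values.
    """
    counts = {"grey": 0, "green": 0, "white": 0}
    total = 0
    for r, g, b in pixels:
        total += 1
        label = classify_pixel(r, g, b)
        if label is not None:
            counts[label] += 1
    if total == 0:
        return {name: 0.0 for name in counts}
    return {name: count / total for name, count in counts.items()}


if __name__ == "__main__":
    # Four hypothetical pixels: road grey, vegetation, rooftop, and red (unmatched).
    sample = [(128, 128, 128), (30, 160, 40), (250, 250, 250), (200, 30, 30)]
    print(color_coverage(sample))
```

The three fractions can then be appended to each store's feature row alongside the mobility and demographic variables.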

IMPACT

The final model with all the variables included (engineered visits from the proprietary data, Google mobility data, demographics data and the immediate demographics of the store derived from satellite images) achieved a test accuracy of 75%. This improved the quality of the mobility data and benefited tens of projects that used this mobility dataset, helping to build much more accurate and robust models for client deliveries.

Prasang Gupta
Senior Associate, Emerging Technologies

My research interests include distributed robotics, mobile computing and programmable matter.
