Automated PPT content editor

[ PwC US ] Millennium Bismay, Prasang Gupta, Vishakha Bansal, Shaz Hoda

Alignment on PowerPoint

AIM

The aim of this project was to build a solution that automatically reformats any PPT into a PwC-compliant format. The changes covered editorial fixes (word alternatives, punctuation) and branding fixes (colors, formatting). The solution also aligned any misaligned objects in the PPT.

DETAILS

We decided to edit the underlying XML of the presentation (the Office Open XML format, via lxml) in addition to editing the PPT itself. We wrote code to parse the PPT into an editable form and then created a modular structure to work on different portions of the PPT. Some modules were rule-based, some were logic-driven based on the requirements, and some incorporated ML solutions developed for sub-problems wherever possible. Some of the modules built were:

| Module | Type | Function and Details |
| --- | --- | --- |
| Word | Editorial | Performed changes at the word level, such as making spelling consistently American English or British English. Also removed risk words and replaced jargon with better-suited alternatives |
| Numbers | Editorial | Performed date parsing, currency conversions, etc. based on the expected format. Also converted numerals to words wherever applicable |
| Punctuation | Editorial | Added and modified punctuation marks wherever applicable, in a consistent format |
| Paragraphs | Editorial | Identified lengthy paragraphs and capitalization issues in the presentation and suggested ways to shorten the text |
| Font | Branding | Changed font sizes, styles and colors based on the location of the text (header, help box, content box, etc.) |
| Bullets | Branding | Formatted simple and nested bullets to follow a particular pattern and adjusted font size and bullet marker according to the context |
| Header and Footer | Branding | Adjusted the header and footer in the master slide so that they appear on every slide in the document. Also detected and accounted for repeated non-aligned headers and footers |
| Pictogram | Branding | Detected images in the slides and checked whether they are approved pictograms or whether they might infringe copyright |
| Colors | Branding | Replaced colors (font, background, pictograms) with the nearest PwC-approved colors, using a novel color-matching technique |
| Animations | Branding | Detected and changed any animations in the presentation wherever applicable |
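
To give a rough feel for the modular structure, here is a minimal sketch of how one editorial module and one branding module could be applied to the slide XML inside a .pptx archive. It is not the project's code. It uses only Python's standard library (the project worked with lxml), and the names `word_module`, `color_module`, `edit_pptx` and the `PALETTE` values are all hypothetical. The color matcher is a plain squared Euclidean distance in RGB, used here as a stand-in because the actual color-matching technique is not described on this page.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# DrawingML namespace: holds text runs (a:t) and explicit colors (a:srgbClr).
A_NS = "http://schemas.openxmlformats.org/drawingml/2006/main"
ET.register_namespace("a", A_NS)

# Hypothetical palette; the real approved colors are not listed in the text.
PALETTE = ["000000", "FFFFFF", "7D7D7D", "DC281E", "EB8C00"]


def to_rgb(hex_color):
    """Convert a 6-digit hex string such as 'FF0000' to an (r, g, b) tuple."""
    return tuple(int(hex_color[i:i + 2], 16) for i in (0, 2, 4))


def nearest_color(hex_color):
    """Stand-in matcher: palette entry with the smallest squared RGB distance."""
    rgb = to_rgb(hex_color)
    return min(
        PALETTE,
        key=lambda c: sum((a - b) ** 2 for a, b in zip(rgb, to_rgb(c))),
    )


def word_module(root):
    """Editorial example: a single British-to-American spelling rule."""
    for node in root.iter(f"{{{A_NS}}}t"):
        node.text = (node.text or "").replace("colour", "color")


def color_module(root):
    """Branding example: snap every explicit sRGB color to the palette."""
    for node in root.iter(f"{{{A_NS}}}srgbClr"):
        node.set("val", nearest_color(node.get("val", "000000")))


# Each module takes the parsed slide XML and edits it in place.
MODULES = [word_module, color_module]


def edit_pptx(data):
    """Return new .pptx bytes with every module applied to each slide part."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(data)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            payload = src.read(item.filename)
            name = item.filename
            if name.startswith("ppt/slides/slide") and name.endswith(".xml"):
                root = ET.fromstring(payload)
                for module in MODULES:
                    module(root)
                payload = ET.tostring(root, encoding="utf-8",
                                      xml_declaration=True)
            # Non-slide parts are copied through unchanged.
            dst.writestr(name, payload)
    return out.getvalue()
```

Calling `edit_pptx(open("deck.pptx", "rb").read())` on a hypothetical file would return the edited archive as bytes. One caveat of the standard-library parser is that it may rename namespace prefixes it does not know about when writing the XML back, so a library such as lxml, which preserves them, is likely the safer choice for real presentations.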

IMPACT

The solution was deployed on an internal hosting service and made available as a service. With all modules active, the total processing time for an average presentation was about 5 minutes. This brought the time needed to manually review and format an average presentation down from 30 minutes to about 10 minutes. The solution was not perfect, but it ensured that most of the repetitive tasks were handled by the code, leaving manual reviewers with only a final inspection and some modifications, and saving thousands of man-hours for the firm.

Prasang Gupta
Senior Associate, Emerging Technologies

My research interests include distributed robotics, mobile computing and programmable matter.
