I was recently required to review a thousand page PDF file. This file contained hundreds of financial summaries for specific topics. The challenge was, and the need for the review, was that just because certain numeric values were permitted by the online system, the updates the users provided might have been incorrect, incomplete or otherwise in need of some adjustments.

For instance, as it was a financial system, during the review one could check for specific business rules or conditions that the online system was not programmed to flag, such as a higher than normal accounts receivable balance as compared to total revenue, or that the gross margins didn’t make sense for that specific type of transaction.

Well, rather than print the document and mark it up with comments and calcuations, or reviewing on-screen and taking copious notes, I wondered how I could make the most of Monarch’s ability to extract data from PDF files.

A few hours later, I had one of the largest, most comprehensive models I’d built in quite some time.

Not All PDF Files are Created Equally

Unfortunately, the challenge that Monarch users often see when using PDF reared its head once more for me. On the one hand, the software used to create the PDF file itself did create a document that Monarch could read, but Monarch didn’t quite create the exact image you’d see if you opened the file with, say, Adobe Reader. It tended to shift the values somewhat within the line. All of this made the exercise of creating the templates, regularly incorporating many of the advanced field properties, some fun in and of itself.

Then I needed to use some of the “slice and dice” techniques to separate four fields that I had no choice but to initially capture as a single field, because of the way the values were shifting within the line, and because of how many occurrences of those values that I treated as a single field appeared differently throughout the report. It just wasn’t consistent at all.

Finally, after much more effort that was initially foreseen, it was time for the real work: to do something useful with all of this extracted data and save myself what would likely be an enormous undertaking.

1 or 0

If you read that heading as “true or false” instead of “one or zero”, congratulations: you know what’s coming next! :-)

Since a major portion of the review was to essentially judge the integrity of the data (do these values make sense?) and to alert for specific trouble spots, all of which were based on numeric data, why not have Monarch perform all of these calculations for me, and let me know where the trouble spots are?

What I determined was that not only did I need to know where the trouble spots were, but I also had to know just how good, or bad, the data set was. Were the users supplying proper values overall, or did we need further training to reinforce our expectations of them? The only way I’d really know that would be to count how many problems I could find. Further, could I use this approach to determine more precisely where our main challenges lied, and where things were basically on track.

Finally, there were some business rules that wouldn’t generate error conditions so much as warnings, like margins were beginning to erode unexpectedly, but the user supplied values were just fine in and of themselves.

To help Monarch point out all of these indicators to me, I determined that I needed a few calculated fields that would return a value of 1 (true) when an error or a warning condition existed, and a zero value (false) when the data was acceptable.

To get started, I built about 20 of these calculated fields to model the various business rules. Then I added two more: one to indicate when any individual error exists within a record, and another to indicate when any warnings exist. Both of these return only a one or a zero value, not the count of the error or warnings.

Now with all of these indicators in place, I could begin some high level analysis.

First I built four simple filters which narrowed the table to show only those records:

  1. with errors,
  2. without errors,
  3. with warnings, and
  4. without warnings.

All of this made the final step both very easy and very powerful.

Putting Monarch’s Aggregations to Work

I concluded the model by building four summaries, each using one of the four filters. Within each of the summaries, I included the appropriate error or warning field as measures. In the case of the “Without” summaries, I use the “Count” aggregation instead of the “Sum” aggregation used in the other summaries.

The summaries included the ability to roll up to a high level department number, and to drill down to each level of business case within the department, grouped by the manager responsible. In this way we could review which managers and overall departments were maintaining the system well, and which needed a bit more guidance.

These few steps saved many hours of meticulous manual, likely error-prone review, just for this single review. If I did nothing else with Monarch all year long, as this is a monthly review process, this savings in time and effort alone would be worth investing in Monarch.

Do you have a similar review process with which you could save time and money, and would demonstrate how you excel with Monarch?