Examining Chrysler Closures With Monarch

by Sandy on May 26, 2009

in Calculated Fields,Summaries

Auto maker Chrysler recently announced their intent to close 789 dealerships (PDF) in the US as part of their recovery plan.

While it’s unfortunate, and will clearly have an impact on individuals and local economies, we can use their PDF list as an example of what can be done with Monarch Pro in terms of:

  1. extracting data from a PDF file,
  2. automatically handling address data, and
  3. freeing the data from the constraints of the report in which it appears to derive further insights

Additionally, with Monarch Pro we can integrate some data from another source that may provide some background into how the Chrysler management team made decisions.

Handling the PDF

The first page is just a cover, so view the second page and “Auto Adjust” to convert the PDF to text and proceed with creating a detail template.

Select a single line for the template sample, including the dealer name and number. Don’t worry that all of the records appear on more than a single line. I used four numeric traps to capture the dealer numbers (not all of them are five digits).

Paint the fields as follows:

  • Dealer Name: 31 characters – advanced properties: End Field On End of Left Justification
  • Majority Owner: 23 characters – advanced properties: End Field On End of Left Justification
  • Dealer Address: 38 characters – advanced properties: End Field On Blank Field Values: 1 (the record for Massey-Yardley Inc. doesn’t behave with End of Left Justification)
  • Dealer Code: 5 digits
  • Lines: 4 characters

Save your template, and remember to use the Verify feature to make sure that you haven’t painted the fields improperly.

That’s it. We’ve built the basics of extracting the data. Now we’ll have Monarch some more advanced work for us.

Separate the address data

In the Table window, use Monarch’s Address Block feature to automatically isolate regular address lines, city and state names, and zip codes. You’ll need the first three address lines, as well as City, Region, Postal Code and remember to always use the Error Code when working with Address Blocks. If Monarch thinks that it couldn’t handle an address properly, it’ll alert you by supplying a non-zero error code.

Count the product lines

To isolate which products will be affected by the closures, we need to work with the Lines field. The footer within the PDF instructs us that C is for Chrysler, J is Jeep, D is Dodge, and T is Dodge Truck. To track the individual lines, we need four calculated fields.

The report displays multiple combinations of character sequences when multiple letters are used in the Lines field, so we can’t count on a particular character appearing in the same column position. So how do can we tell which lines are affected at a particular location? We’ll use the Instr function to test if any of the specific characters appear anywhere within the field. Instr() returns a zero if the character does not appear, and returns the position of the character within the field if it is included in the field. So our new numeric fields become:

  • Chrysler_Line: if(instr(“C”,Lines)>0,1,0)
  • Dodge_Line: if(instr(“D”,Lines)>0,1,0)
  • Jeep_Line: if(instr(“J”,Lines)>0,1,0)
  • Truck_Line: if(instr(“T”,Lines)>0,1,0)

Finally, we may want to review the average number of lines affected by geography, so for a “Number of Lines” field, we can either add the four line counts, or, knowing that the line count is already reflected in the Lines field, we can use the length of the field to indicate the line count. Simply use

Len(Lines)

as the formula for a Number of Lines field.

Now we’ve completely freed the data elements from the constraints of the report, and we can examine the impacts a little more closely.

Build summaries to analyze data

Let’s determine which states are likely to be hit the hardest. Create a new summary named State, and use the state abbreviation as the only key field. Add count (a default field Monarch creates for summary purposes, a second instance of count (we’ll come back to it), and each of the product line counts.

Double click the key field, and set the sort order to be based on the count measure, descending.

Double click the second count field instance, and on the Calculation tab, choose the Percent of Total option, and set the field to display as a two decimal percentage. Double click each of the line counts in turn, setting an upper limit of 10 for the state count. Display the values in a bold red font. This will format all the values greater than or equal to 10 so that they stand out. Click OK to save the summary definition.

It looks like Pennsylvania, Texas, Ohio and Illinois will be the most affected, with 53 closures in Pennsylvania alone (6.72% of the total closures). More Truck and Jeep lines will be closed in Texas than any other state. The Jeep line gets hit the hardest, with 531 total closures, while the Dodge line takes the fewest, at 430. That’s still a significant number, though.

Let’s get a little more granular and build a city-based summary. First, to avoid grouping any potential identical city names in different states, we’ll build a new field in the Table. “City State” gets this formula:

DlrCity+", "+DlrRegion

I used the Dlr prefix for my address block. This formula ensures that we get two entries for Burlington; one in Iowa and one in Washington.

Now create a new summary named City. Our new City State field is the sole key, and again add the count and a Percent of Total count as the measures. Overall, we see that there’s a pretty good distribution of closures at the city level, though Elyria, Ohio will see eight dealerships close.

Creating a summary using the Number of Lines field as the sole key field and count as the measure reveals that most of the closing dealerships will have two product lines affected.

Monarch transforms static data into dynamic information

In just a few minutes and with minimal effort, we used Monarch Pro to convert a pretty standard list into a real information source. In the next post, we’ll look at how we can integrate what we’ve derived here with some additional data from another source to see what other insights might be available to us.

Building even concise summaries, which take advantage of just few calculated fields, can quickly provide practical tools which assist in understanding the information available, and showcase your ability to excel with Monarch.

{ 4 comments… read them below or add one }

mrzer0 May 31, 2009 at 12:39 pm

I just wanted to thank you for the effort that is so evident in this posting. I found it to be most instructive as I have with so many of your posts on this excel-ent site.

I am looking forward to your next post.

mrzer0 May 31, 2009 at 1:56 pm

FWIW, I suggest that the following enhancement might be worthwhile as an example of another analysis technique.

Add …
Chrysler_Line_2: if(instr(“C”,Lines)>0,1,0)
Dodge_Line_2: if(instr(“D”,Lines)>0,2,0)
Jeep_Line_2: if(instr(“J”,Lines)>0,4,0)
Truck_Line_2: if(instr(“T”,Lines)>0,8,0)

These additional calculations allow is an interesting analysis extension effect represented by the value

Lines_Sum = Chrysler_Line_2 + Dodge_Line_2 + Jeep_Line_2 + Truck_Line_2.

This calculation yields a number which will be found in the range of 1 – 15. Each increment of Lines_Sum value represents a unique combination of Vehicle Lines closed *possible* in the data.

Thus … IF Lines_Sum = 6 then the lines closed can only Dodge and Jeep and no other combination. (The more anal-retentive readers (Does anal-retentive have a Hyphen??? (grin!)) will have noticed that the values assigned by the suggested additional calculated fields are powers of 2.)

Further, the absence of a value for Lines_Sum in the range of 1 – 15 in summary data means that the represented condition is NOT found in the data. For example, there are no 6 values (DJ only) of Sum_Lines and only two 9′s (CT only) found in the 789 closed dealerships. Also, multiple specific values for Lines_Sum could be selected to represent underlying combination condition. Dealerships having odd values of Lines_Sum must have the Chrysler line, etc.

I present this only for consideration as another tool for the analysis tool belt. Sometimes this technique can yield some interesting insights into data with which one is unfamiliar and which may contain hidden patterns that might have some significance.

Sandy May 31, 2009 at 2:44 pm

Thanks! I’m glad that you’re enjoying the site.

Calculations such as what you’ve suggested really do assist in finding commonality and outliers where combinations exist. Nice work! Techniques such as these not only work very well with this particular case, but are also readily portable to other challenges. Thanks for taking the time to share your thoughts.

Chickenman July 3, 2009 at 7:29 am

Echoing the other poster, this is an Excel-lent illustration of fast and powerful data analysis!
Nice job.

BTW – Foulke Management Corp has a 4-digit zip and so faked out the address block and gave a blank region. Obviously, in practice one could fiddle with some more calc fields to parse the state.

Leave a Comment

*

Previous post:

Next post:

Copyright © 2007 - 2012 Excel with Monarch Training and Services. All rights reserved. Privacy Policy
Microsoft Excel™ Microsoft Corporation. Monarch and Monarch Pro™ Datawatch Corporation.