Analyzing dirty data (literature): narrators in Ernest J. Gaines’s story “Just Like a Tree”; also Power BI on-object formatting
On my About page, I claim that “I studied English Literature because it’s fun to find patterns in the unstructured data of a 1,000-page novel.” Nobody ever challenges me on this statement, so I thought I would give an example from my college years. I first read the short story “Just Like a Tree” when I was in community college. The story follows a series of ten narrators as people gather to say goodbye to a woman named Aunt Fe. Each section has the name of the narrator as a heading, and no narrator is repeated. One thing that excited me when reading this story was a sense of pattern in how the narrators were arranged. What I noticed right away was that the first and fifth narrators were both young boys, and the sixth and tenth narrators were both old women.
Want to read “Just Like a Tree” for yourself? I was delighted to discover that JSTOR has it online, and offers a generous number of items to read with a free account. JSTOR: “Just Like a Tree.” You can also download my Power BI analysis of “Just Like a Tree” at my GitHub.
Categorizing the metadata
Seeing the way that the first five narrators open and close with a young boy, and that the last five open and close with old women, I made a list not unlike the top-left table, listing the character names, their gender, and relative age. I decided to categorize by gender and age, and I broke out age as young, adult, or old. When I listed the narrators out like this I saw a chiastic structure, similar to that found in texts such as the Hebrew scriptures. I drew lines between Chuckkie & Ben O and Emile & James in the first half, and between Aunt Clo & Aunt Lou and Chris & Etienne in the second. The pattern is ABCBA;DBCBD if you treat the young boys and old women separately, or ABCBA;ABCBA if they are categorized together.
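The pattern can also be checked outside of Power BI. Below is a minimal Python sketch, not the formula used in the report. The list `NARRATORS` and the helpers `pattern` and `halves` are names made up for this illustration, and the placement of Emile, Leola, and James in positions 2 to 4 and of Chris, Anne-Marie Duvall, and Etienne in positions 7 to 9 is an assumption based on the pairings described above.

```python
# Narrators as (name, gender, age). The exact order of positions 2-4 and 7-9
# is an assumption inferred from the pairings described in the post.
NARRATORS = [
    ("Chuckkie", "male", "young"),
    ("Emile", "male", "adult"),
    ("Leola", "female", "adult"),
    ("James", "male", "adult"),
    ("Ben O", "male", "young"),
    ("Aunt Clo", "female", "old"),
    ("Chris", "male", "adult"),
    ("Anne-Marie Duvall", "female", "adult"),
    ("Etienne", "male", "adult"),
    ("Aunt Lou", "female", "old"),
]


def pattern(keys):
    """Assign letters A, B, C, ... to category keys in order of first appearance."""
    letters = {}
    out = []
    for key in keys:
        if key not in letters:
            letters[key] = chr(ord("A") + len(letters))
        out.append(letters[key])
    return "".join(out)


def halves(s):
    """Split a pattern string into its first and second half."""
    mid = len(s) // 2
    return s[:mid], s[mid:]


# Young boys and old women treated as separate categories.
separate = halves(pattern((gender, age) for _, gender, age in NARRATORS))

# Young and old collapsed into one category, regardless of gender.
combined = halves(pattern(
    (gender, age) if age == "adult" else "young or old"
    for _, gender, age in NARRATORS
))

for first, second in (separate, combined):
    # A half is chiastic when it reads the same forwards and backwards.
    print(f"{first};{second}", first == first[::-1], second == second[::-1])
```

Each half is tested as a palindrome, which is all a chiastic arrangement amounts to once the narrators are reduced to category letters.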
The story takes place in a sharecropper’s shack in rural Louisiana. All of the characters are Black except for Anne-Marie Duvall, who attends from a sense of duty to her father and grandfather. The impact of these narrators is to give a 360 degree view of the two subjects of the story: 1. Aunt Fe, whose daughter and son-in-law Leona and James have decided to move her away from her shack to escape bombings of Black family homes, and 2. Emmanuel, a young man who is involved in the Civil Rights movement.
Visualizing the metadata
In the first column, I have the list of narrators by name, showing columns for gender and age. I created a formula to show a different value for adult male, adult female, and a third value for the young and old regardless of gender. I applied this formula on the narrator name as conditional formatting to highlight the chiastic structure. I also used conditional formatting on Gender and Age to quickly show the different values. Beneath that table is a chord diagram showing the relationship between age group (adult or other) and gender. From the chord diagram, you can see that although balanced, the demographics of the narrators skew toward adult and male.
In the second column, I break out the narrators by gender as a pie chart and as a column chart: 60% male and 40% female. The third column shows the breakout by age. Its pie chart shows 60% adult, 20% young, and 20% old, and its column chart shows 6 adult and 4 other (old or young).
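The same breakdowns can be reproduced with a quick tally. This is a hedged Python sketch rather than anything taken from the Power BI file. The `narrators` list and the `shares` helper are hypothetical names, and only gender and age are recorded because narrator order does not affect the counts.

```python
from collections import Counter

# (gender, age) for each of the ten narrators, as categorized in the post.
narrators = [
    ("male", "young"), ("male", "adult"), ("female", "adult"),
    ("male", "adult"), ("male", "young"), ("female", "old"),
    ("male", "adult"), ("female", "adult"), ("male", "adult"),
    ("female", "old"),
]


def shares(counts):
    """Convert raw counts to whole-number percentages of the total."""
    total = sum(counts.values())
    return {key: round(100 * n / total) for key, n in counts.items()}


gender_counts = Counter(gender for gender, _ in narrators)
age_counts = Counter(age for _, age in narrators)
# Age group as used in the column chart: adult versus other (young or old).
group_counts = Counter("adult" if age == "adult" else "other" for _, age in narrators)

print(shares(gender_counts))
print(shares(age_counts))
print(dict(group_counts))
```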
This analysis just scratches the surface of the story. There’s a strong tension in the story between tradition and progress. When I wrote about this story for class, my thesis was that both Aunt Fe and Emmanuel represent tradition, rooted in the past and growing, while James and others want to eliminate the past and escape through alcohol or travel. Anne-Marie, Aunt Fe, and Aunt Lou are traditional and James is progressive, but Emile envies the progress represented by Chris’s new red tractor. Chuckkie, Emile, and Leola are all related, as are James and Ben O. The first five narrators are all from two families: the family of Emile, Leola, and Chuckkie, who want Aunt Fe to stay; and her relatives, James, Leona, and Ben O, who want to move her. This establishes the main conflict of the story.
Dirty data done dirt cheap
A common complaint about data cleaning is that it is not true analysis. But to prep raw data into something that you can use with elegant formulas or show in insightful charts, you need to look at the data, think about it, and make decisions about how to categorize it. This is true whether you start with a fictional narrative or with maintenance work order records. And if you are building an analysis on top of data that someone else got into shape for you, you may be missing information essential to making good decisions.
*
A word about on-object formatting in Power BI
This was my first time trying out on-object formatting, and it felt pretty intuitive. Closing out the format pane felt great, and then I saw I could close out the data pane as well. Fortunately, there are ribbon buttons to get them back.
Conditional formatting is under Cell elements for a table visual on the Format pane. I was impressed to see it come up when I searched for conditional formatting!
On-object formatting is the latest enhancement to the report authoring experience in Power BI. I’m glad to see these improvements to the usability of the product. These changes, along with easier management of row-level security and better support for administration and governance, reduce friction for reporting teams and make it easier to focus on delivering a great experience for report consumers.