Ruby Mendenhall Charts Progress Using HPC, Big Data to Flag Unidentified Historical Sources on African American Women’s Lives
Using HPC resources allocated to researchers through XSEDE, Mendenhall and a multidisciplinary team are gleaning lessons on how Black women related to and affected the larger society during periods when their written voices were underrepresented or even illegal.
By Ken Chiacchia, Pittsburgh Supercomputing Center
Information about the lives and experiences of Black women can be gleaned from surprising historical literary sources, Ruby Mendenhall of the University of Illinois Urbana-Champaign said on July 25 in a plenary talk at the PEARC18 conference in Pittsburgh, Pa. Using HPC resources allocated to researchers through XSEDE, Mendenhall and a multidisciplinary team are learning lessons on how Black women related to and affected the larger society during periods when their written voices were underrepresented or even illegal.
"We're using advanced computing to recover Black women's history," said Mendenhall, who is an Associate Professor in Sociology and African American Studies at Urbana-Champaign and newly appointed Assistant Dean for Diversity and Democratization of Innovation at the Carle Illinois College of Medicine. "How is inequality expressed or hidden in the everyday lives of African American women? How do they seek to challenge that inequality?"
The annual Practice and Experience in Advanced Research Computing (PEARC) conference—with the theme Seamless Creativity—stresses key objectives for those who manage, develop and use advanced research computing throughout the U.S. and the world. This year's program offered tutorials, plenary talks, workshops, panels, poster sessions and a visualization showcase.
Mendenhall said she first became aware of the resources available for the social sciences through exposure to the National Center for Supercomputing Applications (NCSA) at Urbana-Champaign, a member of XSEDE, an NSF-funded virtual organization that integrates and coordinates access to advanced cyberinfrastructure. Working with Michael Simeone and other staff at NCSA, she obtained a series of allocations in the XSEDE system that have allowed her group to analyze about 800,000 documents, dating from 1746 to 2014, in the JSTOR and HathiTrust databases.
"Our motivation was that often literature by or about Black women was inaccessible or illegal," she said. "A lot of the voices and experiences are either not in the literature early on, or are under-represented."
Mendenhall and her collaborators analyzed the databases with two sets of keywords, those referring to race and those referring to gender. They conducted their analyses within the theoretical framework of standpoint theory, which posits that social and political experiences shape individuals' perspectives and positions. They queried the databases with two computational tools: latent Dirichlet allocation (LDA), a statistical model that infers the collection of topics found in a text; and comparative text mining (CTM), which identifies similarities and differences among the topics under which words fall.
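For readers unfamiliar with LDA, the following is a minimal sketch of the general technique, not the team's code. It fits a topic model by collapsed Gibbs sampling using only the Python standard library. The function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, and the toy documents are all assumptions made for illustration; none of them comes from the project described here.

```python
import random
from collections import defaultdict

def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA to tokenized docs (lists of words) by collapsed Gibbs sampling.

    Returns (theta, top_words): per-document topic proportions and the
    three highest-count words for each topic.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]            # words in doc d assigned to topic k
    topic_word = [defaultdict(int) for _ in range(num_topics)]  # count of word w in topic k
    topic_total = [0] * num_topics                          # total words in topic k
    assignments = []

    # Start from a random topic assignment for every word occurrence.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample the topic in proportion to
                # (doc's affinity for topic) * (topic's affinity for word).
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    theta = [
        [(doc_topic[d][k] + alpha) / (len(doc) + num_topics * alpha)
         for k in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    top_words = [
        sorted(vocab, key=lambda w: -topic_word[k][w])[:3]
        for k in range(num_topics)
    ]
    return theta, top_words

# Invented toy documents, for illustration only (not drawn from JSTOR or HathiTrust).
toy_docs = [
    ["court", "property", "freedom", "suit", "court"],
    ["freedom", "suit", "petition", "court"],
    ["hospital", "child", "diet", "doctor", "child"],
    ["doctor", "hospital", "diet", "care"],
]
theta, top_words = lda_gibbs(toy_docs, num_topics=2)
for k, words in enumerate(top_words):
    print(f"topic {k}: {words}")
```

Each row of `theta` gives one document's estimated mix of topics, and `top_words` is the kind of word list a researcher would inspect to decide what a topic is about, as with the court-and-property topic discussed later in the talk.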
Using XSEDE allocations on the former Blacklight, Greenfield and current Bridges systems at Pittsburgh Supercomputing Center (PSC), the group trained their algorithms using a subset of 20,000 texts known to be about Black women, then used those algorithms to identify potentially relevant texts in the larger databases. As a next step they reviewed the metadata from the positive results. Such a review of metadata is called an "intermediate reading." Many of those that passed that step received a traditional "close reading" of the full text by human subject experts to verify the works contained content relating to Black women.
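The train-then-search workflow described above can be sketched in a few lines. This is a simplified stand-in, not the group's algorithm: it swaps in a plain bag-of-words profile and cosine similarity where the team used LDA and CTM. The function names, the `threshold` value, the document IDs, and the toy texts are all hypothetical.

```python
import math
from collections import Counter

def build_profile(known_relevant):
    """Sum word counts over the texts already known to be relevant."""
    profile = Counter()
    for tokens in known_relevant:
        profile.update(tokens)
    return profile

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def flag_candidates(known_relevant, corpus, threshold=0.3):
    """Score every document in the larger corpus against the profile and
    return (doc_id, score) pairs at or above the threshold, best first.
    Flagged items would then go on to metadata review and close reading."""
    profile = build_profile(known_relevant)
    scored = [(doc_id, cosine(profile, Counter(tokens)))
              for doc_id, tokens in corpus.items()]
    flagged = [(doc_id, score) for doc_id, score in scored if score >= threshold]
    return sorted(flagged, key=lambda pair: -pair[1])

# Invented toy data, for illustration only.
known_relevant = [
    ["freedom", "suit", "court", "mother"],
    ["court", "petition", "freedom", "daughter"],
]
corpus = {
    "doc_a": ["court", "freedom", "petition", "verdict"],
    "doc_b": ["railway", "tariff", "cotton", "price"],
}
flagged = flag_candidates(known_relevant, corpus)
print(flagged)
```

In this toy run only `doc_a` shares vocabulary with the training texts, so it alone is flagged. The sketch automates only the first pass; the intermediate and close readings remain human steps.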
"Our results unfortunately supported the idea of writing as an act of privilege," Mendenhall said. "We often had to go through [writings by] Black men or White women" to glean information about Black women's lives. "You wouldn't think you would read about Black women or their lived experience in some of these works, but when we did a close reading there was information about that."
One intriguing result stemmed from a topic derived in the LDA analysis revolving around court proceedings and property.
"It was unclear whether the property referred to land or to Black women held as slaves," Mendenhall said, but the close readings subsequently confirmed that the result was both valid and corresponded to a known historical period: the "golden age" before 1846 in which enslaved Black women had some success challenging their own status and their children's status via the U.S. legal system.
Their analysis was consistent with this historical period, which saw 575 freedom suits, in 60% of which slaves won their freedom.
"We're capturing some of the real experiences of Black women" at a time when it was illegal for them to be literate, she said.
Another, darker finding concerned the use of Black women and their children as subjects in medical studies with incomplete or often absent consent. One article the Urbana-Champaign team identified, published in the American Journal of Diseases of Children in 1918, described the case of an undernourished 5-year-old Black child with chronic diarrhea. While the child's mother was only indirectly referenced in the paper, it offered some provocative insights into her relationship with the medical profession: she didn't or couldn't bring her child in for care for a year, until blood had appeared in the stool; the doctors referred them to a charity hospital, at which they were likely to receive inferior care relative to the hospital where the child had been assessed; and the child's reported diet prior to the symptoms suggested a typical diet for African Americans at that time and place.
Mendenhall said her team will next focus on the current state of Black women in the U.S. in the aftermath of the Great Recession, housing crisis, police shooting controversies, and other factors in what has been called a "new nadir in Black history" by historian Dr. Cha-Jua. The group is recruiting "citizen-scientists" to collect health data in real time and personal reporting via written or online journals to examine how gun violence affects public life and public health. Her team is hoping to collaborate with higi, a consumer health data tracking service with access to more than 217 million health measurements from over 6.9 million account holders at 11,000 centers around the U.S.
"We're asking how we can use cyberinfrastructure to capture unheard stories about violence," she said, stressing the importance of investigating correlations between violence and Black maternal and infant mortality, diabetes, cancer and other medical problems.
Mendenhall sees the new research as an integral part of her new appointment at the College of Medicine. "We want to see the community at the table" in charting a course for medical research at the college in which the flow of information moves in both directions. In addition to helping design studies that engage and earn community support, "I'm hoping that community members will come forward with health issues that they would like solved."