Conceptual and simulation models can function as useful pedagogical tools; however, it is important to categorize different outcomes when evaluating them in order to interpret results more meaningfully. VERA is an ecology-based conceptual modeling software that enables users to simulate interactions between biotic and abiotic components of an ecosystem, allowing users to form and then verify hypotheses by observing a time series of the species populations. In this paper, we classify these time series into common patterns found in the domain of ecological modeling through two methods, hierarchical clustering and curve fitting, illustrating a general methodology for establishing content validity when combining different pedagogical tools. We applied this methodology to a diverse sample of 263 models containing 971 time series collected from three categories of VERA users: Georgia Tech (GATECH), North Georgia Technical College (NGTC), and “Self Directed Learners”. The two classification methods agreed on 89.38% of the sample curves in the test set, a good indication that our methodology for determining content validity was successful.
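As a rough illustration of the two classification routes mentioned above (hierarchical clustering of normalized population curves and template curve fitting), the sketch below uses SciPy; the function names and the logistic template are illustrative assumptions, not the exact VERA pipeline.

```python
# Minimal sketch: cluster population time series by shape, and fit one
# candidate template curve. Assumes all series share the same length.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.optimize import curve_fit

def cluster_series(series, n_clusters=4):
    """Group population time series by shape via hierarchical clustering."""
    # Normalize each series so clustering reflects shape, not scale.
    X = np.array([(s - s.mean()) / (s.std() + 1e-9) for s in series])
    Z = linkage(X, method="ward")
    return fcluster(Z, t=n_clusters, criterion="maxclust")

def logistic(t, K, r, t0):
    """One candidate template curve (logistic growth)."""
    return K / (1.0 + np.exp(-r * (t - t0)))

def fit_template(t, y):
    """Fit the logistic template; return parameters and mean squared residual."""
    p0 = [y.max(), 1.0, t[len(t) // 2]]
    params, _ = curve_fit(logistic, t, y, p0=p0, maxfev=10000)
    residual = np.mean((logistic(t, *params) - y) ** 2)
    return params, residual
```

In a curve-fitting classifier, each series would be fit against several such templates (e.g., exponential, logistic, oscillatory) and assigned to the template with the lowest residual, which can then be compared against the cluster assignments.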
2022
Ph.D. Thesis
Development of Trustworthy Image Classification Systems Within a Sociotechnical Context
Growing cases of disparate outcomes caused by Machine Learning (ML)-based systems have made trustworthiness a pressing concern rather than an afterthought. Although there is considerable debate about whether facial processing technology (FPT) fuels criminal justice disparities or can be used as a tool to ameliorate them, I investigate whether the responsible design, development, evaluation, and interpretation of FPT can help address racial inequalities in the Miami-Dade County (Florida, U.S.) criminal justice system. Guided by several interdisciplinary research questions, my research methodology proposes experimentation-based methods to address fairness and bias issues within end-to-end deep learning image classification. I design an equitable methodology for generating and interpreting racial categories using mugshots from the Miami-Dade County criminal justice system. By considering race as a multidimensional construct, I assess the performance of eight deep CNN-based architectures when classifying mugshots according to binary race and four race/ethnicity categories. By proposing a rigorous “self-auditing” model evaluation strategy, I provide empirical support for improving the disaggregated evaluation when predicting Black mugshots by 0.22% to 34.27% compared to White mugshots. Lastly, by implementing “post-hoc” gradient-based saliency maps, I assess the consistency of facial region relevance when a model generates a racial prediction, and I make cautionary arguments about using a deep learning approach in a high-stakes decision-making domain, in an effort to foster greater ML trustworthiness among criminal justice stakeholders.
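To make the disaggregated (“self-auditing”) evaluation concrete, the sketch below computes a metric per subgroup instead of one aggregate number; the column names and groups are assumptions for illustration, not the thesis code.

```python
# Hedged sketch of a disaggregated evaluation: report accuracy per group
# and the gap relative to the best-performing group.
import pandas as pd
from sklearn.metrics import accuracy_score

def disaggregated_report(df, group_col="race", y_true="label", y_pred="pred"):
    """Return per-group accuracy and the gap relative to the best group."""
    rows = []
    for group, sub in df.groupby(group_col):
        rows.append({"group": group,
                     "n": len(sub),
                     "accuracy": accuracy_score(sub[y_true], sub[y_pred])})
    report = pd.DataFrame(rows)
    report["gap_vs_best"] = report["accuracy"].max() - report["accuracy"]
    return report.sort_values("accuracy", ascending=False)
```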
AI & Society
Detecting Racial Inequalities in Criminal Justice: Towards an Equitable Deep Learning Approach for Generating and Interpreting Racial Categories using Mugshots
Rahul K. Dass, Nick Petersen, Marisa Omori, and 2 more authors
AI & Society (Springer): Special Issue on AI for People, 2022
Recent events have highlighted large-scale systemic disparities in U.S. criminal justice based on race and other demographic characteristics. Although criminological datasets are used to study and document the extent of such disparities, they often lack key information, including arrestees’ racial identification. As AI technologies are increasingly used by criminal justice agencies to make predictions about outcomes in bail, policing, and other decision-making, a growing literature suggests that the current implementation of these systems may perpetuate racial inequalities. In this paper, we argue that AI technologies should be investigated to understand how they recognize racial categories and whether they can be harnessed to fill missing race data. By bridging this gap, we can work towards better understanding racial inequalities in a wide range of contexts, most notably criminal justice. Using a multidisciplinary perspective, we rethink the design and methodology used in facial processing technology (FPT) based on supervised deep learning model (DLM) image classification. By modifying standard FPT pipelines to tackle multiple sources of DLM bias, we propose an experimental methodology based on ethical AI principles to generate binary (Black and White) racial categories using mugshots. We go beyond simply reporting DLM accuracies and address fundamental issues such as generalizability and interpretability by using a “self-auditing” approach. First, we evaluate the inference performances of 42 fine-tuned DLMs using unseen test images from the same dataset but subject to varying data augmentations. Next, to interpret and validate our methodological approach, we apply gradient-based saliency maps to assess the consistency of facial region relevance and attribution. Finally, drawing upon insights from three areas (computer science, sociology, and law), we investigate the efficacy of our DLM-based method as a tool for detecting racial inequalities in criminal justice.
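For readers unfamiliar with gradient-based saliency maps, the sketch below is a minimal vanilla-gradient version for a PyTorch classifier; it illustrates the general technique under stated assumptions and is not necessarily the exact attribution method used in the paper.

```python
# Minimal sketch: |d(class score)/d(pixel)| for one image, reduced over channels.
import torch

def saliency_map(model, image, target_class=None):
    """Return a (H, W) map of input-gradient magnitudes for one image."""
    model.eval()
    x = image.unsqueeze(0).requires_grad_(True)   # shape (1, C, H, W)
    scores = model(x)
    if target_class is None:
        target_class = scores.argmax(dim=1).item()
    scores[0, target_class].backward()
    # Absolute gradient, max over color channels, as the relevance map.
    return x.grad.detach().abs().squeeze(0).max(dim=0).values
```

Consistency of facial region relevance can then be checked by comparing such maps across models and across augmentations of the same face.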
2020
CRV
It’s Not Just Black and White: Classifying Defendant Mugshots Based on the Multidimensionality of Race and Ethnicity
Rahul K. Dass, Nick Petersen, Ubbo Visser, and 1 more author
In the 17th Conference on Computer and Robot Vision, 2020
Analyses of existing public face datasets have shown that deep learning models (DLMs) grapple with racial and gender biases, raising concerns about algorithmic fairness in facial processing technologies (FPTs). Because these datasets are often composed of celebrities, politicians, and mainly white faces, increased reliance on more diverse face databases has been proposed. However, techniques for generating more representative datasets remain underdeveloped. To address this gap, we use the case of defendant mugshots from Miami-Dade County’s (Florida, U.S.) criminal justice system to develop a novel technique for generating multidimensional race-ethnicity classifications for four groups: Black Hispanic, White Hispanic, Black non-Hispanic, and White non-Hispanic. We perform a series of experiments by fine-tuning seven DLMs using a full sample of mugshots (194,393) with race-ethnicity annotations from court records and a random stratified subsample of mugshots (13,927) annotated by a group of research assistants. Our methodology treats race as a multidimensional feature, which is particularly important for a more diverse face dataset, and uses an averaged (consensus-based) approach to achieve a 74.84% accuracy rate based on annotated data representing only 2% of the full dataset. Our approach can help make DLM-based FPTs more inclusive of the various subcategories of race and ethnicity as these technologies are increasingly adopted by organizations, including the criminal justice system.
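The averaged (consensus-based) idea can be illustrated with a short sketch that pools softmax outputs from several fine-tuned models before taking the argmax; the model objects and class ordering below are assumptions for the example, not the paper’s implementation.

```python
# Hedged sketch: consensus prediction by averaging per-class probabilities
# across an ensemble of fine-tuned classifiers.
import torch
import torch.nn.functional as F

def consensus_predict(models, image, classes):
    """Average softmax outputs across models and return the consensus label."""
    x = image.unsqueeze(0)  # (1, C, H, W)
    with torch.no_grad():
        probs = torch.stack([F.softmax(m(x), dim=1) for m in models]).mean(dim=0)
    return classes[probs.argmax(dim=1).item()]

# Example usage (hypothetical models and class order):
# label = consensus_predict(
#     [model_a, model_b, model_c], image,
#     ["Black Hispanic", "White Hispanic", "Black non-Hispanic", "White non-Hispanic"])
```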