
Recreating baselines in our lab
Author: Jonathan Ivey | Major: Data Science and Mathematics | Semester: Spring 2024
Hello, my name is Jonathan Ivey. I am currently a data science and mathematics major working under Dr. Susan Gauch in the Computer Science Department, and in the spring of 2024, I started researching methods to automatically distinguish diverse perspectives on controversial topics. Much research has used machine learning and natural language processing to classify sentences by their sentiment or stance on a particular topic, but an interesting problem is whether people who share a stance hold it for the same reasons. For example, two people may both post online that abortion should be legal, but one may argue that legality protects bodily autonomy, while the other may argue that it prevents overpopulation. These two views are very different, and being able to model those types of differences is useful in many applications, such as understanding qualitative survey results or identifying the pros and cons of online products.
In the fall of 2023, I finished my previous research project in Dr. Gauch’s lab and began reviewing literature related to this new topic. I analyzed current methods, which mostly focus on creating a model that can classify texts in isolation without considering the greater context and structure of the argument space. Intrigued by that gap, I decided to research ways that graphs could be used to create more structure, new visualizations, and, hopefully, a better model. At the beginning of 2024, I found a paper that had already attempted to do this using what it called synoptical graphs. Its data structure took inspiration from the language arts and created a graph that was meant to mimic the way that we analyze texts. It included edges that represented contextual relations between different statements (e.g., whether one statement entails or contradicts another). This comparative structure allowed the model to transmit relevant information across statements and better capture context in discussions.
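To make the idea of contextual edges concrete, here is a minimal sketch of a graph whose nodes are statements and whose labeled edges record whether one statement entails or contradicts another. It is only an illustration under my own assumptions, not the original paper's implementation: the class name `StatementGraph`, its methods, the relation labels, and the example statements are all hypothetical, and the relations are assigned by hand rather than predicted by a model.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class StatementGraph:
    """Hypothetical toy structure: statements are nodes, labeled edges are relations."""

    statements: list[str] = field(default_factory=list)
    # edges[source_index] holds (target_index, relation_label) pairs
    edges: dict[int, list[tuple[int, str]]] = field(
        default_factory=lambda: defaultdict(list)
    )

    def add_statement(self, text: str) -> int:
        """Store a statement and return its node index."""
        self.statements.append(text)
        return len(self.statements) - 1

    def add_relation(self, source: int, target: int, relation: str) -> None:
        """Add a directed, labeled edge from one statement to another."""
        self.edges[source].append((target, relation))

    def related(self, source: int, relation: str) -> list[str]:
        """Return the statements reachable from `source` by edges with this label."""
        return [self.statements[t] for t, r in self.edges[source] if r == relation]


# Hand-labeled example relations (assumed for illustration only)
graph = StatementGraph()
a = graph.add_statement("The battery lasts all day.")
b = graph.add_statement("The battery lasts at least a few hours.")
c = graph.add_statement("The battery dies within an hour.")
graph.add_relation(a, b, "entails")
graph.add_relation(a, c, "contradicts")

print(graph.related(a, "contradicts"))
```

In a real system the relation labels would come from a model rather than being typed in, and a graph neural network would pass information along these edges instead of simply looking them up.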
I wanted to improve upon their ideas, but before I could do that, I needed to reproduce their models and results, and that process has taken most of this semester. The two main barriers I have faced have been a lack of knowledge about graph neural networks and a lack of thorough documentation for the original paper’s models. Though the lack of documentation is something that can only be overcome through careful testing, my understanding of graphs has improved substantially this semester. I have used many online resources, as well as what I am learning in my algorithms and social network analysis classes, to get a clearer view of how the models work in practice. I have also received great feedback through the regular meetings that I have with my research mentor and labmates. By listening to their feedback and writing a lot of code, I not only got closer to reproducing the original study’s results, but I also got ideas for how I want to improve the methods in my own research.
I am interested in investigating ways to improve the contextual edges of the graph with generative models, which could supply more in-depth analysis that is not available in the texts themselves. I am also interested in finding ways to take advantage of the graph structures for visualizing discussion spaces. This summer I will be taking a break from this work to do research at the University of Michigan, but I hope to discuss these ideas with my mentors there and develop them further, so that when I return in the fall, I can begin testing them.