
My mentor and I at the Undergraduate Research Symposium with my poster
Author: Kacey Haws | Major: Computer Science | Semester: Fall 2024
I’m Kacey Haws, and this past year, I had the opportunity to conduct my undergraduate thesis as part of the honors program for my bachelor’s degree in computer science, awarded May 2025. I originally thought about a few different subjects to do my thesis on, including algorithms, game design, and quantum computing. However, last summer, I completed an international internship with Realtra Space Systems Engineering in Dublin, Ireland, which sparked a new passion for my study – aerospace engineering. I connected with Dr. Neelakshi Majumdar from the mechanical engineering department, and met with her in August to begin discussing a path for my thesis in her ASYST (Aerospace Systems Engineering and Transportation) lab.
When I met with Dr. Majumdar, we discussed my interests and how I could apply my background in computer science to research in aerospace engineering. We settled on analyzing unmanned aircraft system (UAS) incidents from the NASA Aviation Safety Reporting System (ASRS) using unsupervised machine learning. The ASRS takes in incident reports from the government, companies, and citizens to address issues in the National Airspace System (NAS). These reports are text-based, which makes them time-consuming and technically challenging to read through manually. Previous work in the ASYST lab parsed them manually, selecting UAS incidents from January 2013 to August 2023. The goal of my research was to analyze these same narratives with unsupervised machine learning to identify patterns in common incidents and ways to prevent them, and to compare the results with the findings from manual analysis.
I then began building my machine learning analyses. I found 57 reports involving UAS operations from that time frame. Each report includes information such as time, place, flight conditions, and aircraft used, alongside a narrative detailing the incident. I downloaded these reports into one file and exported just the narratives to a text file so I could easily use them in my machine learning setups. Using a basic word cloud generator, I identified words that were too common to be useful and removed them, which slimmed down the file. I used Google Colab for its free high-powered computing resources and its common use in training AI models, and I implemented my models in Python 3.
I used three unsupervised machine learning methods: clustering, topic modeling, and n-grams. Unsupervised machine learning differs from supervised machine learning in that the latter trains a model on well-labeled data before it is given data to analyze, while the former finds patterns without labeled outcomes. Clustering creates groups based on the patterns it finds in the data. This method is particularly helpful with large datasets and analyzes relationships between the chunks of data. In my setup, it produced a list of the top ten words for each of a defined number of clusters. The second method I chose was topic modeling, which generates topics based on the information given. It returns a list of keywords found in the narratives by analyzing how often each word appears in the text. In my setup, it produced bar graphs for a set number of topics, showing the seven most frequently appearing words in each topic. The last method I chose to implement was n-grams. An n-gram is a contiguous sequence of n words from a given text, and the method ranks these sequences by frequency. Bigrams are sequences of two words, trigrams three words, and so on.
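To make the n-gram idea concrete, here is a minimal sketch of counting the most frequent word sequences using only the Python standard library. It is not the code from my thesis: the function name top_ngrams, the simple regular-expression tokenizer, and the sample sentences are all assumptions made up for illustration, and the sentences are not real ASRS narratives.

```python
from collections import Counter
import re


def top_ngrams(texts, n, k=10):
    """Return the k most frequent n-word sequences across a list of narratives."""
    counts = Counter()
    for text in texts:
        # Lowercase and keep only alphanumeric tokens (a simplifying assumption).
        tokens = re.findall(r"[a-z0-9]+", text.lower())
        # Slide a window of n tokens over each narrative separately, so that
        # no sequence spans two different reports.
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts.most_common(k)


# Made-up sentences for illustration only; these are not ASRS narratives.
sample_narratives = [
    "I lost visual line of sight while checking the app.",
    "The pilot kept visual line of sight during the whole flight.",
    "Visual line of sight was blocked by trees near the park.",
]

print(top_ngrams(sample_narratives, n=4, k=1))
```

In this toy example, the phrase that appears in all three sentences rises to the top of the 4-gram list. Changing n to 2 or 3 gives bigrams or trigrams from the same function.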
My clustering results included a silhouette score analysis, which measures how well separated and well formed the clusters are: higher scores indicate better-defined groups, and lower scores indicate less-defined ones. The highest score was 0.039 at 10 clusters. After setting the number of clusters to ten, I produced a list of the top words for each cluster to infer themes. For example, cluster 6 included words like “app,” “DJI” (an app used to interface between the pilot and the drone), and “b4ufly” (another app showing recreational users where they can and can’t fly), and could be understood to address issues with different flight software. For topic modeling, I presented the top seven words for each topic. Topic 7 included words like “park,” “recreation,” and “controller,” which suggest incidents during recreational flying operations or in recreational areas. For n-grams, I listed the top 10 most common sequences for each given n. I found 4-grams to be the most useful for analysis: they offered the information most related to distinct topics while still appearing frequently, giving insight into issues with software (“the dji fly app,” “the b4ufly app and”), communication (“the ground control station”), and spatial awareness (“visual line of sight,” “too close to the”). I classified the patterns I found in the narratives to compare them to the findings of manual analysis, which found that human factors like errors and violations dominated the issues found, while about 25% of issues were attributed to hardware and software problems. Out of the ten clusters, ten topics, and the three-, four-, and five-gram lists, I attributed eighteen patterns to human factors: seventeen to errors involving altitude, communication, and shared/restricted airspace, and one to violations (authorization). The other five patterns were attributed to hardware/software malfunctions.
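The silhouette score mentioned above can be sketched from its definition. For each point, a is the mean distance to the other members of its own cluster, b is the smallest mean distance to the members of any other cluster, and the point's score is (b - a) / max(a, b). The overall score is the average over all points. The version below is a from-scratch illustration, not the code from my thesis: it assumes Euclidean distance, at least two clusters, and made-up 2D points rather than real narrative features.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points, using Euclidean distance."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    total = 0.0
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            # By convention, a point alone in its cluster scores 0.
            continue
        # a: mean distance to the other members of the same cluster
        # (the point's distance to itself is 0, so it adds nothing to the sum).
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(point, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


# Made-up points for illustration: two tight pairs that are far apart.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good_labels = [0, 0, 1, 1]  # groups the nearby points together
poor_labels = [0, 1, 0, 1]  # splits each nearby pair across clusters

print(silhouette_score(points, good_labels))
print(silhouette_score(points, poor_labels))
```

Scores range from -1 to 1. Values near 1 indicate tight, well-separated clusters, while values near 0 indicate clusters that overlap. In practice, the score is computed for several candidate cluster counts and the count with the highest score is kept.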
I found that unsupervised machine learning through clustering, topic modeling, and n-grams offers an efficient and easily scalable approach to discovering patterns and trends throughout UAS incident reports from the NASA ASRS database. It also complemented manual analysis, producing a similar understanding of these narratives.
My next step is to complete a master’s degree in aerospace engineering sciences at the University of Colorado Boulder. I will pursue a programming-heavy course of study, with an emphasis on autonomous systems. This will allow me to build on the machine learning foundation from this thesis and apply the concepts to the aerospace industry. This grant assisted me in my last semester at the university, including preparing to move and for graduate school as a whole. I am deeply thankful to the Honors College for providing me with this grant, and to Dr. Majumdar for her support throughout my research.