
Personality allows us to predict behavior and classify thought processes, so how can we formulate it using modern technology?
Name: Blayton Jones | Majors: Computer Science, Computer Engineering | Semester: Spring 2025
I have always found the intersection of psychology and computer science to be a
fascinating and relatively unexplored frontier. After learning how computer
science can be used to quantify language and its meaning using natural language processing and
information retrieval, I chose to begin my thesis research in Dr. Susan Gauch’s lab in my
junior year. An idea sparked in my mind: was there a way to quantify a writer’s
personality purely from their language usage and sentence structure using algorithms and
artificial intelligence? This idea is not entirely new, but much of the prior research
in this field was conducted either by computer scientists who did not have a firm grasp of the
accepted methods in the field of psychology, or by psychologists who did not have an automatic
method to derive personality. More specifically, much of the research done by computer scientists
relies on what is known as the Myers-Briggs test, which is not very accurate, as it tends to
shove people into dichotomous boxes. As such, I wanted to approach the research using the Big
Five Inventory, which scores people on a percentage scale across five traits:
conscientiousness, agreeableness, neuroticism, openness, and extraversion. If we could manage
to find a link between writing styles and these scores, we may be able to automate personality
classification.
My first hurdle was tracking down data that would reveal any trends. My
initial thesis and most of this research was based on findings from Dr. James W. Pennebaker
from the University of Texas at Austin. I decided to contact him to see if he had any data I could
use to help train an AI to predict personality, and to my surprise, he had a collection of writing
samples from people along with their associated Big Five personality scores. The samples were
written in response to what is known as a thematic apperception test. These tests show a black-and-white
picture with an ambiguous setting as a stimulus for the patient to project onto. For example, a
picture of two people talking with red solo cups may be interpreted as them drinking alcohol at a
party, which may suggest a troubled history with alcohol. The validity of these tests has
consistently been called into question, but they remain a useful way to elicit writing from a
participant for the purpose of gathering data.
This project uses a clever mix of tools to analyze people’s writing and predict their
personality traits from it. First, the raw data is cleaned to remove malicious or
low-quality responses (spam or overly short entries), leaving a total of 7,290 valid entries.
Then, the remaining texts are processed in three ways: one tool checks for emotional tone by
matching words with emotions like anger or joy, another identifies grammar patterns and even
typos, and a third—using a powerful AI model called BERT—transforms the writing into a
format the computer can deeply understand. These features are all combined and fed into
different types of models, including basic ones like linear regression and advanced ones like
neural networks, to see which can best guess someone’s personality. The goal is to move beyond
surface-level analysis and use the hidden patterns in language—like how emotional, structured,
or polished the writing is—to make accurate predictions. Due to the amount of data and power
required to process it, I used the university’s high-performance computing center. Furthermore, my faculty
mentor helped explain the concepts of large language models, such as BERT, to help me better
structure my approach and architecture to tackle the data in a concise and appropriate manner.
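The cleaning and feature-extraction steps described above can be sketched roughly as follows. This is a minimal stand-in rather than the actual pipeline: the tiny emotion lexicon, the spam heuristic, and the 50-word threshold are all illustrative assumptions, and the real system would also append full grammar/typo features and a BERT embedding to each vector.

```python
import re

# Toy emotion lexicon -- illustrative only; a real pipeline would use a
# full lexical resource mapping thousands of words to emotion categories.
EMOTION_LEXICON = {
    "angry": "anger", "furious": "anger",
    "happy": "joy", "delighted": "joy",
    "afraid": "fear", "worried": "fear",
}

def is_valid(text: str, min_words: int = 50) -> bool:
    """Mirror the cleaning step: drop spam and too-short responses."""
    words = text.split()
    if len(words) < min_words:
        return False
    # Crude spam check: reject texts dominated by one repeated token.
    most_common = max(words.count(w) for w in set(words))
    return most_common / len(words) < 0.5

def emotion_features(text: str) -> dict:
    """Count lexicon hits per emotion category, normalized by length."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = {"anger": 0, "joy": 0, "fear": 0}
    for tok in tokens:
        if tok in EMOTION_LEXICON:
            counts[EMOTION_LEXICON[tok]] += 1
    n = max(len(tokens), 1)
    return {emo: c / n for emo, c in counts.items()}

def feature_vector(text: str) -> list:
    """Concatenate hand-crafted features into one vector; in the full
    pipeline, POS-tag statistics and a BERT sentence embedding would be
    appended here before the vector is fed to the models."""
    emo = emotion_features(text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    avg_len = sum(len(s.split()) for s in sentences) / max(len(sentences), 1)
    return [emo["anger"], emo["joy"], emo["fear"], avg_len]
```

Keeping every feature in a single flat vector is what lets the same inputs be fed interchangeably to linear regression and to the neural models.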
Models were evaluated using mean squared error during training and mean absolute error
(MAE) for interpreting results, along with percentage-based thresholds for accuracy. Linear
regression performed best for traits like conscientiousness and neuroticism, while CNNs and
RNNs showed strengths in agreeableness and openness, respectively. Overall, all models
significantly outperformed random guessing, indicating that linguistic patterns can indeed be
used to predict personality traits with moderate accuracy.
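The evaluation metrics above can be expressed compactly. A minimal sketch, assuming trait scores on a 0–100 percentage scale; the 10-point tolerance for the threshold accuracy is an example value, not the one used in the study:

```python
def mse(y_true, y_pred):
    """Mean squared error, used as the training loss."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    """Mean absolute error, used to interpret results."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def threshold_accuracy(y_true, y_pred, tol=10.0):
    """Fraction of predictions within `tol` percentage points of the
    true trait score (tol=10 is an assumed example tolerance)."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if abs(t - p) <= tol)
    return hits / len(y_true)
```

MAE is easier to read off directly ("on average the prediction misses by N points"), while MSE penalizes large misses more heavily during training.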
So where do most of the faults lie? The most obvious limitation is the amount of data
available to learn from. Roughly 7,000 entries sounds like a lot, but it is small for a neural network of this size
and the desired accuracy. If I can find more sources of data, the model may become more
accurate. Moreover, there is likely more to be discovered in terms of what features we can extract. The
methods used in this paper are relatively basic; enhancing feature engineering—such as
incorporating more nuanced syntactic or semantic cues, word embeddings with domain-specific
adaptations, or emotion profiles that capture shifts in sentiment—can provide the model with
richer contextual information, potentially leading to more accurate predictions. One promising
direction is to extract universal linguistic features beyond POS tags, such as voice and mood
indicators. Furthermore, I am currently working towards submitting this research to The First
Workshop on Integrating NLP and Psychology to Study Social Interactions (NLPSI).
This suggests that, in theory, we could reverse-engineer personality traits from the words
a person uses. While this might sound ambitious—perhaps even like science fiction—I believe
this is a crucial step toward accurately predicting human behavior on a large scale. If we can
rigorously define a person (their personality) and their environment (external stimuli), we should,
in principle, be able to predict their actions. Of course, the Big Five personality traits alone do
not provide a complete representation of a person, but they are a valuable step toward
formalizing behavior in a computationally meaningful way. I believe that by refining our models
and incorporating more nuanced data, we can move closer to developing large-scale predictive
algorithms for human behavior.