Abstract
NLP research in mental health has focused primarily on social media. Real-world
practitioners carry high caseloads and often work with domain-specific
variables for which modern LLMs lack context. We use a dataset collected from 644
participants, including individuals diagnosed with Bipolar Disorder (BD),
Schizophrenia (SZ), and Healthy Controls (HC). Participants undertook tasks
derived from a standardized mental health instrument, and the resulting data
were transcribed and annotated by experts across five clinical variables. This
paper demonstrates the application of contemporary language models in
sequence-to-sequence tasks to enhance mental health research. Specifically, we
illustrate how these models can facilitate the deployment of mental health
instruments, data collection, and data annotation with high accuracy and
scalability. We show that small models can annotate domain-specific clinical
variables and collect data for mental health instruments, and that they
perform better than commercial large models.