Abstract
Healthcare is one of the fastest growing fields for conversational agents (CAs), also known as chatbots. Compared with traditional human-machine interfaces, which focus on utilitarian features, chatbots offer a unique advantage in understanding the user’s intent and acquiring critical information from the user through natural conversation. More importantly, the perceived impartial and non-judgmental nature of machines may facilitate users’ honest self-disclosure when answering embarrassing or stigmatizing questions in medical interviews. To realize the potential of CAs, it is crucial to understand the content of users’ conversations, such as self-disclosure of sensitive personal information, so that the agent can react appropriately by offering empathy or encouraging elaboration. In this exploratory study, we designed a web-based chatbot that automatically conducted medical interviews on colorectal health. The interview consisted of embarrassing questions about stool description, diarrhea, constipation, and anal sex behavior that participants needed to answer. Interview conversations from 552 participants above the age of 35 were recorded. The participants’ responses were then analyzed in terms of their level of detail and whether they contained health or personal information that participants voluntarily disclosed. We tested a fine-tuned BERT model and the GPT-3 classification API to compare the capability of pre-trained deep language models for classifying medical interview conversations.