Supervised by: Ujjayanta Bhaumik, MSc, B.Tech (Hons). Ujjayanta Bhaumik is currently pursuing his PhD in Computer Science and Physics, with a focus on Virtual Reality, in the Light and Lighting Lab, KU Leuven. He worked as a creative software developer for over two years and holds a master's in Computer Graphics, Vision and Imaging from University College London.

1.1 Abstract 

Artificial intelligence (AI) chatbots for depression are computer programs designed to mimic human conversation in order to provide support and advice to individuals suffering from depression. These chatbots are trained on large amounts of data and use natural language processing (NLP) techniques to understand and respond to user input. AI chatbots have the potential to greatly increase access to mental health support, as they are available 24/7 and can provide immediate assistance. In addition, chatbots can reduce technological risks and provide a confidential and convenient way for individuals to discuss their feelings. Despite these benefits, there are concerns about the reliability and effectiveness of AI chatbots for depression, and further research is needed to determine their efficacy. The main objective of this research is to evaluate the effectiveness of three chatbots in providing a supplemental treatment for depression. Our results showed that the bots provided accurate treatment, mainly following a plan based on Cognitive Behaviour Therapy (CBT). Moreover, all were developed with unique, friendly personalities and are very engaging. Two of the three chatbots also offered voice chat, which heightened personal engagement. Nevertheless, the effectiveness of these conversations still needs improvement. Chatbots lack flexibility: they often change topics abruptly, disregard previous information, and restrict the answers one can give, which limits their ability to provide emotional support.

 

Introduction

2.1.1 Chatbots and mental illnesses

There is growing demand for chatbots across multiple industries. The use of bots as brand communication channels has increased by 92% since 2019 [1]. With such high demand, it is only to be expected that this technology is experiencing rapid progress: its capacity to process language and produce human-like responses stands at an all-time high. Chatbots have increasingly been used in recent years to help people of all ages with mental health issues, the most prominent being major depressive disorder (MDD), also known as depression. MDD is a serious mental health condition that affects millions of people worldwide, causing feelings of sadness, hopelessness, and loss of interest in everyday activities [2]. Traditional forms of medical help, such as therapy and medication, can be effective for some people, but for others treatment may not be as accessible: 75% of people in low- and middle-income countries receive no treatment [3], and effectiveness varies between individuals. This is where chatbots can play a crucial role in providing support and assistance to individuals struggling with depression.

 

2.1.2 Chatbots and AI

Artificial intelligence (AI) chatbots for depression work by using natural language processing (NLP). This set of algorithms allows NLP bots not only to follow commands, but also to understand human language, improve over time, infer intent, and contextualise messages. This makes them particularly useful for providing users with information on MDD, its causes, symptoms, and treatments. They can also provide patients with coping strategies, offer emotional support, and help them track mood and progress. The chatbot can also act as a form of self-assessment: a user can answer a series of questions, gaining a better understanding of their mental state and, potentially, of when they might need to seek professional help.
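The difference between merely following commands and inferring intent can be illustrated with a deliberately simplified sketch. The intent names and keyword lists below are invented for illustration only; the chatbots reviewed here rely on trained NLP models, not hand-written rules like these.

```python
# Illustrative sketch only: a minimal keyword-based intent matcher.
# Every intent name and keyword set here is a made-up example, not
# taken from any of the reviewed apps.

INTENT_KEYWORDS = {
    "mood_report": {"sad", "hopeless", "empty", "down", "depressed"},
    "sleep_issue": {"insomnia", "sleep", "tired", "awake"},
    "seek_help": {"therapist", "help", "doctor", "treatment"},
}

def detect_intent(message: str) -> str:
    """Return the intent whose keyword set overlaps the message most."""
    words = set(message.lower().split())
    best_intent, best_overlap = "unknown", 0
    for intent, keywords in INTENT_KEYWORDS.items():
        overlap = len(words & keywords)
        if overlap > best_overlap:
            best_intent, best_overlap = intent, overlap
    return best_intent

print(detect_intent("I feel sad and hopeless lately"))  # mood_report
```

A real NLP pipeline replaces the keyword sets with a statistical model, but the overall shape (map free text to an intent, then respond) is the same.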

 

2.1.3 Features and Limitations of the Chatbots 

Due to the extensive functionalities of chatbots, it is easy to overlook the most practical factor: their accessibility. People can use them at any time, from anywhere, without the need to make an appointment or wait for a response. This convenience can be especially helpful for patients who may feel ashamed or embarrassed about seeking help, or for those who live in areas where mental health resources are limited. It is important to note that AI chatbots should not be used as a substitute for professional mental health care. While these chatbots can provide support and information, they are not trained therapists and should not be used to diagnose or treat mental health conditions. They can, however, be used as a supplement to traditional forms of treatment, providing a supportive and accessible resource for individuals seeking treatment for MDD. 

The main focus of this research is evaluating the effectiveness of three AI chatbots in providing a supplemental treatment for major depression. According to a recent study, mental health chatbots can successfully engage people who are depressed in sympathetic dialogue and help treat their symptoms. Three chatbot platforms that address mental health issues and depression are compared in this research: ‘Woebot’ [4], a conversational agent that helps users monitor mood and learn about themselves; the ‘Wysa’ app [5], where users can converse with a penguin chatbot about anything on their mind; and the AI chatbot app ‘Youper’ [6], which helps users detect, track, and process their thoughts and feelings.

 

Analysis

3.1.0 Aim of Research

Life-threatening illnesses that loomed in the 19th and 20th centuries, including tuberculosis, smallpox, and typhus fever, have since become fully treatable or have been eradicated in the Western world [7]. Instead, the silent killers of the 21st century are mental illnesses left untreated, especially in isolated regions of the world lacking broad access to psychological and psychiatric treatment. Research carried out by the Johns Hopkins Mental Health Disorders Department showed that depression is the main link to suicide and estimated that most of the 4,000 people who die by suicide in the US each year suffer from untreated depression or a diagnosable mental disorder [8, 9].

Due to the growing use of online chatbots to cope with mental illness, it is important to test how effectively the different chatbots provide a supplementary service. Government healthcare schemes and university research centres have developed online solutions to widen access to psychological and psychiatric treatment for mental illness. Among these solutions are chatbots that use AI algorithms to conduct psychological and psychiatric sessions, acting as a supplement for those seeking mental health support. The aim of this research is to evaluate the effectiveness of the service provided by the chatbots by using a psychiatric case file on depression when interacting with them.

 

3.2.0 Overview of Method

Firstly, a comparison framework was designed in order to evaluate the service given by the three chatbots on an equal footing. The framework comprises criteria and sub-criteria. The four main criteria upon which the chatbots are assessed are: the quality of information, the software capacity, the humanity, and the accessibility of the chatbots. Detailed descriptions of each criterion and sub-criterion can be found in the Comparison Framework for Chatbot Interaction (3.3.0). In the evaluation, the performance of the chatbots against each criterion is judged numerically. The scores are then totalled and averaged to determine each chatbot's overall effectiveness.
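The scoring step described above can be sketched as follows. The nineteen numbers are the per-sub-criterion scores later reported for Woebot in Table 1; the averaging shown is a minimal illustration of how the overall figures in this research are derived, not the exact spreadsheet used.

```python
# Sketch of the scoring step: each chatbot receives a 0-10 score per
# sub-criterion, and the scores are averaged into an overall figure.
# The values below are Woebot's sub-criterion scores from Table 1.

woebot_scores = [7, 7, 10, 9, 10,       # quality of information
                 9, 9, 9, 9, 8,         # humanity
                 8, 8, 9, 10, 10, 9,    # software
                 9, 9, 9]               # accessibility

average = sum(woebot_scores) / len(woebot_scores)
print(f"{average:.2f}/10")  # 8.84/10
```

Repeating the same calculation over the Wysa and Youper columns of Table 1 reproduces their 7.32/10 and 7.63/10 averages.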

Secondly, a psychiatric history case file was designed to ensure that all interactions with the chatbots were controlled, referring to the same person, age, symptoms, and any other data that a chatbot asked for. The model used to create the brief psychiatric case file was adapted from the University of Auckland's guide to writing a psychiatric case study [10]. The case file was populated using a random country, name, and age generator. When the research was carried out, all interactions with the chatbots referred to this case file. Moreover, the medical symptoms and issues outlined in it were taken from the Mayo Clinic's list of common symptoms of major depression as well as the Healthline guide to helping someone with major depression [11, 12].

 

3.2.1 Psychiatric History Case File (Adapted from [10])

Introductory statement

This case outlines the typical history given by a person with a first presentation of major depressive disorder (MDD) symptoms.

1. Demography

Name: John Doe; Gender: Male; Age: 39; Birthplace: Gothenburg, Sweden; Residence: Birmingham, England; Occupation: Social worker; Number of children: 1; Marital status: Divorced.

2. Mode of referral and history of persisting complaint

Mr Doe was referred to the psychiatry service for a week by the cardiology team at the Queen Elizabeth Hospital Birmingham, who were concerned about his elevated heart rate at rest. He is speaking to the chatbots in order to determine whether he is depressed. 

Mr Doe describes frequent feelings of sadness, tearfulness, emptiness, or hopelessness. Relatives have complained about his angry outbursts, irritability, or frustration, even over small matters at family gatherings. He has lost interest or pleasure in most or all normal activities, such as going out to dinner and the cinema [11]. He suffers from sleep disturbances, including insomnia and difficulty waking up. Mr Doe has emphasised his feelings of worthlessness, fixating on past failures and self-blame for the divorce and shared custody of his son [12]. He has no past forensic or psychiatric history.

 

3.3.0 Comparison Framework for Chatbot Interaction

The framework used in this research was originally developed by Elahe Paikari and André van der Hoek [13]. We adapted their original design with slight alterations to suit comparing the use of chatbots as a supplemental MDD treatment. This led to the creation of four criteria: quality of information, humanity, software, and accessibility, with relevant sub-criteria on which each chatbot is assessed. 

 

3.3.1 Quality of Information

Since AI chatbots respond based on previous data and messages [14], there are many factors to evaluate in order to determine the quality of the information provided by the bot.

  • Effective dialogue – Context facilitates a conversational experience. The bot should be able to manage data and hold on to the information received.
  • Medically accurate – Information provided by the chatbot will be compared with facts given by the American Psychiatric Association [15].
  • Refers for professional treatment – The bot advises the user to seek professional treatment from a therapist/psychiatrist. 
  • Refers to accurate treatment – The chatbot is referring to treatment specifically for depression and does not provide general suggestions. 
  • Approved by the FDA – The Food and Drug Administration Breakthrough Device Designation [16] was given to the chatbot.

 

3.3.2 Humanity

A study conducted by the University of Florida showed that people need ‘humanness’ in order to trust in something [17]. For chatbots in mental health, this is especially important. In order to efficiently evaluate ‘Humanity’ we looked for the following five aspects. 

  • Personality – Human-like traits and attitudes can be seen throughout the interactions. Conversations with the bots do not feel bland. Extraversion and Agreeableness, as defined by the five-factor model of personality [18], are the traits most sought after.
  • Engagement – The less tedious it is to have an exchange with the chatbot, the more engaging it will be considered.  
  • Themed discussion – The change of topics occurs after an acceptable period of time and feels organic.
  • Online customer satisfaction – User ratings are reviewed. The final score is based on an estimate.
  • Attention estimation – The software successfully retains the user’s attention.

 

3.3.3 Software

AI chatbots perform routine tasks using techniques such as Natural Language Processing (NLP) to mimic human conversation [19]. The type of chatbot and the algorithms used directly impact how effective the chatbots are at replicating human conversation, learning, and achieving their objectives. We evaluate how suitable the software is against the following aspects: 

  • Chatbot response volume – Indicator of the number of questions that the chatbot effectively answered out of all the questions asked [20].
  • Word error rate (WER) – Ratio of errors in a transcript to the total words produced by the chatbot [21].
  • Task completion rate – Measures the success rate of an action performed through the chatbot, for example providing a session analysis. 
  • Interaction rate – Average number of messages exchanged in the conversation; can act as an indicator of the specificity and engagement of the chatbot. 
  • Evolving – Measures the chatbot's ability to retain information and ask specific questions based on the data collected [22].
  • Alternate vocabulary – Observes whether the chatbot repeats phrases or varies the conversation; effective at determining the algorithm's ability to carry on varied conversations. 
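The word error rate sub-criterion in the list above follows the standard definition: the word-level edit distance (substitutions, insertions, deletions) between a reference sentence and the chatbot's transcript, divided by the number of reference words. A minimal sketch of that calculation, assuming the standard formula rather than the exact tool used in this research:

```python
# Word error rate (WER) = (substitutions + insertions + deletions)
#                         / number of words in the reference,
# computed via a word-level Levenshtein edit distance.

def word_error_rate(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming table for the word-level edit distance.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(ref)][len(hyp)] / len(ref)

print(word_error_rate("how are you feeling today",
                      "how are you today"))  # 0.2
```

A lower WER means the chatbot's output more closely matches the expected wording; a 0/10 to 10/10 score can then be assigned from the measured rate.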

 

3.3.4 Accessibility

Due to the current shortage of mental health providers and the elevated price of therapy sessions, only 16.5% of people affected by MDD globally receive adequate treatment [23]. Therefore, as previously discussed, accessibility is a major point to consider when evaluating chatbots.

  • Languages – The chatbot is able to communicate with users from different countries in their native languages.
  • Communication channels – Not only does the bot interact with patients through text, but voice chat is also available.
  • Cost – Fees are affordable or non-existent. When reviewing a paid chatbot, only the limited trial version is evaluated.

 

Evaluation of Chatbots

Table 1: Criteria and sub-criteria of the Comparison Framework for Chatbot Interaction applied to Woebot, Wysa, and Youper

| Criterion              | Sub-criterion                     | Woebot  | Wysa    | Youper  |
| ---------------------- | --------------------------------- | ------- | ------- | ------- |
| Quality of information | Effective dialogue                | 7/10    | 6/10    | 7/10    |
|                        | Medically accurate                | 7/10    | 9/10    | 9/10    |
|                        | Refers for professional treatment | 10/10   | 10/10   | 5/10    |
|                        | Refers to accurate treatment      | 9/10    | 9/10    | 9/10    |
|                        | Approved by the FDA               | 10/10   | 10/10   | 0/10    |
| Humanity               | Personality                       | 9/10    | 10/10   | 10/10   |
|                        | Engagement                        | 9/10    | 7/10    | 9/10    |
|                        | Themed discussion                 | 9/10    | 5/10    | 9/10    |
|                        | Online customer satisfaction      | 9/10    | 9/10    | 8/10    |
|                        | Attention estimation              | 8/10    | 9/10    | 9/10    |
| Software               | Response volume                   | 8/10    | 7/10    | 10/10   |
|                        | Word error rate                   | 8/10    | 8/10    | 9/10    |
|                        | Task completion rate              | 9/10    | 9/10    | 9/10    |
|                        | Specific questions                | 10/10   | 7/10    | 6/10    |
|                        | Alternate vocabulary              | 10/10   | 5/10    | 8/10    |
|                        | Evolving                          | 9/10    | 5/10    | 6/10    |
| Accessibility          | Languages                         | 9/10    | 3/10    | 9/10    |
|                        | Communication channels            | 9/10    | 2/10    | 8/10    |
|                        | Cost                              | 9/10    | 9/10    | 5/10    |
|                        | Average score                     | 8.84/10 | 7.32/10 | 7.63/10 |

 

Woebot

Woebot is designed to help patients with depression. It is a valuable tool for providing support and guidance to individuals who are struggling with this mental health condition.

Benefits:

  • Accessibility: Woebot is available 24/7 and can be accessed from anywhere with a suitable internet connection, making it an accessible option for individuals who may not have access to traditional mental health services or who may not feel comfortable seeking help in-person.

 

Figure 1: AI showing great quality of information [24]

  • Software: Woebot has an impressive response speed which is consistent throughout the conversation. Woebot asked general questions at the start, and those questions became more bespoke as more information was provided. Towards the end, the responses appeared very tailor-made, which showed that the chatbot was evolving well.

 

  Figure 2: Great software used [24]

  • Humanity: Woebot was extremely engaged throughout the conversation and maintained this engagement consistently. Its personality was not monotone at all; talking to it was very similar to talking to an actual human being. This helps patients feel more comfortable when discussing symptoms and feelings with the chatbot.

 

Figure 3: AI showing humanist values through engagement [24]

  • Quality of information [24]: Woebot has very efficient dialogue, asking questions very similar to those of a mental health professional. Woebot also has mood tracking, taking constant updates on how the patient is feeling, from which it can decide whether it would be suitable to recommend professional medical help.

 

Drawbacks:

  • Risk of misdiagnosis: Woebot is very capable of assessing the depression being suffered, since its questions are bespoke and should lead to identifying the correct type of support for the patient. However, there is always the chance that Woebot has misunderstood what the patient is actually going through, which could lead to inappropriate treatment recommendations. Therefore, if the chatbot is having difficulty with an assessment, it may be better to seek professional, human help. 

 

Overall, Woebot is very well designed to support patients with depression. It has proven very useful in providing support and guidance, particularly for individuals who may not have access to traditional mental health services or who may feel more comfortable seeking help online. However, it is vital to understand that there are some limitations, such as Woebot providing information or an assessment that could be misleading; although unlikely, this must be taken into account.

 

Youper

The free ‘Youper’ app was downloaded in order to communicate with the chatbot, a convenient and well-organised approach to providing mental support. According to a Compuware survey, 85% of consumers prefer apps over mobile websites because they are perceived as more convenient, quicker, and easier to navigate [26].

The process: The first questions asked by the app were answered using information from the fictional character we created, John Doe, who is trying to determine whether he has depression and whether he can improve his condition. These questions included why the user came across the app, questions in a CBT exercise intended to help him immediately shift his inner state, and questions about present emotions and their causes. 

In the responses, it was stated that John Doe had recently been experiencing anxiety, helplessness, and depression. The chatbot was also informed that he had difficulty controlling his worries, and he commented on his lack of interest in, or enjoyment from, regular activities. He frequently experienced panic attacks and feelings of shame, which made him avoid interacting with people and engaging in activities that put him at the centre of attention. His mood swings have had a significant impact on his relationships and quality of life, and he has powerful negative emotions and traumatic memories.

A user account was created using John's identity. The chatbot immediately asked about his current mood, which was followed by a conversation with a chatbot available 24/7. The user was asked to give a percentage measure of how depressed he felt, and selected 75%. It was asserted that John's work, lack of sleep, health, and self-image were all contributing factors. A conversation was started with the bot, which asked for details regarding the precise reasons for John's sadness. As the details generated for John were entered, the chatbot responded right away, stating that it was “here to help you work through those feelings and experiences, and discover a healthier way to cope with them.”

After that, no further progress was made, since the app required a paid subscription in exchange for trying another activity. After rejecting the offer, the app's many features and services were investigated. 

 

  • Quality of information: 

The app’s assumptions in some of the answers it generated were similar to those demonstrated by professionals in a variety of contexts. For instance, Clare Macdonald of Creative Sleep, founder of the ‘Help For Sleep’ programme and a former insomniac, claimed in her ‘Bedfolk’ article that her all-time favourite sleeping technique is ‘Tibetan Dream Yogas’, where ‘our imagination has the freedom to explore, to revisit, and to rewrite alongside a steady rhythmic breath that takes us ‘there’ – a dreamlike space’ [25]. Furthermore, Youper works with trusted healthcare organisations, including MHA (Mental Health America), LegitScript, Quest Diagnostics, Honeybee, and Athenahealth [27]. Despite all of that, Youper has never been FDA-approved, so the score for that section was significantly decreased.

 

Figure 4

  • Accessibility:

The app is quite easy to use. It is organised into five pages: ‘Check-in’, ‘Explore’, ‘Progress’, ‘Insights’, and ‘Me’. The user also receives a brief explanation of the app when they first launch it.

The app is free, but the user may subscribe for $70 per year for full access to guided CBT activities that have been shown to be effective, as well as other services [6]. Several exercises and activities are inaccessible to unsubscribed users, but the app does offer the option of ‘checking in’ at any time, where the user may describe how they are feeling and whether or not the app's answers are helpful. There are still some useful free activities, each focused on a particular subject, which allow the user to listen to audio, communicate by chat, and receive assistance.

 

  • Humanity: 

Before the user can talk to the bot, they are asked about their feelings and what is causing them. The chatbot keeps reminding them that it is there to assist, has human-like characteristics and attitudes, and makes communicating with it a pleasant experience. Every dialogue in the available exercises has a topic, and there are many different feelings the user can choose to express. 

In addition, Youper provided a wealth of user feedback on their website, and despite mentioning drawbacks and areas for improvement, most of their consumers were satisfied.

 

Figure 5: User ratings of Youper


Figure 6

  • Software:

The chatbot responds quickly. The responses are straightforward, professional, and error-free. The bot's analytical abilities are strong, and it provides users with useful exercises and advice on how to handle their problems. The chatbot rarely repeats its sentences. On the other hand, while the responses are relevant to the questions and to what the user has written, they are not very detailed. Conversations are brief and stop abruptly when the chatbot refers the user to an exercise, which may be unavailable if the user is not subscribed to the app.

 

Wysa

Wysa is a mobile app designed to help with mental health. Its functionalities range from tracking the user's mood to creating a safe space and encouraging the patient to reflect upon the causes of their feelings. The app's main appeal is its mascot, a penguin, which talks to you. The final score given to this chatbot was 7.32/10. Its main strengths and weaknesses are the following:

  • Quality of information:

Almost immediately after opening the app, Wysa encourages the user to work with a therapist (Figure 7). During the talk sessions, responses are limited and Wysa prompts the user to talk about and express their feelings. The advice offered is limited but powerful. Mostly, the treatment focuses on breaking patterns (CBT), which is an effective way to deal with MDD [15].

 

Figure 7


Figure 8

  • Humanity:

Wysa was designed with a unique personality. It shows quirky traits and even goes as far as trying to amuse the user with hand-drawn pictures (Figure 8). Its speech is engaging and casual, which helps the user feel comforted and supported. User ratings are incredibly high, and commenters report having formed an emotional bond with the bot [28].

  • Software:

The design is easy to comprehend and very intuitive. However, the software is one of Wysa's biggest weaknesses. Options for dialogue are limited, and it is not possible to change topics easily. Wysa repeats the same answers when confronted with unexpected questions (Figure 9).

 

Figure 9


Conclusion

In conclusion, evaluating chatbots for depression is a critical step in ensuring that these tools can support people's mental health effectively. Through testing and validation, researchers and developers can find out the strengths and limitations of chatbots used for mental illness, and thus guide improvements in their design and functioning. While the chatbots can provide valuable resources for individuals who are unable to access professional medical help due to their circumstances, it is important to note that they are not a replacement for professional help. The effectiveness of these conversations still needs improvement: chatbots are not flexible, and they will often change topics abruptly, disregard previous information, and restrict the answers one can give. Ultimately, the success of these chatbots lies in their ability to complement, but not replace, existing mental health services.

Our method aimed to evaluate the effectiveness of each of the three chatbots against fixed criteria. The results show that the average score was 7.32/10 for Wysa, 7.63/10 for Youper, and 8.84/10 for Woebot; the overall average score across the three chatbots was thus 7.93/10. The findings reveal that the chatbots are moderately effective at providing a supplementary treatment for major depressive disorder. All three chatbots prioritised humanity in the interaction while providing accurate medical recommendations and treatments, mainly following a plan based on Cognitive Behaviour Therapy (CBT). Other research evaluating the versatility and accuracy of chatbots has been conducted. For instance, the anatomy of Woebot was dissected in order to evaluate the accuracy of agent-guided CBT for women with postpartum depression [29]. Woebot was also tested in 2017 on a college campus, where its ability to provide a ‘coping’ treatment for students with depression was assessed by comparing the reduction in major depressive disorder symptoms over two weeks in a cohort of 70 students [30]. Nevertheless, the effectiveness of these conversations needs improvement. Future research could test the ability of chatbots to identify other specific mental health illnesses, not limiting the investigation to major depressive disorder. 

 

Citation and References

[1] “Major Depressive Disorder Treatment Near Me,” New Horizons Recovery Centers Ohio. https://newhorizonscentersoh.org/major-depressive-disorder-treatment-near-me-34977

[2] World Health Organization, “Depression,” World Health Organization, Sep. 13, 2021. https://www.who.int/news-room/fact-sheets/detail/depression

[3] M. Yuen, “Chatbot market in 2022: Stats, trends, and companies in the growing AI chatbot industry,” Insider Intelligence, Apr. 15, 2022. https://www.insiderintelligence.com/insights/chatbot-market-stats-trends/

[4] “Mental Health Chatbot,” Woebot, 2021. https://woebothealth.com/

[5] “Wysa – Everyday Mental Health,” Wysa – Everyday Mental Health. https://www.wysa.com/

[6] “Youper: Artificial Intelligence For Mental Health Care,” www.youper.ai. https://www.youper.ai (accessed Feb. 25, 2023).

[7] J. Sperber, “Infectious Disease in the Twentieth Century,” Department of History, University of Missouri, May 16, 2020. https://history.missouri.edu/node/141

[8] World Health Organization, “Depression,” World Health Organization, Sep. 13, 2021. https://www.who.int/news-room/fact-sheets/detail/depression

[9] Johns Hopkins Medicine, “Mental Health Disorder Statistics,” Johns Hopkins Medicine, 2019. https://www.hopkinsmedicine.org/health/wellness-and-prevention/mental-health-disorder-statistics

[10] University of Auckland, “Writing a Psychiatric Case History (Year One): Guide and General Instructions.” https://www.fmhs.auckland.ac.nz/assets/fmhs/som/psychmed/docs/writing_a_psychiatry_case_study.pdf

[11] Mayo Clinic, “Depression (Major Depressive Disorder),” Mayo Clinic, Oct. 14, 2022. https://www.mayoclinic.org/diseases-conditions/depression/symptoms-causes/syc-20356007

[12] J. Elmer, “Not Sure What to Say to Someone with Depression? Here Are 7 Ways to Show Support,” Healthline, May 07, 2019. https://www.healthline.com/health/what-to-say-to-someone-with-depression

[13] E. Paikari and A. van der Hoek, “A Framework for Understanding Chatbots and Their Future,” in Proc. 2018 IEEE/ACM 11th International Workshop on Cooperative and Human Aspects of Software Engineering (CHASE), 2018.

[14] E. Adamopoulou and L. Moussiades, “An Overview of Chatbot Technology,” IFIP Advances in Information and Communication Technology, vol. 584, pp. 373–383, 2020, doi: https://doi.org/10.1007/978-3-030-49186-4_31.

[15] F. Torres, “What Is Depression?,” American Psychiatric Association, Oct. 2020. https://www.psychiatry.org/patients-families/depression/what-is-depression

[16] Center for Devices and Radiological Health, “Breakthrough Devices Program,” U.S. Food and Drug Administration, 2019. https://www.fda.gov/medical-devices/how-study-and-market-your-device/breakthrough-devices-program

[17] C. Melore, “Chatbots or real people? Study finds customers only care about ‘perceived humanness,’” Study Finds, Jan. 06, 2022. https://studyfinds.org/chatbots-or-people-humanness/

[18] McCrae, R.R. and John, O.P., “An Introduction to the Five-Factor Model and Its Applications,” Journal of Personality, vol. 60, no. 2, pp. 175–215, 1992.

[19] IBM, “What is Natural Language Processing? | IBM,” www.ibm.com. https://www.ibm.com/topics/natural-language-processing

[20] “Visiativ Chatbot Solutions – Measuring chatbot effectiveness,” www.visiativ.com, Mar. 21, 2022. https://www.visiativ.com/en/actualites/news/measuring-chatbot-effectiveness

[21] “WER | Calculate the Word Error Rate with our Tool,” Amberscript, Jan. 19, 2021. https://www.amberscript.com/en/wer-tool/

[22] E. Paikari and A. van der Hoek, “A Framework for Understanding Chatbots and Their Future,” IEEE Xplore, May 01, 2018. https://ieeexplore.ieee.org/document/8445528 (accessed Feb. 25, 2023).

[23] G. Thornicroft et al., “Undertreatment of people with major depressive disorder in 21 countries,” British Journal of Psychiatry, vol. 210, no. 2, pp. 119–124, Feb. 2017, doi: https://doi.org/10.1192/bjp.bp.116.188078.

[24] “Mental Health Chatbot,” Woebot, 2021. https://woebothealth.com/

[25] “How Imaginative Practices Are Key to a Great Night’s Sleep,” Bedfolk, Apr. 15, 2021. https://bedfolk.com/blogs/the-wind-down/how-imaginative-practices-are-key-to-a-great-night-s-sleep

[26] D. Moth, “85% of consumers favour apps over mobile websites,” Econsultancy, Mar. 12, 2013. https://econsultancy.com/85-of-consumers-favour-apps-over-mobile-websites/

[27] “Youper: Artificial Intelligence For Mental Health Care,” www.youper.ai. https://www.youper.ai/

[28] “Wysa: Mental Health Support,” App Store. https://apps.apple.com/us/app/wysa-mental-health-support/id1166585565

[29] A. Darcy et al., “Anatomy of a Woebot® (WB001): agent guided CBT for women with postpartum depression,” Expert Review of Medical Devices, vol. 19, no. 4, pp. 287–301, Apr. 2022, doi: https://doi.org/10.1080/17434440.2022.2075726.

[30] K. K. Fitzpatrick, A. Darcy, and M. Vierhile, “Delivering Cognitive Behavior Therapy to Young Adults With Symptoms of Depression and Anxiety Using a Fully Automated Conversational Agent (Woebot): A Randomized Controlled Trial,” JMIR Mental Health, vol. 4, no. 2, p. e19, Jun. 2017, doi: https://doi.org/10.2196/mental.7785.