Facebook Messenger

When I first started working on a Facebook Messenger chatbot back in early ’18, one of the key issues we faced was how inflexible the system made it for the business to change chatbot conversations. In fact, changes took anywhere from 2-3 days at best to a month at worst: a pretty unbearably long feedback loop.

Through a series of major changes to the underlying architecture, that average turnaround of 2 weeks was brought down to barely 20 minutes.

Background

One property of the system before these changes was that all conversational logic was locked into the code base. Every user reply, answer and dialogue pathway, along with all possible permutations of each, was hardcoded into a very complicated finite-state machine.

This meant that every conversational change required development time and intervention. Typically, we developers were expected to come up with the conversational dialogues based on business use cases and fit these into the code, after which business would review the results and request changes. This last step was handled in one of two ways:

  1. Developers had to drop everything we were doing to make the changes in code. This also meant clearing the usual code review and other quality gates, the added burden of long games of telephone from business to product owner to technical staff, and a host of other bureaucratic processes. Given that the volume of requests was relatively high, you can imagine the hefty productivity penalties from constant context switching and scope changes.

  2. All change requests were filed as stories and put into the backlog. An elegant use of the backlog, except that sprints were typically 2 weeks long, which meant changes waited an average of 2 weeks before being deployed to production. If business was lucky, a change request would come in just as we started sprint planning, and the change could be deployed in 2-3 days. If we were unlucky, it might be deployed as late as a month later. This was a terrible cost to agility and flexibility.

At that point in time, then, there was a very clear trade-off: agility and flexibility versus productivity.

My question was this: could we get the best of both worlds by completely eliminating the need to make this trade-off?

Breakthrough

Some reflection brought to light the following realisation: conversation design is not a development expertise. We had clearly been mixing up two functional domains. In the same way that user experience is not a key development expertise, as Alan Cooper convincingly and humorously explains in The Inmates Are Running the Asylum, conversation design requires a set of skills that developers should not be expected to have. Developers can be great at developing, but that doesn’t mean they will also be great at the only loosely related field of conversing.

Conversation design is not a development expertise.

Based on this understanding, I initiated conversations with the product owner and some internal stakeholders, culminating in the creation of a new role: the conversation designer.

The vision was this: conversation designers would design, while the development team would build the tools that enabled that designing. In one fell swoop, developers were transformed from blockers into enablers.

Implementation

To enable the new role of a conversation designer, several key changes needed to be made to our application infrastructure. 

Separation of conversation data from code

We systematically stripped all of the conversation data out of the code and moved it into a MongoDB database. This data consisted of elements such as text-to-intent mappings and conditional dialogue flows. The removal process consisted of first moving the data into JSON data structures, then wholesale exporting these data structures into the MongoDB database once we were sure functionality wouldn’t be compromised.
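For illustration, the stored documents might have looked something like the following. The real schemas aren’t shown in this post, so the collection and field names below are assumptions:

```python
# Illustrative only: hypothetical shapes for the two kinds of conversation
# data described above, inserted with pymongo. Collection and field names
# are assumptions, not the real schemas.
from pymongo import MongoClient

db = MongoClient("mongodb://localhost:27017")["chatbot"]

# A text-to-intent mapping: sample utterances that should resolve to one intent.
db["intents"].insert_one({
    "intent": "check_balance",
    "examples": ["what's my balance", "how much do I have left"],
})

# A conditional dialogue flow: which bot action should follow which intent.
db["flows"].insert_one({
    "name": "balance_enquiry",
    "steps": [
        {"intent": "check_balance", "action": "utter_ask_account_type"},
        {"intent": "inform_account_type", "action": "action_fetch_balance"},
    ],
})
```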

The above was greatly facilitated by our move away from the finite-state machine to Rasa NLU / Core, an open-source, machine-learning-backed chatbot library. The library handled the machine learning aspects of chatbot programming for us, packaged behind a clean API that we took advantage of in organising our system architecture.
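To give a feel for that API: with the legacy rasa_nlu package (roughly what was current in 2018), parsing a user message against a trained model took only a couple of lines. The model path below is a placeholder:

```python
# Minimal sketch using the legacy rasa_nlu API (pre-1.0); the model
# directory is a placeholder for wherever a trained model was persisted.
from rasa_nlu.model import Interpreter

interpreter = Interpreter.load("./models/default/current")
result = interpreter.parse("I want to check my balance")
print(result["intent"]["name"], result["intent"]["confidence"])
```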

Building a conversation design management interface

The data structures that were spun out then had to be made available for the designers to use. To that end we built a simple web-based interface that let conversation designers manage these data structures themselves: directly in the form of raw JSON for the MVP, with the aim of progressing toward a more form-based interface.
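As a sketch of what such an MVP amounts to, assuming Flask and pymongo (the endpoint and collection names here are hypothetical, not the actual system’s): two endpoints that let a designer fetch and overwrite a raw JSON document.

```python
# Hypothetical sketch of the MVP management interface: raw JSON in, raw
# JSON out. Endpoint and collection names are illustrative assumptions.
from bson.objectid import ObjectId
from flask import Flask, jsonify, request
from pymongo import MongoClient

app = Flask(__name__)
flows = MongoClient()["chatbot"]["flows"]

@app.route("/flows/<flow_id>", methods=["GET"])
def get_flow(flow_id):
    # Hand the designer the raw JSON document to edit.
    doc = flows.find_one({"_id": ObjectId(flow_id)}, {"_id": 0})
    return (jsonify(doc), 200) if doc else ("not found", 404)

@app.route("/flows/<flow_id>", methods=["PUT"])
def put_flow(flow_id):
    # Overwrite the stored document with the designer's edited JSON.
    flows.replace_one({"_id": ObjectId(flow_id)}, request.get_json(), upsert=True)
    return "", 204
```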

Automated ETL pipeline for training and deployment

To utilise the data to train the models without needing development intervention, we had to spin off a microservice that handled the ETL for the machine learning data. We called this the training microservice. The ETL consisted of the following (see the sketch after the list):

  1. Extracting the data from MongoDB,

  2. Transforming the JSON data structures into Markdown through a custom library to fit Rasa’s accepted schemas, and finally

  3. Loading it into Rasa NLU / Core to train the intent and dialogue models used directly by the chatbot.
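A condensed sketch of what the training microservice’s ETL could look like, assuming the hypothetical document shapes from earlier and the legacy rasa_nlu API; the custom transformation library is reduced here to a single function:

```python
# Hypothetical end-to-end sketch of the training microservice's ETL.
# Collection names, document shapes and file paths are assumptions.
from pymongo import MongoClient
from rasa_nlu import config
from rasa_nlu.model import Trainer
from rasa_nlu.training_data import load_data


def extract_intents():
    # Extract: pull the designer-managed intent documents out of MongoDB.
    db = MongoClient("mongodb://localhost:27017")["chatbot"]
    return list(db["intents"].find())


def to_rasa_markdown(intents):
    # Transform: render the JSON documents into Rasa's Markdown training format.
    lines = []
    for doc in intents:
        lines.append("## intent:{}".format(doc["intent"]))
        lines.extend("- {}".format(example) for example in doc["examples"])
        lines.append("")
    return "\n".join(lines)


def train(markdown_path="nlu.md", config_path="nlu_config.yml"):
    # Load: feed the Markdown to Rasa NLU and persist the trained model.
    trainer = Trainer(config.load(config_path))
    trainer.train(load_data(markdown_path))
    return trainer.persist("./models/")


if __name__ == "__main__":
    with open("nlu.md", "w") as f:
        f.write(to_rasa_markdown(extract_intents()))
    print("model persisted to", train())
```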

The most important part of this pipeline was that it was wholly managed by the conversation designers. No more unnecessary reliance on the development team to make things happen.

Versioning

In the course of these discussions, the question of versioning also came up. What if designers needed to consult older data structures to identify what went wrong in some changes that they made? In the end we decided to go with a basic LIFO approach first and improve on it later, in order to reduce our time to market.
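A minimal sketch of what such a LIFO scheme can look like, assuming each save pushes the previous version onto a per-document history stack in MongoDB (collection names are illustrative):

```python
# Hypothetical LIFO versioning sketch: save pushes the old version onto a
# history stack; rollback pops the most recent snapshot back into place.
from datetime import datetime, timezone
from pymongo import MongoClient

db = MongoClient()["chatbot"]

def save_with_history(doc_id, new_doc):
    # Push the current version onto the stack before overwriting it.
    current = db["flows"].find_one({"_id": doc_id})
    if current is not None:
        db["history"].insert_one({
            "doc_id": doc_id,
            "snapshot": current,
            "saved_at": datetime.now(timezone.utc),
        })
    db["flows"].replace_one({"_id": doc_id}, new_doc, upsert=True)

def rollback(doc_id):
    # Pop the most recent snapshot (last in, first out) and restore it.
    last = db["history"].find_one({"doc_id": doc_id}, sort=[("saved_at", -1)])
    if last is None:
        raise ValueError("no history for {}".format(doc_id))
    db["flows"].replace_one({"_id": doc_id}, last["snapshot"], upsert=True)
    db["history"].delete_one({"_id": last["_id"]})
```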

Educating the conversation designer

The final challenge was education: how could conversation designers understand and utilise the rudimentary tools we provided? This meant sitting down with the users of the system and guiding them on its use: fixing basic UX issues, and teaching them how to resolve more complicated ones. An example of the latter was teaching them to validate the JSON data they worked with using off-the-shelf tools before putting it into the system (which would otherwise have returned a far more cryptic error message). Inelegant, but trade-offs had to be made given the constraint on development time.
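For instance, a pre-submission check along these lines surfaces a readable error; the snippet below is a standard-library stand-in for the off-the-shelf validators we actually pointed designers at:

```python
# One off-the-shelf way to validate JSON before submission; roughly
# equivalent to running `python -m json.tool file.json` on the command line.
import json
import sys

try:
    with open(sys.argv[1]) as f:
        json.load(f)
    print("valid JSON")
except json.JSONDecodeError as err:
    print("invalid JSON at line {}, column {}: {}".format(err.lineno, err.colno, err.msg))
```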

Results

After implementing the above, the conversation design feedback loop was reduced from 2 weeks to barely 20 minutes: a whopping decrease of over 99%. Conversation designers could implement their dialogue flows, then immediately deploy and test them in a staging sandbox. Once satisfied, it was a simple matter of promoting their models into the production environment.

In addition, support tickets practically vanished by the end of the second month as familiarity with the new tool increased. This meant fewer interruptions to development time, less context switching, and faster delivery of other features in the backlog.

Basically, the need to trade off developer productivity against design agility evaporated. With an upfront investment of roughly 1 man-month to set up the new architecture, we got to have our cake and eat it too.