Bots
At its most fundamental level, a bot (short for “robot”) is a program or process that operates as an agent for a user (or another program) and attempts to simulate a human activity. In the context of “Conversation as a Platform”, however, what we're really talking about is a conversation bot (a.k.a. “chatbot”, “talkbot”, “chatterbot”, “Bot”, “IM bot”, “interactive agent”, or even the unwieldy “Artificial Conversational Entity”). A conversation bot is a computer program, service, or process that conducts a conversation via auditory or textual methods.
In general, conversation bots are essentially decision trees that the user navigates through an identified range of choices, typically with help from artificial intelligence and machine learning.
ChatBot vs Traditional App
In traditional apps, users are given a limited, predefined set of choices. Conversation bots are really just applications designed for users to interact with via natural language instead of a series of manual keyboard or other inputs. So, in a traditional application, if you wanted to find photos, you would work through menus of predefined choices. You might open the search menu, itself a predefined choice, and then supply a phrase or make a selection to find photos.
A bot simplifies this entire process using the same decision tree. The user just says “find photos”, which takes them straight through all of that menu structure, and the end result is the same: they find photos.
Bots really are just that simple.
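To make the comparison concrete, here is a minimal sketch in plain JavaScript. The menu tree, the intent list, and the keyword matching are all hypothetical and only illustrate the idea; a real bot would typically rely on a language-understanding service rather than keyword lookups.

```javascript
// Hypothetical menu tree for a photo app; each leaf names an action.
const menuTree = {
  file: { open: "openFile", print: "printFile" },
  search: { photos: "findPhotos", videos: "findVideos" },
};

// Traditional navigation: the user walks the tree one choice at a time.
function navigate(tree, path) {
  return path.reduce((node, choice) => (node ? node[choice] : undefined), tree);
}

// Conversational shortcut: a naive keyword match maps an utterance
// directly to the same leaf of the same tree.
const intents = [
  { keywords: ["find", "photos"], path: ["search", "photos"] },
  { keywords: ["find", "videos"], path: ["search", "videos"] },
];

function resolveUtterance(utterance) {
  const words = utterance.toLowerCase().split(/\W+/);
  const match = intents.find((i) => i.keywords.every((k) => words.includes(k)));
  return match ? navigate(menuTree, match.path) : null;
}

console.log(navigate(menuTree, ["search", "photos"])); // findPhotos
console.log(resolveUtterance("Find photos"));          // findPhotos
```

Both calls end at the same action; the only difference is how many explicit choices the user had to make to get there.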
Design Principles
The Bot Framework enables developers to create compelling bot experiences that solve a variety of both consumer needs and business challenges.
If you are building a bot, it is safe to assume that you are expecting users to use it. It is also safe to assume that you are hoping that users will prefer using a bot experience over alternative experiences like apps, websites, phone calls, and other means of addressing their particular needs. In other words, your bot is competing for users' time against things like apps and websites. So, the real question becomes: how can you maximize the odds that your bot will achieve its ultimate goal of attracting and keeping users? It's simply a matter of prioritizing the right factors when designing your bot.
As with all software design and development—including the design and implementation of bots—a user's overall experience should be the top priority.
Conversation Flow
Like apps and websites, bots have a UI, but it is made up of dialogs, rather than screens. Dialogs enable the bot developer to logically separate various areas of bot functionality and guide conversational flow. For example, you may design one dialog to contain the logic that helps the user browse for products and another separate dialog to contain the logic that helps the user create a new order.
Dialogs may or may not have graphical interfaces. They may contain buttons, text, and other elements, or be entirely speech-based. Dialogs also contain actions to perform tasks such as invoking other dialogs or processing user input.
In a traditional application, everything begins with the main screen. The main screen invokes the new order screen. The new order screen remains in control until it either closes or invokes other screens. If the new order screen closes, the user is returned to the main screen.
A traditional application flow
In a bot, everything begins with the root dialog. The root dialog invokes the new order dialog. At that point, the new order dialog takes control of the conversation and remains in control until it either closes or invokes other dialogs. If the new order dialog closes, control of the conversation returns to the root dialog.
A bot experience flow
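The hand-off of control between dialogs behaves like a stack. The sketch below models that behavior in plain JavaScript; the DialogStack class and its method names are illustrative assumptions, not the Bot Builder SDK's API.

```javascript
// A minimal dialog stack: the dialog on top controls the conversation.
class DialogStack {
  constructor(rootName) {
    this.stack = [rootName];
  }

  // The dialog currently in control.
  get active() {
    return this.stack[this.stack.length - 1];
  }

  // Invoking a dialog pushes it on top, so it takes control.
  begin(name) {
    this.stack.push(name);
  }

  // Closing a dialog returns control to whichever dialog invoked it.
  end() {
    if (this.stack.length > 1) this.stack.pop(); // the root dialog never closes
    return this.active;
  }
}

const dialogs = new DialogStack("root");
dialogs.begin("newOrder");
console.log(dialogs.active); // newOrder
dialogs.begin("productSearch");
dialogs.end();
console.log(dialogs.active); // newOrder
dialogs.end();
console.log(dialogs.active); // root
```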
Microsoft Bot Framework
Regardless of the tools, platform, and user audience, bot developers all face the same challenges: bots require basic I/O, need at least basic language and dialog skills, must connect and interact with users in a language the user understands, and must be available in the services where users need them.
To address these complexities, the Microsoft Bot Framework provides a comprehensive set of tools, services, and documentation for bot development—the Microsoft Bot Builder SDK.
In a nutshell, the Microsoft Bot Builder SDK provides:
- A powerful, easy-to-use framework that provides a familiar way for .NET and Node.js developers to create bots.
- A set of enhanced features that make interactions between bots and users much simpler.
- An emulator for debugging bots.
- A large set of sample bots developers can use as building blocks.
Creating and developing chatbots does not technically require the Bot Builder SDK; however, using the SDK simplifies everything. Getting started with the Bot Builder SDK begins at a single entry point into bot development: the Bot Connector.
The Bot Connector
As the central entry point for bot communication and processes, the Bot Connector handles communication, conversations, state, and authorization for all activities between a bot and its users, including routing messages, managing state, registering bots, tracking user conversations, managing integrated services (such as translation) and even per-user and per-bot storage.
The Bot Connector facilitates communication between a bot and a user by relaying messages from bot to channel and from channel to bot. Typically, a bot's logic is hosted as a web service, such as an Azure Web App, and receives messages from users through the Bot Connector service. Bot replies are sent to the Bot Connector using standard HTTPS POST requests, which are wrapped by methods exposed by the Bot Framework.
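As a rough illustration of what those wrapped POST requests carry, the sketch below builds (but does not send) a reply request in plain JavaScript. It assumes the Connector's REST route for replying to an activity has the shape /v3/conversations/{conversationId}/activities/{activityId}; the service URL, IDs, and token are placeholder values, and buildReplyRequest is a hypothetical helper rather than an SDK function.

```javascript
// Build (but do not send) the HTTPS POST a bot would make to reply to an activity.
function buildReplyRequest(incoming, text, accessToken) {
  const base = incoming.serviceUrl.replace(/\/+$/, ""); // drop trailing slashes
  const url =
    `${base}/v3/conversations/${encodeURIComponent(incoming.conversation.id)}` +
    `/activities/${encodeURIComponent(incoming.id)}`;
  return {
    method: "POST",
    url,
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${accessToken}`,
    },
    // The reply swaps sender and recipient and points back at the original activity.
    body: JSON.stringify({
      type: "message",
      from: incoming.recipient,
      recipient: incoming.from,
      conversation: incoming.conversation,
      replyToId: incoming.id,
      text,
    }),
  };
}

// Placeholder incoming activity, as the bot's web service might receive it.
const incoming = {
  id: "act-1",
  serviceUrl: "https://connector.example.com/",
  conversation: { id: "conv-42" },
  from: { id: "user-1" },
  recipient: { id: "bot-1" },
};

const request = buildReplyRequest(incoming, "Hello from the bot", "PLACEHOLDER_TOKEN");
console.log(request.method, request.url);
```

In practice the SDK connectors shown in this section perform this step for you, including obtaining the access token.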
Specifics of working with the Connector vary based on development language.
In C#, for example, the Connector is accessed via the .NET ConnectorClient:
- var connector = new ConnectorClient(new Uri(activity.ServiceUrl));
When using Node.js, you might use the ConsoleConnector when working in development:
- var connector = new builder.ConsoleConnector().listen();
When moving to production (or when testing in a local emulator), you might switch to the Node.js ChatConnector and provide authorization values:
- var connector = new builder.ChatConnector({
- appId: process.env.MicrosoftAppId,
- appPassword: process.env.MicrosoftAppPassword
- });
The ChatConnector requires an API endpoint to be set up within your bot. With the Node.js SDK, this is usually accomplished by installing the Restify Node.js module. The ConsoleConnector does not require an API endpoint.
Connector Service Flow
Working Parts
- Your bot logic, written in C# or Node.js
- Your bot Web Service, such as an Azure Web App
- The Bot Connector, via the Microsoft Bot Framework
- User interaction via a third-party service (channel) such as Skype or Facebook
The Connector manages communication; however, the Connector does not create the content for communication. The creation of communication content, for example actual messages, is managed by creating and working with activities.
Activity
Simply stated, an activity is any event that occurs between a bot and a user, such as a message being sent, or a notification that something about a conversation has changed. The Connector uses an activity object to pass information back and forth between a bot and a channel.
The most common type of activity is the message. The message activity type is critical to bot development, as it provides the foundation for all communication content, from simple text strings to rich, dynamic experiences.
The primary supported activity types are:
- Message - Represents a communication between bot and user.
- Conversation update - Indicates that the bot was added to a conversation, other members were added to or removed from the conversation, or conversation metadata has changed.
- Contact relation update - Indicates that the bot was added or removed from a user's contact list.
- Delete user data - Indicates to a bot that a user has requested that the bot delete any user data it may have stored.
- Typing - Indicates that the user or bot on the other end of the conversation is compiling a response.
- Ping - Represents an attempt to determine whether a bot's endpoint is accessible.
- End of conversation - Indicates the end of a conversation.
- Event - Represents a communication sent to a bot that is not visible to the user.
- Reaction - Indicates that a user has reacted to an existing activity. For example, a user clicks the “Like” button on a message.
So, your bot will send message activities to communicate information to users and receive message activities from them. This means all communicated bot content starts with a Microsoft Bot Framework message.
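Because every activity carries a type, a bot usually starts by branching on it. The following plain JavaScript sketch shows one way to do that for the types listed above. The type strings are assumed to match the Bot Framework activity schema, and describeActivity is a hypothetical helper written for illustration.

```javascript
// Map each incoming activity to a short description of what it means.
function describeActivity(activity) {
  switch (activity.type) {
    case "message":
      return `User said: ${activity.text}`;
    case "conversationUpdate":
      return "Conversation membership or metadata changed";
    case "contactRelationUpdate":
      return "Bot was added to or removed from a contact list";
    case "deleteUserData":
      return "User asked the bot to delete stored data";
    case "typing":
      return "Other party is composing a response";
    case "ping":
      return "Endpoint reachability check";
    case "endOfConversation":
      return "Conversation ended";
    case "event":
      return `Background event: ${activity.name}`;
    case "messageReaction":
      return "User reacted to an earlier activity";
    default:
      return `Unhandled activity type: ${activity.type}`;
  }
}

console.log(describeActivity({ type: "message", text: "find photos" }));
console.log(describeActivity({ type: "typing" }));
```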
Sending a basic message using Node.js might look like this:
- var customMessage = new builder.Message(session)
- .text("This is Microsoft Bot Framework blog")
- .textFormat("plain")
- .textLocale("en-us");
- session.send(customMessage);
Sending a message via C# would look more like this:
- IMessageActivity message = Activity.CreateMessageActivity();
- message.Text = "This is Microsoft Bot Framework blog";
- message.TextFormat = "plain";
- message.Locale = "en-us";
- await connector.Conversations.SendToConversationAsync((Activity)message);
At the most basic level, messages may simply consist of plain text; however, the Microsoft Bot Framework also supports messages containing enhanced content such as text to be spoken, suggested actions, media attachments, rich cards, and even channel-specific data.
Suggested Action
Suggested actions enable your bot to present predefined choices (based on what the bot already knows) as buttons that the user can tap. Suggested actions enhance the user experience by enabling the user to answer a question or make a selection with a simple tap of a button, rather than having to type a response with a keyboard.
In C#, your code might look like this:
- var reply = activity.CreateReply("Thank you for expressing interest in our premium exterior paint colors! What color of paint would you like?");
- reply.Type = ActivityTypes.Message;
- reply.TextFormat = TextFormatTypes.Plain;
- reply.SuggestedActions = new SuggestedActions() {
- Actions = new List<CardAction>() {
- new CardAction() {
- Title = "Mauve", Type = ActionTypes.ImBack, Value = "Mauve"
- },
- new CardAction() {
- Title = "Cornflower Blue", Type = ActionTypes.ImBack, Value = "CornflowerBlue"
- },
- new CardAction() {
- Title = "Aztec Pink", Type = ActionTypes.ImBack, Value = "AztecPink"
- }
- }
- };
In Node.js, your code might look like this:
- var msg = new builder.Message(session).text("Thank you for expressing interest in our premium exterior paint colors! What color of paint would you like?").suggestedActions(builder.SuggestedActions.create(session, [
- builder.CardAction.imBack(session, "productId=1&color=mauve", "Mauve"),
- builder.CardAction.imBack(session, "productId=1&color=cornflowerblue", "Cornflower Blue"),
- builder.CardAction.imBack(session, "productId=1&color=aztecpink", "Aztec Pink")
- ]));
- session.send(msg);
Note
In the Microsoft Bot Framework, the imBack method posts the value to the chat window of the channel you are using. Alternatively, you can use the postBack method, which posts the selection back to your bot but does not display it in the chat window.
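The difference between the two action types can be modeled in a few lines of plain JavaScript. This is only a simulation of the behavior described in the note: tapAction is a hypothetical helper, and actual rendering depends on the channel.

```javascript
// Simulate what happens when a user taps a suggested action.
// imBack: the value is shown in the chat window and sent to the bot.
// postBack: the value is sent to the bot but not shown in the chat window.
function tapAction(action) {
  return {
    shownInChat: action.type === "imBack" ? action.value : null,
    sentToBot: action.value,
  };
}

const suggestedActions = [
  { type: "imBack", title: "Mauve", value: "Mauve" },
  { type: "postBack", title: "Cornflower Blue", value: "productId=1&color=cornflowerblue" },
];

for (const action of suggestedActions) {
  console.log(action.title, tapAction(action));
}
```

A postBack is a reasonable choice when the value is machine-oriented, such as the query-string style values in the Node.js example, and would look odd if echoed into the conversation.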
Media Attachment
A message exchanged between a user and a bot can contain images, videos, audio content, and even actual files, via the Attachments property.
The Attachments property of a message contains an array of Attachment objects that represent the media attachments and rich cards to display within a message, and writing this code would look very familiar to anyone who's worked with email or other messaging systems.
- Media and Files - Send files such as images, audio and video
- Cards - Send a rich set of visual cards via JSON payload
- Hero Cards - Send a rich card containing a single large image, one or more buttons, and text.
In C#, adding an image as an attachment to a message requires an accessible location (URL), a content type, and a display name:
- replyMessage.Attachments.Add(new Attachment() {
- ContentUrl = "https://www.traininglabs.io/bot-images/botimage.png",
- ContentType = "image/png",
- Name = "botimage.png"
- });
In Node.js, the code is almost the same:
- session.send({
- attachments: [{
- contentUrl: "https://www.traininglabs.io/bot-images/botimage.png",
- contentType: "image/png",
- name: "botimage.png"
- }]
- });
The Microsoft Bot Framework goes beyond simple image and file-based content and uses the powerful and flexible concept of cards.
Cards
A rich card comprises a title, description, link, and images. Some channels, such as Skype and Facebook, support sending these rich graphical cards to users, and the cards can include interactive buttons. A message can contain multiple rich cards, displayed in either list format or carousel format.
The Bot Framework currently supports eight types of rich cards:
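As a sketch of how several cards travel in one message, the plain JavaScript below builds a message activity object holding two hero cards in carousel layout. It assumes the activity schema's attachmentLayout field and the hero card content type string; heroCard is a hypothetical helper, and the image URL is reused from the attachment example.

```javascript
// Wrap card content in an attachment using the hero card content type.
function heroCard(title, text, imageUrl, buttons) {
  return {
    contentType: "application/vnd.microsoft.card.hero",
    content: { title, text, images: [{ url: imageUrl }], buttons },
  };
}

const imageUrl = "https://www.traininglabs.io/bot-images/botimage.png";

// One message, two cards, displayed side by side where the channel supports it.
const reply = {
  type: "message",
  attachmentLayout: "carousel", // use "list" to stack the cards vertically
  attachments: [
    heroCard("Mauve", "Premium exterior paint", imageUrl, [
      { type: "imBack", title: "Choose Mauve", value: "Mauve" },
    ]),
    heroCard("Aztec Pink", "Premium exterior paint", imageUrl, [
      { type: "imBack", title: "Choose Aztec Pink", value: "AztecPink" },
    ]),
  ],
};

console.log(JSON.stringify(reply, null, 2));
```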
- Adaptive
A customizable card that can contain any combination of text, speech, images, buttons, and input fields. See per-channel support.
- Animation
A card that can play animated GIFs or short videos.
- Audio
A card that can play an audio file.
- Hero
A card that typically contains a single large image, one or more buttons, and text.
- Thumbnail
A card that typically contains a single thumbnail image, one or more buttons, and text.
- Receipt
A card that enables a bot to provide a receipt to the user. It typically contains the list of items to include on the receipt, tax and total information, and other text.
- Sign in
A card that enables a bot to request that a user sign-in. It typically contains text and one or more buttons that the user can click to initiate the sign-in process.
- Video
A card that can play videos.
To determine the types of rich cards that a channel supports and see how the channel renders them, you can use the Channel Inspector or consult the channel's documentation for information about limitations on the contents of cards.
The Bot Connector handles communication between a bot and a third-party service (channel) and is also responsible for managing and normalizing messages. This means the Bot Connector will render these cards using the schema native to the channel, supporting cross-platform communication.
If the channel does not support cards, such as SMS, the Bot Framework will do its best to render a reasonable experience to users.
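To give a feel for what a reasonable fallback might look like on a text-only channel, here is a hypothetical down-rendering of a card to plain text in JavaScript. This is not the Bot Framework's actual algorithm, only an illustration of the idea; cardToPlainText is an invented helper.

```javascript
// Flatten a rich card into plain text for channels that cannot draw cards.
function cardToPlainText(card) {
  const { title, subtitle, text, buttons = [] } = card.content;
  const lines = [title, subtitle, text].filter(Boolean); // skip missing fields
  buttons.forEach((button, i) => lines.push(`${i + 1}. ${button.title}`));
  return lines.join("\n");
}

const card = {
  contentType: "application/vnd.microsoft.card.hero",
  content: {
    title: "Premium exterior paint",
    text: "What color of paint would you like?",
    buttons: [
      { type: "imBack", title: "Mauve", value: "Mauve" },
      { type: "imBack", title: "Cornflower Blue", value: "CornflowerBlue" },
    ],
  },
};

console.log(cardToPlainText(card));
```

Numbering the buttons gives an SMS user something short to type back in place of a tap.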
Creation of a rich card is as simple as spinning up a card type, and adding it to the Attachments array, just like any other message attachment.
Creating a Thumbnail Card in C#:
- ThumbnailCard plCard = new ThumbnailCard()
- {
- Title = $"I'm a thumbnail card about {cardContent.Key}",
- Subtitle = $"{cardContent.Key} Wikipedia Page",
- Images = cardImages,
- Buttons = cardButtons
- };
- Attachment plAttachment = plCard.ToAttachment();
- replyToConversation.Attachments.Add(plAttachment);
A Receipt Card:
- ReceiptCard plCard = new ReceiptCard()
- {
- Title = "I'm a receipt card, isn't this bacon expensive?",
- Buttons = cardButtons,
- Items = receiptList,
- Total = "112.77",
- Tax = "27.52"
- };
- Attachment plAttachment = plCard.ToAttachment();
- replyToConversation.Attachments.Add(plAttachment);
And a Sign In Card:
- SigninCard plCard = new SigninCard(title: "You need to authorize me", button: plButton);
- Attachment plAttachment = plCard.ToAttachment();
- replyToConversation.Attachments.Add(plAttachment);
In the real world, bot experiences are part of a full conversation, with potentially complex relationships between messages, replies, and actions.