Introduction to Bixby for Developers

Samsung has released Bixby, a next-generation conversational “virtual assistant” platform. You can download Bixby Developer Studio now, but before you build anything with it, you need to understand some Bixby basics. I hope this post helps you with those basics.

On the client side, think of Bixby as a personal virtual assistant app that helps automate “tasks”, such as ordering lunch or booking a taxi to the airport. We could do those tasks manually by opening food and taxi apps, but with Bixby we just ask in natural language (text or voice) and it gets the task done. We can program Bixby to understand concepts and perform actions based on natural language input. To shape how users interact with Bixby on the client side, developers can customize the presentation layer of the Bixby app by developing Views (for UI layout and components) and Dialogs (for voice and text I/O).

On the server side, the core capabilities of Bixby are called Capsules. They have nothing to do with Hinton’s capsule networks, even though the name may have been inspired by them. A capsule is a “programming abstraction” of what Bixby is capable of doing. Developers can build new capsules and publish (or sell) them to others in a marketplace. A capsule has two building blocks: Concepts and Actions. Developing a capsule means writing code for concepts and actions. Think of “concepts” as data types and structures (or classes) in C/C++ programming. A concept can be a primitive type, which is simple and predefined, or a user-defined structure, such as food, taxi, or airport, which is more complex and has custom properties. As in C/C++, an instance of a concept is called an object. An Action defines an operation that a capsule can perform. Think of an Action as an interface specification in C/C++: it defines inputs and an output. However, Actions in a Bixby capsule offer more features, such as validation and error handling, and they can be used to call APIs (with JavaScript functions), to configure endpoints, and to integrate with other services such as Samsung Payment Service (SPS). Programming capsules in Bixby is all about modelling, implementing, and testing Concepts and Actions. The good news is that we can use JavaScript instead of C/C++.
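To make the concept/action analogy concrete, here is a minimal sketch in plain JavaScript. It is not the Bixby modelling syntax or any Bixby SDK API. The names TaxiBooking, BookTaxi, instantiate and execute are hypothetical and exist only for this illustration. The sketch shows a structure concept with typed properties and an action with declared inputs, a declared output, and a validation step.

```javascript
// Illustrative only: plain JavaScript objects standing in for capsule models.
// None of these names come from the Bixby SDK; they are assumptions.

// A "structure" concept: a named type with typed properties.
const TaxiBooking = {
  name: "TaxiBooking",
  properties: { destination: "string", passengers: "number" },
};

// Create an instance (an "object") of a concept, checking property types.
function instantiate(concept, values) {
  for (const [prop, type] of Object.entries(concept.properties)) {
    if (typeof values[prop] !== type) {
      throw new TypeError(`${concept.name}.${prop} must be a ${type}`);
    }
  }
  return { concept: concept.name, ...values };
}

// An "action": declared inputs, a declared output concept,
// a validation rule, and an implementation function.
const BookTaxi = {
  name: "BookTaxi",
  inputs: { destination: "string", passengers: "number" },
  output: TaxiBooking,
  validate(inputs) {
    if (inputs.passengers < 1) {
      throw new RangeError("passengers must be at least 1");
    }
  },
  // A real implementation might call a backend API here.
  run(inputs) {
    return instantiate(TaxiBooking, inputs);
  },
};

// Check input types, apply the action's validation, then run it.
function execute(action, inputs) {
  for (const [name, type] of Object.entries(action.inputs)) {
    if (typeof inputs[name] !== type) {
      throw new TypeError(`${action.name} input ${name} must be a ${type}`);
    }
  }
  action.validate(inputs);
  return action.run(inputs);
}

const booking = execute(BookTaxi, { destination: "airport", passengers: 2 });
console.log(booking);
```

The point of the sketch is the separation of concerns: the concept describes the shape of the data, while the action describes what goes in, what comes out, and which inputs are rejected before any implementation code runs.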

Developing Bixby capsules is quite different from developing mobile apps. Instead of writing implementation code (UI, logic, etc.), we write code to perform conversational modelling, which is how we teach Bixby about the features or domain of a conversational system. Think back to the personal assistant. Before the assistant can do something for us, we have to model its capabilities (or capsules) so that it understands what we say and performs the right actions. We interact with this personal assistant through natural language input (voice or text), which means we have to plan how we communicate with it. Bixby comes with an NLU (Natural Language Understanding) platform that recognizes natural language utterances from users. In other words, Bixby can convert unstructured natural language into a structured intent. How do we teach Bixby? By showing examples. Developers need to provide training examples: sample utterances annotated to connect words and phrases to the capsule’s concepts and actions, aligning them to an intent. This is not like memorizing words and intents. AI (machine learning and deep learning) algorithms inside the Bixby platform automatically learn the most important aspects of those examples, capture that knowledge, and apply it to new, unseen words and sentences. That AI performs the Named Entity Recognition (NER) and Intent Classification tasks, so developers do not have to implement those algorithms themselves.
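As a rough illustration of what “annotated training examples” and a “structured intent” could look like, here is a small JavaScript sketch. The data shape, the goal and concept names (BookTaxi, Destination, OrderFood, FoodName), and the toIntent function are all assumptions made for this post, not Bixby’s training format. The word-overlap matching is a deliberately naive stand-in for the learned models: unlike Bixby, it can only recognize values that appear literally in the examples.

```javascript
// Hypothetical training data: each utterance is labelled with a goal
// and with the phrases that map to concepts.
const trainingExamples = [
  {
    utterance: "book a taxi to the airport",
    goal: "BookTaxi",
    annotations: [{ text: "airport", concept: "Destination" }],
  },
  {
    utterance: "order fried rice for lunch",
    goal: "OrderFood",
    annotations: [{ text: "fried rice", concept: "FoodName" }],
  },
];

// Lowercase the text and split it into words.
function tokenize(text) {
  return text.toLowerCase().split(/\W+/).filter(Boolean);
}

// Naive stand-in for intent classification and entity recognition:
// pick the example sharing the most words with the utterance, then
// collect any annotated phrases that occur in the utterance.
function toIntent(utterance, examples) {
  const words = new Set(tokenize(utterance));
  let best = null;
  let bestScore = 0;
  for (const example of examples) {
    const score = tokenize(example.utterance).filter((w) => words.has(w)).length;
    if (score > bestScore) {
      best = example;
      bestScore = score;
    }
  }
  if (!best) return null; // nothing in common with any example

  const lower = utterance.toLowerCase();
  const values = examples
    .flatMap((example) => example.annotations)
    .filter((a) => lower.includes(a.text))
    .map((a) => ({ concept: a.concept, value: a.text }));

  return { goal: best.goal, values };
}

const intent = toIntent("I need a taxi to the airport", trainingExamples);
console.log(JSON.stringify(intent));
```

The new sentence was never in the training data, yet it still resolves to the BookTaxi goal with a Destination value. A real NLU model goes much further and generalizes to words and phrasings that no example contains.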

In summary, this is how Bixby works. First, Bixby takes voice or text input from users and converts it into a structured intent based on examples. For this step, developers have to provide natural language utterances that train Bixby to understand the user and generate an intent. Second, Bixby uses the intent and the models (concepts and actions) to dynamically generate a program (or plan) to be executed; this is known as DPG (dynamic program generation). Third, Bixby executes the generated plan. When Bixby reaches specific actions, it can optionally execute JavaScript functions that developers wrote. This means that developers can connect Bixby to any backend API. Sounds interesting? Now you are ready to download and play with Bixby Studio to create your first Bixby capsule!
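The second and third steps can be sketched in a few lines of JavaScript. This is only a toy model of the idea, not how Bixby’s planner is implemented. The action names (FindRestaurant, PlaceOrder), the concept names, and the plan and execute functions are hypothetical. The sketch assumes each concept is produced by at most one action and that actions do not depend on each other in a cycle.

```javascript
// Hypothetical actions: each declares its input concepts, its output
// concept, and a JavaScript function that produces the output.
const actions = [
  {
    name: "FindRestaurant",
    inputs: ["FoodName"],
    output: "Restaurant",
    run: ({ FoodName }) => `a restaurant serving ${FoodName}`,
  },
  {
    name: "PlaceOrder",
    inputs: ["Restaurant", "FoodName"],
    output: "Order",
    run: ({ Restaurant, FoodName }) => `${FoodName} ordered from ${Restaurant}`,
  },
];

// Build a plan by working backwards from the goal concept: find the
// action that outputs it, then first plan for any input not yet known.
// Assumes the action graph has no cycles.
function plan(goal, known, steps = []) {
  if (known.has(goal)) return steps;
  const action = actions.find((a) => a.output === goal);
  if (!action) throw new Error(`No action produces ${goal}`);
  for (const input of action.inputs) {
    plan(input, known, steps);
  }
  steps.push(action);
  known.add(goal);
  return steps;
}

// Run the planned steps in order, feeding each output into later steps.
function execute(steps, values) {
  const state = { ...values };
  for (const step of steps) {
    state[step.output] = step.run(state);
  }
  return state;
}

// A structured intent, as the NLU step might produce it.
const intent = { goal: "Order", values: { FoodName: "fried rice" } };

const steps = plan(intent.goal, new Set(Object.keys(intent.values)));
const result = execute(steps, intent.values);

console.log(steps.map((s) => s.name).join(" -> "));
console.log(result.Order);
```

Nobody wrote the sequence FindRestaurant then PlaceOrder by hand. It falls out of the declared inputs and outputs, which is the reason capsule development focuses on modelling rather than on hard-coded control flow.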

TSMRA, Suwon, 2019.

Author: Risman Adnan

A simple geek who loves code and creating software.
