The Design Shift: Building Applications in the Era of Large Language Models

Author(s): Jun Li

Originally published on Towards AI.

Photo by Martin Martz on Unsplash

A new trend has recently reshaped our approach to building software applications: the rise of large language models (LLMs) and their integration into software development. LLMs are much more than just another tool in our toolkit; they represent a significant shift towards applications that can understand, interact with, and respond to human language in remarkably intuitive ways. Luckily, as technology evolves, it becomes much easier for engineers to adopt new technologies as black boxes: we don't need to know precisely the complex algorithms and logic inside to apply them to our applications.

In this article, I aim to share my experience as a software engineer using a question-driven approach to designing an LLM-powered application. The design process is similar to that of a traditional application, but it accounts for the characteristics and components specific to LLM-powered applications. Let's look at the characteristics first.

LLM-powered application characteristics

I'll compare the characteristics of traditional and LLM-powered applications from the input and output perspectives, since handling inputs and producing outputs is essentially what a software application does.

Generally, a traditional application has structured inputs that require users to enter data in a predefined format, such as filling out forms or selecting options. It usually lacks contextual understanding and struggles to interpret inputs outside its programmed scope, leading to rigid interaction flows. An LLM-powered application, by contrast, can process inputs in natural language, including text and voice, allowing for more intuitive user interactions. It can understand inputs in a broader context, adapt to user preferences, and handle the ambiguities of natural language. It can additionally gather inputs from sources beyond direct user interaction, such as documents and web content, interpreting them to generate actionable insights.

On the output side, a traditional application usually has structured outputs that deliver information or responses in a fixed format, which can limit its flexibility in presenting data and engaging users. Its personalization is limited, often rule-based, and requires predefined user segments or profiles. The outputs and user flows are designed and programmed in advance, offering limited adaptability to user behaviour or preferences. An application powered by LLMs generates natural language outputs that can be tailored to the context of the interaction, user preferences, and query complexity. It can dynamically personalize responses and content based on ongoing interactions, user history, and inferred user needs. Additionally, it can return structured response formats such as JSON when the user's query calls for it, which can then drive further actions such as function calling. Furthermore, it can create flexible user flows that adapt in real time to the user's input, questions, or actions, making the interaction experience more intuitive.
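To make the JSON-output idea concrete, here is a minimal sketch in Python. The call_llm helper, the get_weather function, the prompt, and the JSON shape are all invented for illustration; call_llm is stubbed with a canned response so the sketch runs as written.

import json

def call_llm(prompt: str) -> str:
    # Hypothetical helper: a real app would send the prompt to a model provider.
    # Stubbed with a canned response so this sketch runs as-is.
    return '{"function": "get_weather", "arguments": {"city": "Berlin"}}'

def get_weather(city: str) -> str:
    # Hypothetical downstream function triggered by the model's output.
    return f"Sunny in {city}"

# Ask the model to answer in a fixed JSON shape so the output is machine-readable.
prompt = (
    "Extract the user's intent from the message below and respond ONLY with JSON "
    'of the form {"function": "<name>", "arguments": {...}}.\n'
    "Message: What's the weather like in Berlin today?"
)

raw = call_llm(prompt)
call = json.loads(raw)  # a real app should validate and handle malformed JSON

if call["function"] == "get_weather":
    print(get_weather(**call["arguments"]))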
LLM-powered application core structure

From these characteristics, we can conclude that, essentially, an LLM-powered application collects inputs in natural language and lets the LLMs generate the outputs; we use various techniques to improve the inputs so that the LLMs generate better outputs, and we transform the outputs as needed to meet our business requirements. So we can abstract a general LLM-powered application into a core structure comprising four components: 'Inputs', 'Input Processing', 'LLM Integration/Orchestration', and 'Output Processing and Formatting', as the diagram below shows.

LLM-powered application core structure — Diagram by the author

Inputs

Inputs arrive as natural language, usually a question, instruction, query, or command that specifies what the user wants from the model. These inputs are called prompts and are designed for LLMs like GPT to generate responses.

Input processing

This step processes and transforms the inputs, formats the requests into more structured inputs, and crafts the prompts using prompt-engineering practices to guide the model towards more accurate and meaningful responses. In some applications, it also retrieves content from external data sources and supplies it alongside the original queries so the models can generate more precise, domain-specific responses; this technique is called Retrieval-Augmented Generation (RAG). A minimal sketch of this retrieve-then-prompt flow appears after the component descriptions below.

LLM integration/orchestration

This component handles the technical integration with the LLM, managing the interaction and orchestrating the flow of information to and from the model. It sends processed inputs to the LLM and handles the model's outputs. It controls the routing to the LLMs and can chain one or multiple models together to generate optimized responses.

Output processing and formatting

This stage refines the LLM's raw outputs into a presentable and useful format for the end user, which may include formatting text, summarizing information, or converting outputs into actionable insights.
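As promised above, here is a minimal sketch of the four core components wired together as a Python pipeline, including the RAG step. The retrieve and call_llm helpers, the document store, and the prompt wording are hypothetical stand-ins, stubbed so the example runs end to end; a real application would back them with a vector store and a model provider.

# A minimal sketch of the four core components. All helpers are
# hypothetical stand-ins, stubbed so the example runs end to end.

DOCS = {
    "deploy": "Internal guide: deploy services with `infra deploy --env prod`.",
    "auth": "Internal guide: request API tokens via the developer portal.",
}

def retrieve(query: str) -> str:
    # Stand-in retriever: a real app would query a vector store or search index.
    return "\n".join(text for key, text in DOCS.items() if key in query.lower())

def call_llm(prompt: str) -> str:
    # Stand-in model call: a real app would send the prompt to an LLM.
    return f"(model answer based on a prompt of {len(prompt)} characters)"

def answer(user_query: str) -> str:
    # 1. Inputs: a natural-language query from the user.
    # 2. Input processing: retrieve context and craft the prompt (RAG).
    context = retrieve(user_query)
    prompt = (
        "Answer the engineering question using only the context below.\n"
        f"Context:\n{context}\n\nQuestion: {user_query}"
    )
    # 3. LLM integration/orchestration: send the processed input to the model.
    raw_output = call_llm(prompt)
    # 4. Output processing and formatting: shape the raw output for the UI.
    return raw_output.strip()

print(answer("How do I deploy my service?"))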
Design LLM-powered application

When designing an LLM-powered application, we can expand the core structure described above with corresponding components and sub-applications/systems to meet complex real-world requirements. The application can be a chatbot that supports interactive conversations, a copilot that offers assistance inside other applications, or an agent acting autonomously or semi-autonomously to perform tasks, make decisions, or provide recommendations based on interpreting large volumes of data.

When I design an application, I prefer to use a set of questions that help me create a design that meets the business requirements. I plan to use the same approach to design an LLM-powered application while keeping the core structure in mind. To illustrate this, I will walk you through the design of an example application called the "Smart Engineering Knowledge Assistant". This application aims to help engineers and developers query extensive technical knowledge more accurately using natural language, including code examples, API usage, and documentation from outside or within an organization such as a corporation. Additionally, it will offer the ability to generate code and interact with APIs based on the insights gained. Now, let's get started on the journey.

Q1: How can the end users interact with the application?

This question maps to the "Inputs" component in the core structure. Based on the application requirements, the application will provide a conversational UI, like a chatbot, that makes it easy for engineers to interact using natural language. We can offer some predefined keywords as patterns the application can recognize to generate the corresponding prompts for the LLMs or to select a specific model (a small sketch of this routing follows below). The users will see the responses generated by the LLMs in the UI […]
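To illustrate the keyword routing mentioned in Q1, the sketch below maps predefined keywords to a prompt template and a model choice. The keywords, model names, and templates are all invented for illustration, not a real API.

# Hypothetical keyword routing: a predefined keyword picks a prompt
# template and a model; anything else falls through to a default.

ROUTES = {
    "/code": ("code-model", "Write code for this request:\n{query}"),
    "/docs": ("chat-model", "Answer from the documentation:\n{query}"),
}
DEFAULT = ("chat-model", "{query}")

def route(user_input: str) -> tuple[str, str]:
    # Return (model_name, prompt) based on a leading keyword, if any.
    keyword, _, rest = user_input.partition(" ")
    model, template = ROUTES.get(keyword, DEFAULT)
    query = rest if keyword in ROUTES else user_input
    return model, template.format(query=query)

print(route("/code parse a CSV file in Python"))
print(route("How do I request an API token?"))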