slack-bolt

1 posts

line

Implementing a RAG-Based Bot to (opens in new tab)

To address the operational burden of handling repetitive user inquiries for the AWX automation platform, LY Corporation developed a support bot utilizing Retrieval-Augmented Generation (RAG). By combining internal documentation with historical Slack thread data, the system provides automated, context-aware answers that significantly reduce manual SRE intervention. This approach enhances service reliability by ensuring users receive immediate assistance while allowing engineers to focus on high-priority development tasks. ### Technical Infrastructure and Stack * **Slack Integration**: The bot is built using the **Bolt for Python** framework to handle real-time interactions within the company’s communication channels. * **LLM Orchestration**: **LangChain** is used to manage the RAG pipeline; the developers suggest transitioning to LangGraph for teams requiring more complex multi-agent workflows. * **Embedding Model**: The **paraphrase-multilingual-mpnet-base-v2** (SBERT) model was selected to support multi-language inquiries from LY Corporation’s global workforce. * **Vector Database**: **OpenSearch** serves as the vector store, chosen for its availability as an internal PaaS and its efficiency in handling high-dimensional data. * **Large Language Model**: The system utilizes **OpenAI (ChatGPT) Enterprise**, which ensures business data privacy by preventing the model from training on internal inputs. ### Enhancing LLM Accuracy through RAG and Vector Search * **Overcoming LLM Limits**: Traditional LLMs suffer from "hallucinations," lack of up-to-date info, and opaque sourcing; RAG fixes this by providing the model with specific, trusted context during the prompt phase. * **Embedding and Vectorization**: Textual data from wikis and chats are converted into high-dimensional vectors, where semantically similar phrases (e.g., "Buy" and "Purchase") are stored in close proximity. * **k-NN Retrieval**: When a user asks a question, the bot uses **k-Nearest Neighbors (k-NN)** algorithms to retrieve the top *k* most relevant snippets of information from the vector database. * **Contextual Generation**: Rather than relying on its internal training data, the LLM generates a response based specifically on the retrieved snippets, leading to higher accuracy and domain-specific relevance. ### AWX Support Bot Workflow and Data Sources * **Multi-Source Indexing**: The bot references two main data streams: the official internal AWX guide wiki and historical Slack inquiry threads where previous solutions were discussed. * **Automated First Response**: The workflow begins when a user submits a query via a Slack workflow; the bot immediately processes the request and provides an initial AI-generated answer. * **Human-in-the-Loop Validation**: After receiving an answer, users can click "Issue Resolved" to close the ticket or "Call AWX Admin" if the AI's response was insufficient. * **Efficiency Gains**: This tiered approach filters out "RTFM" (Read The F***ing Manual) style questions, ensuring that human administrators only spend time on unique or complex technical issues. Implementing a RAG-based support bot is a highly effective strategy for SRE teams looking to scale their internal support without increasing headcount. For the best results, organizations should focus on maintaining clean internal documentation and selecting embedding models that reflect the linguistic diversity of their specific workforce.