For: Me
Project Goal
This project aimed to create a conversational chatbot that answers questions about me (Paco) using a Retrieval-Augmented Generation (RAG) system. The goal was to provide accurate, context-aware responses by pulling information from a knowledge base (or ‘corpus’) of documents, which lets the chatbot respond to a wide variety of queries dynamically.
You can test it right now on my homepage!
How it was built
The application uses Flask to create a REST API. When a user sends a question to the /chat endpoint, the app first decides whether to use ‘auto’ mode (searching all available knowledge sources) or ‘manual’ mode (where the user pre-selects specific knowledge sources). The core of the system is Vertex AI’s RAG capabilities: it retrieves relevant document chunks from the selected knowledge sources, combines them into a rich context, and then uses a generative model (like Gemini) to formulate a response. The retrieved context is prepended to the user’s question in the final prompt, ensuring the model is well-informed. The Flask app is containerized with Docker and runs under Gunicorn in the server environment.
Technologies used
- Flask: For creating the REST API.
- Flask-CORS: To handle Cross-Origin Resource Sharing, allowing for web client communication.
- Python-dotenv: For loading environment variables.
- Vertex AI (Google Cloud): For RAG functionality and generative models.
- Gunicorn: For running the Flask app in a production environment.
- Docker: For containerizing the application.
- Google Cloud Logging: For logging application events.
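As a sketch of how the Docker and Gunicorn pieces fit together (the module path `main:app`, port, and Python version are assumptions, not the actual config), the Dockerfile might look like:

```dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
# Gunicorn serves the Flask "app" object; "main:app" is an assumed module path.
CMD ["gunicorn", "--bind", "0.0.0.0:8080", "main:app"]
```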