Description:
The LLM & RAG Solutions Architect at BlackStone eIT will be responsible for designing and implementing solutions that leverage Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) techniques. This role focuses on creating innovative solutions that enhance data retrieval, natural language processing, and information delivery for our clients.
Responsibilities:
• Develop architectures that incorporate LLM and RAG technologies to improve client solutions.
• Collaborate with data scientists, engineers, and business stakeholders to understand requirements and translate them into effective technical solutions.
• Design and implement workflows that integrate LLMs with existing data sources for enhanced information retrieval.
• Evaluate and select appropriate tools and frameworks for building and deploying LLM and RAG solutions.
• Conduct research on emerging trends in LLMs and RAG to inform architectural decisions.
• Ensure the scalability, security, and performance of LLM and RAG implementations.
• Provide technical leadership and mentorship to development teams in LLM and RAG best practices.
• Develop and maintain comprehensive documentation on solution architectures, workflows, and processes.
• Engage with clients to communicate technical strategies and educate them on the benefits of LLM and RAG.
• Monitor and troubleshoot implementations to ensure optimal operation and address issues as they arise.
Requirements:
Resource Requirement – AI/Multi-Agent Chatbot Architect (RAG & On-Prem LLM)
We are looking to onboard a specialized technical resource with the following expertise:
- Proven Experience in Multi-Agent Chatbot Architectures:
Hands-on experience designing and implementing multi-agent conversational systems that allow for scalable, modular interaction handling.
- On-Premise LLM Integration:
Demonstrated capability in deploying and integrating large language models (LLMs) in on-premise environments, ensuring data security and compliance.
- RAG (Retrieval-Augmented Generation) Implementation:
Prior experience in successfully implementing RAG pipelines, including knowledge of embedding strategies, vector databases, document chunking, and query optimization.
- RAG Optimization:
Deep understanding of optimizing RAG systems for performance and relevance, including latency reduction, caching strategies, embedding quality improvements, and hybrid retrieval techniques.
Optional but preferred:
- Familiarity with open-source LLMs (e.g., LLaMA, Qwen, Mistral, Falcon)
- Experience with vector DBs such as VectorDB, FAISS, Weaviate, Qdrant, etc.
- Workflow orchestration using frameworks like LangChain, LlamaIndex, Haystack, etc.
Benefits:
- Paid Time Off
- Performance Bonus
- Training & Development