Production RAG Architecture: Building Scalable Vector Search Systems

Three months into production, our RAG system started failing at 2 AM. Not gracefully: complete outages. The problem wasn’t the models or the embeddings. It was the architecture. After rebuilding it twice, here’s what I learned about building RAG systems that actually work in production.

Figure 1: Production RAG Architecture Overview

The Night Everything Broke

It was […]

CDA (Clinical Document Architecture): The XML Standard for Medical Documents

What is CDA and Why It Matters
CDA Document Structure
Sample CDA Document Structure
.NET CDA Parsing Implementation
CDA Document Generation
Common CDA Sections (C-CDA)
CDA vs FHIR Documents
Standards and References
Related Articles in This Series
Conclusion

Azure Machine Learning: A Solutions Architect’s Guide to Enterprise MLOps

The journey from experimental machine learning models to production-ready AI systems represents one of the most challenging transitions in modern software engineering. Having spent over two decades architecting enterprise solutions, I’ve witnessed the evolution from manual model deployment to sophisticated MLOps platforms. Azure Machine Learning stands at the forefront of this transformation, offering a comprehensive […]

Difference Between Workload Managed Identity, Pod Managed Identity, and AKS Managed Identity

Azure Kubernetes Service (AKS) offers several options for managing identities within Kubernetes clusters, including AKS Managed Identity, Pod Managed Identity, and Workload Managed Identity. Here’s a comparison of these three options:

Key Features | AKS Managed Identity | Pod Managed Identity | Workload Managed Identity
Overview | A built-in feature of AKS that allows you to assign an Azure AD […]

Private Kubernetes cluster in AKS with Azure Private Link

Today, we’ll take a look at a new feature in AKS called Azure Private Link, which allows you to connect to AKS securely and privately over the Microsoft Azure backbone network. In the past, connecting to AKS from an on-premises network or another virtual network required using a public IP address, which posed potential security […]

Running LLMs on Kubernetes: Production Deployment Guide

Deploying LLMs on Kubernetes requires careful planning. After deploying 25+ LLMs on Kubernetes, I’ve learned what works. Here’s the complete guide to running LLMs on Kubernetes in production.

Figure 1: Kubernetes LLM Architecture

Why Kubernetes for LLMs

Kubernetes offers significant advantages for LLM deployment:

Scalability: Auto-scale based on demand
Resource management: Efficient GPU and […]
