Technology Engineering – Page 8 – C4: Container, Code, Cloud & Context

LLM Fine-Tuning Techniques: From LoRA to Full Parameter Training

Posted on February 28, 2025 by Nithin Mohan TK · 19 min read

Introduction: Fine-tuning transforms general-purpose LLMs into specialized models that excel at your specific tasks. While prompting can get you far, fine-tuning unlocks capabilities that prompting alone cannot achieve: consistent output formats, domain-specific knowledge, reduced latency from shorter prompts, and behavior that would require extensive few-shot examples. This guide covers the practical aspects of LLM fine-tuning: […]

The Rise of GitOps: Automating Deployment and Improving Reliability

Posted on February 16, 2025 by Nithin Mohan TK · 11 min read

GitOps is an approach to software delivery: a set of practices for managing and deploying infrastructure and applications using Git as the single source of truth. In this blog post, we will explore the concept of GitOps, its key benefits, and some examples […]

Batch Inference Optimization: Maximizing Throughput and Minimizing Costs

Posted on February 8, 2025 by Nithin Mohan TK · 18 min read

Introduction: Batch inference optimization is critical for cost-effective LLM deployment at scale. Processing requests individually wastes GPU resources: each forward pass reads the full set of model weights from GPU memory but applies them to only a single sequence. Batching multiple requests together amortizes this overhead, improving throughput and reducing per-request costs. This guide covers the techniques that make batch inference efficient: dynamic batching strategies, […]

GitOps with a comparison between Flux and ArgoCD and which one is better for use in Azure AKS

Posted on February 6, 2025 by Nithin Mohan TK · 4 min read

GitOps has emerged as a powerful paradigm for managing Kubernetes clusters and deploying applications. Two popular tools for implementing GitOps in Kubernetes are Flux and ArgoCD. Both offer similar functionality, but they differ in architecture, ease of use, and integration with cloud platforms such as Azure AKS. In this blog, we will […]

LLM Monitoring and Alerting: Building Observability for Production AI Systems

Posted on February 3, 2025 by Nithin Mohan TK · 20 min read

Introduction: LLM monitoring is essential for maintaining reliable, cost-effective AI applications in production. Unlike traditional software where errors are obvious, LLM failures can be subtle—degraded output quality, increased hallucinations, or slowly rising costs that go unnoticed until the monthly bill arrives. Effective monitoring tracks latency, token usage, error rates, output quality, and cost metrics in […]

Structured Output from LLMs: JSON Mode, Function Calling, and Pydantic Patterns (Part 1 of 2)

Posted on February 2, 2025 by Nithin Mohan TK · 12 min read

Introduction: Getting reliable, structured data from LLMs is one of the most practical challenges in building AI applications. Whether you’re extracting entities from text, generating API parameters, or building data pipelines, you need JSON that actually parses and validates against your schema. This guide covers the evolution of structured output techniques—from prompt engineering hacks to […]
