Azure Tips: Service Bus vs Azure Queue

Azure Service Bus is a queuing technology that supports advanced features and can be accessed by processes built with various technologies and running in different domains. It can publish a message to multiple subscribers. Azure Queue is another queuing technology. However, it does not support the ability to publish a message to […]

.NET Core 1.0.1 Update (September 2016) Available

The Microsoft .NET Core team has released an update to .NET Core 1.0, versioned as “.NET Core 1.0.1”. For details, see the Microsoft Developer Announcement Blog post: Announcing September 2016 Updates for .NET Core 1.0. You can read the release notes for .NET Core, ASP.NET Core and Entity Framework 1.0.1 to learn about the specific changes that […]

LLM Inference Optimization: From KV Cache to Speculative Decoding

Introduction: LLM inference optimization is the art of making models respond faster while using fewer resources. As LLMs grow larger and usage scales, the difference between naive and optimized inference can mean a 10x cost reduction and sub-second latency instead of multi-second waits. This guide covers the techniques that matter most: KV cache optimization to avoid […]