Kubernetes is taking the app development world by storm. Earlier this month, we shared that the Azure Kubernetes Service (AKS) was the fastest growing compute service in Azure’s history. Customers like Siemens Healthineers, Finastra, Maersk, and Hafslund are realizing the benefits of using AKS to easily deploy, manage, and scale applications without the toil of maintaining infrastructure. As the community and adoption grow, Kubernetes itself is evolving, adding more enterprise-friendly features and extending to more scenarios. The release of production-level support for Windows Server containers is a true testament to that evolution.
Today, we’re excited to announce the preview of Windows Server containers in Azure Kubernetes Service (AKS) for the latest versions, 1.13.5 and 1.14.0. With this, Windows Server containers can now be deployed and orchestrated in AKS enabling new paths to migrate and modernize Windows Server applications in Azure.
Our customers have applications running on Linux and on Windows. You have been asking for the ability to manage Windows and Linux containers side by side in the same Kubernetes cluster with the exact same APIs, tools, and support, and this opens an abundance of new scenarios. For example, you can now add Windows node pools to an existing virtual network, or deploy a Linux container running a reverse proxy or Redis cache alongside an IIS application in a Windows container in the same Kubernetes cluster, even as part of the same application, all with a consistent monitoring experience and consistent deployment pipelines.
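As a rough illustration of the side-by-side scenario, the sketch below builds two minimal Deployment manifests as plain Python dictionaries and pins each one to nodes of the right operating system with a node selector. It assumes the well-known `kubernetes.io/os` node label (some cluster versions may expose `beta.kubernetes.io/os` instead); the deployment names and image references are hypothetical placeholders, not taken from the announcement.

```python
import json


def deployment(name: str, image: str, os_name: str) -> dict:
    """Build a minimal apps/v1 Deployment pinned to nodes of one OS."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    # Assumption: nodes carry the kubernetes.io/os label.
                    "nodeSelector": {"kubernetes.io/os": os_name},
                    "containers": [{"name": name, "image": image}],
                },
            },
        },
    }


# Hypothetical workloads: a Linux cache and a Windows IIS site in one cluster.
manifests = [
    deployment("redis-cache", "redis:5", "linux"),
    deployment("iis-site", "mcr.microsoft.com/windows/servercore/iis", "windows"),
]

if __name__ == "__main__":
    print(json.dumps(manifests, indent=2))
```

The resulting JSON could be applied with the usual Kubernetes tooling; the only OS-specific part of each manifest is the node selector.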
Running Windows Server containers in AKS (preview) also means you can keep taking advantage of many existing Azure services and features that are helping make Kubernetes application development and management much easier, such as:
Manage the lifecycle of Linux and Windows containers easily through Azure Container Registry, which pre-stages all container base images. To reduce network latency or meet rigorous compliance needs, Container Registry can automatically geo-replicate the container images to the data center close to where your users are.
Deliver applications faster on any OS with a standardized deployment pipeline. Azure DevOps integration with AKS helps automate validation, testing, canary releases, and ultimately production deployment in just a few steps.
Gain insights into the performance and health of your Kubernetes cluster and workloads with a comprehensive monitoring experience using Azure Monitor.
Now is the time to get started with Windows Server containers in Azure Kubernetes Service (preview) and we look forward to your feedback on these new features and experiences! If you are new to Kubernetes, check out these short Kubernetes whiteboard videos with Brendan Burns, one of the co-founders of Kubernetes, so you can learn how it works for both Windows and Linux!
We would like to take a moment to thank every contributor and customer, without whom today’s announcement would not be possible. We are proud to be part of the broader, vibrant Kubernetes community.
Optimizing compute resource allocation to achieve performance goals while controlling costs can be a challenging balance to strike especially for database workloads with complex usage patterns. To help address these challenges, we are pleased to announce the preview of Azure SQL Database serverless. SQL Database serverless (preview) is a new compute tier that optimizes price-performance and simplifies performance management for databases with intermittent and unpredictable usage. Line-of-business applications, dev/test databases, content management, and e-commerce systems are just some examples across a range of applications that often fit the usage pattern ideal for SQL Database serverless. SQL Database serverless (preview) is also well-suited for new applications with compute sizing uncertainty or workloads requiring frequent rescaling in order to reduce costs. The serverless compute tier enjoys all the fully managed, built-in intelligence benefits of SQL Database and helps accelerate application development, minimize operational complexity, and lower total costs.
SQL Database serverless (preview) automatically scales compute for single databases based on workload demand and bills for compute used per second. Serverless contrasts with the provisioned compute tier in SQL Database which allocates a fixed amount of compute resources for a fixed price and is billed per hour. Over short time scales, provisioned compute databases must either over-provision resources at a cost in order to accommodate peak usage or under-provision and risk poor performance. Over longer time scales, provisioned compute databases can be rescaled, but this solution may require predicting usage patterns or writing custom logic to trigger rescaling operations based on a schedule or performance metrics. This adds to development and operational complexity. In serverless, compute scaling within configurable limits is managed by the service to continuously right-size resources. Serverless also provides an option to automatically pause the database during inactive usage periods and automatically resume when activity returns.
Pay only for compute used
In SQL Database serverless (preview), compute is only billed based on the amount of CPU and memory used per second. While the database is paused only storage is billed, providing additional price optimization benefit.
Consider a line-of-business application or a dev/test database that is idle at night, but needs multi-core bursting headroom throughout the day. In this example, the application is using SQL Database serverless (preview) configured to allow auto-pausing and auto-scaling up to four vcores and has the following usage pattern over a 24 hour period:
As can be seen, database usage corresponds to the amount of compute billed, which is measured in units of vcore seconds and sums to around 46k vcore seconds over the 24-hour period. Suppose the compute unit price for the serverless database is around $0.000073 per vcore second. Then the compute bill for this one-day period is just under $3.40. This is calculated by multiplying the compute unit price by the total number of vcore seconds accumulated. During this time period the database was auto-paused while idle and enjoyed the benefit of bursting episodes up to 80 percent of four vcores without customer intervention. In this example, the price savings from using serverless are significant compared to a provisioned compute database configured with the same four vcore limit.
Note that pricing is discounted for preview. In this example, pricing is based on the East US region in May 2019 and subject to change. For the most up-to-date pricing, please visit the Azure SQL Database pricing page.
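The arithmetic in this example can be reproduced in a few lines. This is only a sketch of the calculation described above, using the approximate figures quoted in the text (about 46,000 vcore seconds and a preview unit price of about $0.000073 per vcore second); the function name is illustrative.

```python
# Figures quoted in the example; both are approximate.
UNIT_PRICE_PER_VCORE_SECOND = 0.000073  # USD, discounted preview price
VCORE_SECONDS_BILLED = 46_000           # total over the 24-hour period


def compute_bill(vcore_seconds: float, unit_price: float) -> float:
    """Compute charge = vcore seconds accumulated x unit price."""
    return vcore_seconds * unit_price


daily_bill = compute_bill(VCORE_SECONDS_BILLED, UNIT_PRICE_PER_VCORE_SECOND)

if __name__ == "__main__":
    print(f"${daily_bill:.2f}")  # prints $3.36
```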
When using SQL Database serverless (preview) there are price-performance trade-offs to consider. These trade-offs are related to the compute unit price and the impact on application performance due to compute warm-up after periods of low or idle usage.
Compute unit price
The compute unit price is higher for a serverless database than for a provisioned compute database since serverless is optimized for workloads with intermittent usage patterns. If CPU or memory usage is high enough and sustained for long enough, then the provisioned compute tier may be less expensive.
Compute warm-up after low usage
While a serverless database is online, memory is gradually reclaimed if CPU or memory usage is low enough for long enough. When workload activity returns, disk IO may be required to rehydrate data pages into the SQL buffer pool, or query plans may need to be recompiled. This memory management policy of reclaiming cache based on low usage is unique to serverless and is done to control customer costs, but it can impact performance. Memory reclamation based on low usage does not occur in the provisioned compute tier for single databases or elastic pools, where this kind of impact can be avoided.
Compute warm-up after pausing
The latency to pause and resume a database is usually around one minute or less during which time the database is offline. After the database is resumed, memory caches need to be rehydrated which adds additional latency before optimal performance conditions return. The idle period that must elapse before auto-pausing occurs can be configured to compensate for this performance impact. Alternatively, auto-pausing can be disabled for workloads sensitive to this impact and still benefit from auto-scaling. Compute minimums are billed while the database is online regardless of usage, and so disabling auto-pausing can increase costs.
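The cost trade-off of disabling auto-pause can be illustrated with a toy billing model. The assumptions here are mine, not the service's exact rules: each online second is billed at the larger of the vcores actually used and a configured minimum, paused seconds incur no compute charge, and memory-based billing is ignored. The 0.5 vcore minimum and the usage trace are invented for illustration.

```python
from typing import Iterable, Optional


def billed_vcore_seconds(usage: Iterable[Optional[float]], min_vcores: float) -> float:
    """Sum billed vcore seconds over a per-second usage trace.

    Each element is the vcores used during that second, or None when the
    database is paused (no compute is billed while paused).
    """
    total = 0.0
    for used in usage:
        if used is None:
            continue  # paused: only storage would be billed
        total += max(used, min_vcores)  # the minimum applies while online
    return total


# Hypothetical trace: one busy minute followed by one idle hour.
busy = [2.0] * 60
idle_online = [0.0] * 3600   # auto-pause disabled, database stays online
idle_paused = [None] * 3600  # auto-pause enabled, database is paused

with_pause = billed_vcore_seconds(busy + idle_paused, min_vcores=0.5)
without_pause = billed_vcore_seconds(busy + idle_online, min_vcores=0.5)

if __name__ == "__main__":
    print(with_pause, without_pause)
```

In this made-up trace the idle hour is free when the database pauses but still accrues the compute minimum when it stays online, which is the cost increase described above.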
Azure SQL Database serverless (preview) is supported in the general purpose tier for single databases.
We continue to expand the Azure Marketplace ecosystem. For this volume, 22 new offers successfully met the onboarding criteria and went live. See details of the new offers below:
Bluefish Editor on Windows Server 2016: Apps4Rent helps you deploy Bluefish Editor on Azure. Bluefish, a free software editor with advanced tools for building dynamic websites, is targeted as a middle path between simple editors and fully integrated development environments.
Corda Opensource VM: R3’s Corda is an open-source blockchain platform that removes costly friction in business transactions by enabling institutions to transact directly using smart contracts and ensures privacy and security.
DataStax Distribution of Apache Cassandra: DataStax offers a simple, cost-effective way to run the Apache Cassandra database in the cloud. DDAC addresses common challenges with adoption, maintenance, and support by streamlining operations and controlling costs.
DataStax Enterprise: DataStax delivers the always-on, active-everywhere, distributed hybrid cloud NoSQL database built on Apache Cassandra. DataStax Enterprise (DSE) makes it easy for enterprises to exploit hybrid and multi-cloud environments via a seamless data layer.
FatPipe WAN Optimization for Azure: Significantly boost wide area network performance with FatPipe WAN optimization, which appreciably increases utilization, providing effective use of bandwidth by caching/compressing that sharply reduces redundant data.
Flexbby One RU Edition: Get a comprehensive solution for complex workflow automation in sales, marketing, service, HR, and legal. Flexbby One is powerful software to help you manage the contract lifecycle, document archiving, procurement, customer service, and more.
Flowmon Collector for Azure: Flowmon Collector serves for collection, storage, and analysis of flow data (NetFlow, IPFIX). Flowmon is a comprehensive platform that includes everything you need to get absolute control over your network through network visibility.
Keycloak Gatekeeper Container Image: Keycloak Gatekeeper is an adapter that integrates with Keycloak authentication supporting access tokens in browser cookie or bearer tokens. This Bitnami Container Image is secure, up-to-date, and packaged using industry best practices.
MIKE Zero: This MIKE modeling suite from DHI A/S helps engineers and scientists who want to model water environments, and includes most of MIKE Powered by DHI's inland and marine software.
System Integrity Management Platform (SIMP) 6.3: SIMP is an open-source framework that can either enhance your existing infrastructure or allow you to quickly build one from scratch. Built on the Puppet product suite, SIMP is designed around scalability, flexibility, and compliance.
2 Hr Workshop: Windows in the Cloud: The planning and knowledge transfer workshop from Steeves gives an overview of the Windows 10 Servicing Model and Lifecycle and should be presented to key stakeholders such as IT management, IT staff, and IT decision makers.
Azure Accelerate: Determine the ROI of moving your workloads into Azure. Azure Accelerate from Blue Chip Consulting will deliver insights into server inventory, financial models, target-state architecture drawings, and detailed cloud roadmaps.
Azure Storage for Archive: 2-Day Implementation: CDW will assist you in enabling an archival solution in Azure, sharing industry-leading practices as well as identifying requirements. CDW will implement and pilot the solution in the production environment.
Azure Tiered Storage: 1-Day Implementation: A highly skilled CDW engineer will assist you in creating storage accounts in Azure for use in conjunction with an on-premises, cloud-enabled storage appliance, resulting in a hybrid cloud storage solution.
CSP Migration: 3-Week Assessment: SHI offers a rapid assessment and migration path for any existing Azure customer to its SHI Cloud Service Provider (CSP) offering. SHI keeps you up and running while ensuring best practices around security and manageability.
CSP Migration: 6-Week Assessment and Migration: Need more time to move? Get this six-week assessment and migration for existing Azure customers to the SHI Cloud Service Provider (CSP) offering. SHI keeps you up and running while ensuring best practices.
Domain Controller in Azure: 1-Day Implementation: CDW will configure up to two Azure IaaS virtual machines with the Microsoft AD DS domain controller role to connect to your existing single forest/single domain AD DS on-premises infrastructure.
Microsoft Azure AI Chatbot Development: This consultation with Cynoteck Technology Solutions will provide suggestions and solutions to help your company identify how to best use chatbots depending on your line of business.
SSO Using ADFS: 2-Day Implementation: CDW’s engineers will install and configure up to two Active Directory Federation Services servers and two ADFS web application proxy servers in a single location, simplifying things for your end users.
Conversational experiences have become the norm, whether you’re looking to track a package or to find out a store’s hours of operation. At Microsoft Build 2019, we highlighted a few customers who are building such conversational experiences using the Microsoft Bot Framework and Azure Bot Service to transform their customer experience.
LaLiga built its own virtual assistant, which allows fans to experience and interact with LaLiga across multiple platforms.
As users become more familiar with bots and virtual assistants, they will invariably expect more from their conversational experiences. For this reason, Bot Framework SDK and tools are designed to help developers be more productive in building conversational AI solutions. Here are some of the key announcements we made at Build 2019:
Bot Framework SDK and tools
The Bot Framework SDK now supports adaptive dialogs (preview). Adaptive dialog dynamically updates conversation flow based on context and events. Developers can define actions, each of which can have a series of steps defined by the result of events happening in the conversation to dynamically adjust to context. This is especially handy when dealing with conversation context switches and interruptions in the middle of a conversation. Adaptive dialog combines input recognition, event handling, model of the conversation (dialog) and output generation into one cohesive, self-contained unit. The diagram below depicts how adaptive dialogs can allow a user to switch contexts. In this example, a user is looking to book a flight, but switches context by asking for weather related information which may influence travel plans.
Developers can compose conversational experiences by stitching together re-usable conversational capabilities, known as skills. Implemented as Bot Framework bots, skills include language models, dialogs, and cards that are reusable across applications. Current skills, available in preview, include Email, Calendar, and Points of Interest.
Within an enterprise, skills now let you integrate multiple sub-bots owned by different teams into a central bot, or more broadly leverage common capabilities provided by other developers. With the preview of skills, developers can create a new bot (from the Virtual Assistant template) and add or remove skills with a single command-line operation that incorporates all dispatch and configuration changes. Get started with skill developer templates (.NET, TS).
Virtual assistant solution accelerator
The Enterprise Template is now the Virtual Assistant Template, allowing developers to build a virtual assistant that comes out of the box with skills, adaptive cards, a TypeScript generator, updated conversational telemetry and Power BI analytics, and ARM-based automated Azure deployment. It also provides a C# template that is simplified and aligned to the ASP.NET MVC pattern with dependency injection. Developers who have already made use of the Enterprise Template and want to use the new capabilities can follow these steps to get started quickly.
The Bot Framework Emulator has released a preview of the new Bot Inspector feature: a way to debug and test your Bot Framework SDK v4 bots on channels like Microsoft Teams, Slack, Cortana, Facebook Messenger, Skype, etc. As you have the conversation, messages will be mirrored to the Bot Framework Emulator where you can inspect the message data that the bot received. Additionally, a snapshot of the bot state for any given turn between the channel and the bot is rendered as well. You can inspect this data by clicking on the "Bot State" element in the conversation mirror. Read more about Bot Inspector.
Language generation (preview)
Language generation streamlines the creation of smart and dynamic bot responses by constructing meaningful, variable, and grammatically correct responses that a bot can send back to the user. Visit the GitHub repo for more details.
We’ve simplified the process of deploying a bot. Using a pre-defined Bot Framework v4 template, you can create a bot from any published QnA Maker knowledge base. Not only can you create a complex QnA Maker knowledge base in minutes, but you can also deploy it to supported channels like Teams, Skype, or Slack in minutes.
Language Understanding (LUIS)
Language Understanding has added several features that let developers extract more detailed information from text, so users can now build more intelligent solutions with less effort.
Roles for any entity type
We have extended roles to all entity types, which allows the same entities to be classified with different subtypes based on context.
New visual analytics dashboard
There’s now a more detailed, visually rich, comprehensive analytics dashboard. Its user-friendly design highlights common issues most users face when designing applications and gives simple explanations of how to resolve them. This helps users gain more insight into their models’ quality and potential data problems, and offers guidance on adopting best practices.
Data is ever-changing and different from one end-user to another. Developers now have more granular control of what they can do with Language Understanding, including being able to identify and update models at runtime through dynamic lists and external entities. Dynamic lists are used to append to list entities at prediction time, permitting user-specific information to get matched exactly.
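To make the idea of dynamic lists concrete, here is a hedged sketch of a prediction request body that appends user-specific entries to a list entity at prediction time. The field names (`dynamicLists`, `listEntityName`, `requestLists`, `canonicalForm`, `synonyms`) reflect my understanding of the v3 prediction API and should be checked against the current documentation; the entity name, query, and contact data are hypothetical.

```python
import json
from typing import Dict, List


def build_prediction_request(query: str, list_entity: str,
                             entries: Dict[str, List[str]]) -> dict:
    """Build a request body that extends one list entity for this call only.

    `entries` maps each canonical form to the synonyms that should match it.
    Field names are an assumption based on the v3 prediction API shape.
    """
    return {
        "query": query,
        "dynamicLists": [
            {
                "listEntityName": list_entity,
                "requestLists": [
                    {"canonicalForm": canonical, "synonyms": synonyms}
                    for canonical, synonyms in entries.items()
                ],
            }
        ],
    }


# Hypothetical example: match names from one user's own contact list.
body = build_prediction_request(
    "call Bobby",
    "Contacts",
    {"Robert Smith": ["Bob", "Bobby"]},
)

if __name__ == "__main__":
    print(json.dumps(body, indent=2))
```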
Read more about the new Language Understanding features, available through our new v3 API, in our docs. Customers like BMW, Accenture, Vodafone, and LaLiga are using Azure to build sophisticated bots faster and find new ways to connect with their customers.
With these enhancements, we are delivering value across the Microsoft Bot Framework SDK and tools, Language Understanding, and QnA Maker to help developers become more productive in building a variety of conversational experiences.
We look forward to seeing what conversational experiences you will build for your customers. Get started today!
This post is part of a 2-part series about how organizations are using Azure Cosmos DB to meet real world needs, and the difference it’s making for them. In part 1, we explored the challenges that led the Microsoft 365 usage analytics team to take action, the architecture of the new solution, and migration of the production workload. In this post, we’ll examine additional implementation details and the outcomes resulting from the team’s efforts.
Finding the right partition key—a critical design decision
After moving to Azure Cosmos DB, the team revisited how data would be partitioned (referred to as “sharding” in MongoDB). With Azure Cosmos DB, each collection must have a partition key, which acts as a logical partition for the data and provides Azure Cosmos DB with a natural boundary for distributing data across partitions. The data for a single logical partition must reside inside a single physical partition. Physical partitions are managed internally by Azure Cosmos DB.
The Microsoft 365 usage analytics team worked closely with the Azure Cosmos DB team to optimize data distribution in a way that would ensure high performance. The team initially tried the same approach as they used with MongoDB, which was using a random GUID as the partition key. However, this required scanning all of the partitions for reads and over allocating resources for writes, making writes fast but reads slow. The team then tried using Tenant ID as the partition key but found that the vast difference in the amount of report data for each tenant made some partitions too hot, which would have required throttling, while others remained cold.
The solution lay in creating a synthetic partition key. In the end, the team solved both the slow read and too hot and too cold issues by grouping 100 documents per tenant ID into a bucket and then using a combination of tenant IDs and bucket IDs as the partition key. The bucket ID loops from 1 to n, where n is a variable and can be adjusted for each report.
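One way the synthetic key described above might be computed is sketched below. The post does not give the exact formula, so the details are assumptions: documents are numbered from zero within a tenant, every 100 consecutive documents share a bucket, the bucket ID wraps around after n buckets, and the two parts are joined with an underscore.

```python
DOCS_PER_BUCKET = 100  # documents grouped into one bucket, per the text


def bucket_id(doc_index: int, n_buckets: int) -> int:
    """Return a bucket ID in 1..n_buckets for a tenant's doc_index-th
    document (0-based), looping back to 1 after n_buckets buckets."""
    return (doc_index // DOCS_PER_BUCKET) % n_buckets + 1


def partition_key(tenant_id: str, doc_index: int, n_buckets: int) -> str:
    """Combine tenant ID and bucket ID into a synthetic partition key.

    The underscore separator is an illustrative choice.
    """
    return f"{tenant_id}_{bucket_id(doc_index, n_buckets)}"


if __name__ == "__main__":
    # Hypothetical tenant with n = 3 buckets for this report.
    for index in (0, 99, 100, 250, 300):
        print(index, partition_key("tenant-a", index, 3))
```

Large tenants are spread across up to n logical partitions instead of one hot partition, while a read for a given tenant needs to touch at most n known keys rather than every partition.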
Handling four terabytes of new data every day
In one region alone, more than 6 TB of data is stored in Azure Cosmos DB, with 4 TB of that written and refreshed daily. Both of those numbers are continuing to grow. The database consists of more than 50 different collections, and the largest is more than 300 GB in size. It consumes an average of 150,000 request units per second (RU/s) of throughput, scaling this number up and down as needed.
The different collections map closely to the different reports that the system serves, which in turn have different throughput requirements. This design enables the Microsoft 365 usage analytics team to optimize the number of RU/s that are allocated to each collection (and thus to each report), and to elastically scale that throughput up or down on a per-collection and per-report basis.
Built-in, cost-effective scalability and performance
With Azure Cosmos DB, the Microsoft 365 usage analytics team is delivering real-time customer insights with less maintenance, better performance, and improved availability—all at a lower cost. The new usage analytics system can now easily scale to handle future growth in the number of Office 365 commercial customers. All that was accomplished in less than five months, without any service interruptions. “The benefits of moving from MongoDB to Azure Cosmos DB more than justify the effort that it took,” says Guo Chen, Principal Software Development Manager on the Microsoft 365 usage analytics team.
Improved performance and service availability
The team’s use of built-in, turnkey geo-distribution provided a way to easily distribute reads and writes across two regions. Combined with the other work done by the team, such as rewriting the data access layer using the Azure Cosmos DB Core (SQL) API, this enabled the team to reduce the time for the majority of reads from 12 milliseconds to 3 milliseconds. The image below illustrates this performance improvement.
Although this difference may seem negligible in the context of viewing a report, it resulted in significant service improvements. “There are two ways to access reporting data in the usage analytics system: through the Microsoft 365 admin center, and through Microsoft Graph,” explains Xiaodong Wang, a Software Engineer on the Microsoft 365 usage analytics team. “In the past, people complained that the Graph API was too slow. That’s no longer an issue. In addition, service availability is better now because the chances of any query timing-out are reduced.”
The image below shows just how much service availability is improved. The graph illustrates successful API requests divided by the total API requests and shows that the system is now delivering a service availability level of greater than 99.99 percent.
Zero maintenance and administration
Because Azure Cosmos DB is a fully managed service, the Office 365 development team no longer needs to devote one full-time person to database maintenance and administration. Annual certificate maintenance is no longer a burden, and VMs no longer need to be restarted weekly to protect against any compromises in service availability.
“In the past, with MongoDB, we had to allocate core developer resources to administrative management of the data store,” says Shilpi Sinha, Principal Program Manager on the Microsoft 365 usage analytics team. “Now that we are running on a fully managed service, we are able to repurpose developer resources towards adding new customer value instead of managing the infrastructure.”
The Microsoft 365 usage analytics team can now scale database throughput up or down on demand, as needed to accommodate a fluctuating workload that on average, is growing at a rate of 8 percent every three months. By simply adjusting the number of RU/s allocated to each collection, which can be done in the Azure portal or programmatically, the team can easily scale up during heavy data-ingestion periods to handle new reports, and most importantly, to accommodate continued overall growth of Office 365 around the world.
“Today, all we need to do is keep an eye on request unit usage versus what we have budgeted,” says Wang. “If we’re reaching capacity, we can allocate more RU/s in just a few minutes. We don’t have to pay for spare capacity until we need it and more importantly, we no longer need to worry whether we can handle future growth in data volumes or report usage.”
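The kind of check Wang describes, comparing request unit usage against the budgeted throughput, could look roughly like the sketch below. The 80 percent headroom threshold is a hypothetical policy, not something stated in the post, and the function only suggests a value; applying it would be done in the Azure portal or through an SDK, which is not shown here.

```python
import math


def recommended_throughput(provisioned_rus: int, peak_consumed_rus: float,
                           headroom: float = 0.8, step: int = 100) -> int:
    """Suggest a provisioned RU/s value that keeps peak usage under
    `headroom` (a fraction) of the provisioned amount.

    Returns the current value when there is already enough headroom;
    otherwise rounds the required value up to a multiple of `step`.
    """
    if peak_consumed_rus <= provisioned_rus * headroom:
        return provisioned_rus
    needed = peak_consumed_rus / headroom
    return int(math.ceil(needed / step) * step)


if __name__ == "__main__":
    # Hypothetical collection provisioned at 10,000 RU/s.
    print(recommended_throughput(10_000, 7_000))  # enough headroom already
    print(recommended_throughput(10_000, 9_000))  # suggests scaling up
```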
On top of all of those benefits, the Microsoft 365 usage analytics team increased data and reporting volumes while reducing its monthly Microsoft Azure bill for the usage analytics system by almost 20 percent. “After we cut over to Azure Cosmos DB, our monthly Azure expenses decreased by almost 20 percent,” says Chen. “We undertook this project to better serve our customers. Being able to save close to a quarter-million dollars per year, and likely more in the future, is like icing on the cake.”
“Usage analytics are offered as part of the base capability to all Microsoft 365 customers, irrespective of the type of subscription they purchase," said Sinha. "Keeping the costs of operating this service as low as possible contributes to our goal of running the overall Microsoft 365 service as efficiently as possible while at the same time giving our customers new and improved insights into how their people are using our services.”
This post is part of a 2-part series about how organizations are using Azure Cosmos DB to meet real-world needs, and the difference it’s making for them. In this first post we explore the challenges that led the Microsoft 365 usage analytics team to take action, the architecture of the new solution, and migration of the production workload. In part 2, we’ll examine additional implementation details and the outcomes resulting from the team’s efforts.
The challenge: Understanding the behavior of more than 150 million active users
Office 365 is a flagship service within the Microsoft 365 Enterprise solution, with millions of commercial customers and more than 150 million active commercial users each month. Office 365 provides extensive reporting for administrators within each company on how the service is being used including license assignment, product-level usage, user-level activity, site activity, group activity, storage consumption, and more. The Microsoft 365 usage analytics team incrementally adds new reports to cover more Office 365 services.
The telemetry data needed to generate such reports was collected in a system called usage analytics, that until recently ran on the community version of MongoDB. The image below shows the data flow, with an importer web service used to write log streams collected in Azure Blob storage to MongoDB. An OData web service exposes APIs to extract the stored data for both reporting within the Microsoft 365 admin center and for access through Microsoft Graph. Every day, as part of a full daily refresh, several billion rows of data were added to the system.
Each of the primary geographies served by Office 365 has an independent usage analytics repository, all employing a similar architecture. In each geography, data was stored on two MongoDB clusters, with each cluster consisting of up to 50 virtual machines (VMs) hosted in Azure Virtual Machines and running MongoDB. The two clusters in each geography functioned in a primary/backup configuration. Data was written separately to both clusters and under normal operation, all reads were performed on the primary cluster.
Each cluster was designed for a write-heavy workload. To speed writes, data was sharded across individual cluster nodes using a random globally unique identifier (GUID) as the MongoDB shard key. Every day for a few hours, new data from Azure Blob storage was written using a multithreaded importer. Each thread wrote batches of 2,000 records at a time to all cluster nodes and waited for all records to finish before starting on the next batch of 2,000.
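The batching behavior of the importer can be sketched as follows. This is a single-threaded illustration under assumptions: `write_batch` is a hypothetical callable standing in for the database write, failures surface as `IOError`, and a failed batch is rewritten in full, which matches the all-or-nothing behavior the post goes on to describe. The retry limit is invented for the example.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List

BATCH_SIZE = 2000  # records per batch, per the text


def batches(records: Iterable[dict], size: int = BATCH_SIZE) -> Iterator[List[dict]]:
    """Yield consecutive lists of at most `size` records."""
    it = iter(records)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def import_records(records: Iterable[dict],
                   write_batch: Callable[[List[dict]], None],
                   max_attempts: int = 3) -> int:
    """Write records one batch at a time, waiting for each batch to finish.

    If a batch write fails, the whole batch is written again. Returns the
    total number of batch write attempts made.
    """
    attempts = 0
    for batch in batches(records):
        for attempt in range(1, max_attempts + 1):
            attempts += 1
            try:
                write_batch(batch)
                break  # batch succeeded, move on to the next one
            except IOError:
                if attempt == max_attempts:
                    raise
    return attempts
```

With 4,500 records this produces batches of 2,000, 2,000, and 500; a single failure anywhere in a batch costs a full rewrite of that batch.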
Problems and pains
This architecture presented several problems for the Microsoft 365 usage analytics team, ranging from excessive administrative effort and costs to limited performance, reliability, availability, and scalability. Some specific pains included:
Poor performance. Reads were inefficient and reports sometimes timed out because the use of a random GUID as a shard key required querying all nodes. In addition, during the few hours each day when new data was imported, writes and reads hit the primary cluster at the same time and performance was poor. To make matters worse, if anything failed during a batch write, which often happened due to internal database errors, all 2,000 records had to be written again.
Full-time administration. Maintenance of the MongoDB clusters was manual and time-consuming, requiring human resources to dedicate time towards managing the clusters. This put an unnecessary resource constraint on the team, which would rather use its bandwidth to bring new reports to market. Plus, bugs in MongoDB 3.2 required all servers to be restarted weekly. And renewing the security certificates on each cluster node within the virtual network had to be completed annually, and required an additional two weeks of effort per cluster. During such routine administrative tasks, if an operation failed on one cluster node, the entire cluster was down until the issue was resolved.
High costs. Significant costs were incurred to run the MongoDB backup clusters, which remained idle most of the time. Those costs continued to increase as Office 365 usage grew.
Limited scalability. Less than three years after MongoDB was initially deployed, the largest repository was almost at maximum capacity. Any spare capacity was forecast to run out within six months as more products and reports were added, with no easy way to scale.
While the team was dealing with the architectural limitations of its existing solution, they were looking ahead to a lineup of new, high-scale capabilities that they wanted to enable for customers in the usage analytics space. The team started looking for a new, cost-effective, and low-maintenance solution that would let them move from self-maintained VMs running MongoDB to a fully managed database service.
Geo-distribution on Azure Cosmos DB: The key to an improved architecture
After exploring their options, the team decided to replace MongoDB with Azure Cosmos DB, a fully managed, globally distributed, multi-model database service designed for virtually unlimited elastic scalability. The first step was to deploy the needed infrastructure.
In contrast to the primary/backup, two-cluster configuration that it had used with MongoDB, the team took advantage of turnkey global distribution of active data in Azure Cosmos DB. Using multiple Azure regions for data replication provided an easy way to write to any region, read from any region, and better balance the workload across the database instances—all while relying on Azure Cosmos DB to transparently handle active data replication and data consistency.
“True geo-replication had been deemed too hard to do with MongoDB, which is why the previous architecture separately wrote data to both the primary and backup clusters,” says Xiaodong Wang, a Software Engineer on the Microsoft 365 usage analytics team. “With Azure Cosmos DB, implementing transparent geo-distribution literally took minutes—just a few mouse clicks.”
The image below shows the internal architecture of the usage analytics system today. Each of the primary geographies served by Office 365 is served by Cosmos databases geo-replicated across two Azure regions within that geography. Under normal operating conditions, writes are sent to one region within each geography while reads are routed to both. If for some reason a region is prevented from serving reads, those reads are automatically routed to the other region serving that same geography.
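The routing behavior described above can be sketched in a few lines. This is only an illustration of the rule (writes to one region, reads to both, reads rerouted when a region cannot serve them). In the real system Azure Cosmos DB handles this transparently, and the class, its field names, and the region names below are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Geography:
    name: str
    write_region: str
    read_regions: list
    unavailable: set = field(default_factory=set)
    reads_served: int = 0

    def route_write(self) -> str:
        """Under normal conditions, writes go to one region per geography."""
        return self.write_region

    def route_read(self) -> str:
        """Spread reads over the regions that can currently serve them."""
        healthy = [r for r in self.read_regions if r not in self.unavailable]
        if not healthy:
            raise RuntimeError(f"no region can serve reads for {self.name}")
        region = healthy[self.reads_served % len(healthy)]
        self.reads_served += 1
        return region


# Hypothetical geography with two replicated regions.
europe = Geography("Europe", "North Europe", ["North Europe", "West Europe"])

print(europe.route_write())                      # the single write region
print(europe.route_read(), europe.route_read())  # reads alternate

# If one region cannot serve reads, they all go to the other one.
europe.unavailable.add("North Europe")
print(europe.route_read())
```

The sketch deliberately covers only the read failover that the paragraph describes; it says nothing about what happens when the write region is unavailable.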
Migrating a production workload to Azure Cosmos DB
Developers began writing a new data access layer on the new infrastructure to accommodate reads and writes, using the Azure Cosmos DB SQL (Core) API. After bringing the new system online, the team began to write new production data to both old and new systems, while continuing to serve production reports from the old one.
Developers began to address the reports that they would need to duplicate for the new solution, working through them one at a time. Separate Cosmos containers were created within the database for most reports, so that each container would be separately scalable after the system came online. The largest reports were addressed first to ensure that Azure Cosmos DB could handle them, and after each new report was verified, the team began serving it from the new environment.
After all functionality and reports were being served by Azure Cosmos DB, and everything was running as it should, the team stopped writing new data to the old system and decommissioned the MongoDB environment. The development team was able to move to Azure Cosmos DB, rewrite the data access layer, and migrate all reports for all geographies without any service interruptions to end users.
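The migration sequence above (write to both systems, cut reports over one at a time, then retire the old system) is a common dual-write pattern. The following is a rough sketch of that pattern using in-memory dictionaries as stand-ins for MongoDB and Azure Cosmos DB. The class, the method names, and the "verified means both stores agree" check are assumptions made for illustration, not the team's data access layer.

```python
class MigratingStore:
    """Dual-write facade: old and new stores are plain dicts here."""

    def __init__(self):
        self.old = {}  # stand-in for the MongoDB clusters
        self.new = {}  # stand-in for the Cosmos containers
        self.verified_reports = set()
        self.old_retired = False

    def write(self, report, key, value):
        # New production data goes to both systems until the old one is retired.
        if not self.old_retired:
            self.old.setdefault(report, {})[key] = value
        self.new.setdefault(report, {})[key] = value

    def mark_verified(self, report):
        # Assumed verification rule: both systems must hold the same data.
        if self.new.get(report) != self.old.get(report):
            raise ValueError(f"{report}: old and new data differ")
        self.verified_reports.add(report)

    def read(self, report, key):
        # Serve a report from the new system only once it has been verified.
        use_new = self.old_retired or report in self.verified_reports
        source = self.new if use_new else self.old
        return source[report][key]

    def retire_old(self, all_reports):
        # Stop dual writes only when every report is served by the new system.
        missing = set(all_reports) - self.verified_reports
        if missing:
            raise RuntimeError(f"not yet verified: {sorted(missing)}")
        self.old_retired = True
        self.old.clear()


store = MigratingStore()
store.write("active_users", "2019-05-01", 120)  # hypothetical report and value
store.mark_verified("active_users")
store.retire_old(["active_users"])
store.write("active_users", "2019-05-02", 130)
print(store.read("active_users", "2019-05-02"))
```

Because reads keep coming from the old system until a report is verified, end users never see a half-migrated report, which matches the "no service interruptions" outcome described above.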
In part 2 of this series, we'll cover additional implementation details and the outcomes resulting from the Microsoft 365 usage analytics team’s implementation of Azure Cosmos DB.
Here are additional enhancements to the developer experience, announced at Microsoft Build:
Powering Kubernetes with etcd API
Etcd is at the heart of the Kubernetes cluster - it’s where all of the state is! We are happy to announce a preview for wire-protocol compatible etcd API to enable self-managed Kubernetes developers to focus more on their apps, rather than managing etcd clusters. With the wire-protocol compatible Azure Cosmos DB API for etcd, Kubernetes developers will automatically get highly scalable, globally distributed, and highly available Kubernetes clusters. This enables developers to scale Kubernetes coordination and state management data on a fully managed service with 99.999-percent high availability and elastic scalability backed by Azure Cosmos DB SLAs. This helps significantly lower total cost of ownership (TCO) and remove the hassle and complexity of managing etcd clusters.
The multi-model capabilities of Azure Cosmos DB’s database engine are foundational and bring important benefits to our customers, such as leveraging multiple data models in the same apps, streamlining development by focusing on the single service, reducing TCO by not having multiple database engines to manage, and getting the benefits of the comprehensive SLAs offered by Azure Cosmos DB.
Over the past two years, we have been steadily revamping our database engine’s type system and the storage encodings for both Azure Cosmos DB database log and index. The database engine’s type system is fully extensible and is now a complete superset of the native type systems of Apache Cassandra, MongoDB, Apache Gremlin, and SQL. The new encoding scheme for the database log is highly optimized for storage and parsing, and is capable of efficiently translating popular formats like Parquet, protobuf, JSON, BSON, and other encodings. The newly revamped index layout provides:
Significant performance boost to query execution cost, especially for the aggregate queries
New SQL query capabilities:
Support for OFFSET/LIMIT and DISTINCT keywords
Composite indexes for multi-column sorting
Correlated subqueries including EXISTS and ARRAY expressions
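For readers who want to see what these capabilities look like, here are a few query strings and an indexing policy, held in Python constants so they can be passed to whichever SDK you use. The property names (`country`, `signupDate`, `tags`) are hypothetical, and the snippet is a sketch of the syntax rather than a tested workload.

```python
# DISTINCT: return each country value once.
DISTINCT_QUERY = "SELECT DISTINCT VALUE c.country FROM c"

# OFFSET/LIMIT: skip the first 20 results and return the next 10.
PAGED_QUERY = (
    "SELECT c.id, c.country FROM c "
    "ORDER BY c.country OFFSET 20 LIMIT 10"
)

# EXISTS with a correlated subquery over an array property.
EXISTS_QUERY = (
    "SELECT c.id FROM c "
    "WHERE EXISTS (SELECT VALUE t FROM t IN c.tags WHERE t = 'pilot')"
)

# ARRAY expression: project a subquery result as an array.
ARRAY_QUERY = (
    "SELECT c.id, "
    "ARRAY(SELECT VALUE t FROM t IN c.tags WHERE t != 'internal') AS publicTags "
    "FROM c"
)

# Multi-column sorting, which needs a matching composite index.
MULTI_SORT_QUERY = "SELECT * FROM c ORDER BY c.country ASC, c.signupDate DESC"

# Indexing policy with a composite index whose paths and orders
# line up with the ORDER BY clause in MULTI_SORT_QUERY.
INDEXING_POLICY = {
    "indexingMode": "consistent",
    "includedPaths": [{"path": "/*"}],
    "excludedPaths": [],
    "compositeIndexes": [
        [
            {"path": "/country", "order": "ascending"},
            {"path": "/signupDate", "order": "descending"},
        ]
    ],
}
```

The composite index lists its paths in the same sequence as the ORDER BY clause; a multi-column sort without a matching composite index is rejected by the service.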
The type system and storage encodings have provided benefits to a plethora of Gremlin, MongoDB, and Cassandra (CQL) features. We are now near full compatibility with Cassandra CQL v4, and are bringing native change feed capabilities as an extension command in CQL. Customers can build efficient, event sourcing patterns on top of Cassandra tables in Azure Cosmos DB. We are also announcing several Gremlin API enhancements, including the support of Execution Profile function for performance evaluation and String comparison functions aligned with the Apache TinkerPop specification.
Added support for Azure Cosmos DB direct HTTPS and TCP transport protocols, increasing performance and availability
All new query improvements of V3 SDKs
Java V3 SDK is fully open-sourced, and we welcome your contributions. We will make Java V3 SDK generally available shortly.
Change feed processor for Java
One of the most popular features in Azure Cosmos DB, change feed allows customers to programmatically observe changes to their data in Cosmos containers. It is used in many application patterns, including reactive programming, analytics, event store, and serverless. We’re excited to announce change feed processor library for Java, allowing you to build distributed microservices architectures on top of change feed, and dynamically scale them using one of the most popular programming languages.
General availability of the cross-platform Table .NET Standard SDK
The 1.0.1 GA version of the cross-platform Table .NET Standard SDK has just come out. It is a single unified cross-platform SDK for both Azure Cosmos DB Table API and Azure Storage Table Service. Our customers can now operate against the Table service, either as a Cosmos Table, or Azure Storage Table using .NET Framework app on Windows, or .NET Core app on multiple platforms. We’ve improved the development experience by removing unnecessary binary dependencies while retaining the improvements when invoking Table API via the REST protocols, such as using modern HttpClient, DelegatingHandler based extensibility, and modern asynchronous patterns. It can also be used by the cross-platform Azure PowerShell to continue to power the Table API cmdlets.
More cosmic developer goodness
ARM support for databases, containers, and other resources in Azure Resource Manager
Azure Cosmos DB now provides support for Databases, Containers and Offers in Azure Resource Manager. Users can now provision databases and containers, and set throughput using Azure Resource Manager templates or PowerShell. This support is available across all APIs including SQL (Core), MongoDB, Cassandra, Gremlin, and Table. This capability also allows customers to create custom RBAC roles to create, delete, or modify the settings on databases and containers in Azure Cosmos DB. To learn more and to get started, see Azure Cosmos DB Azure Resource Manager templates.
Azure Cosmos DB custom roles and policies
Azure Cosmos DB provides support for custom roles and policies. Today, we announce the general availability of an Azure Cosmos DB Operator role. This role provides the ability to manage Azure Resource Manager resources for Azure Cosmos DB without providing data access. This role is intended for scenarios where customers need the ability to grant access to Azure Active Directory Service Principals to manage deployment operations for Azure Cosmos DB, including the account, databases, and containers. To learn more, visit our documentation on Azure Cosmos DB custom roles and policies support.
Upgrade single-region writes Cosmos accounts to multi-region writes
One of the most frequent customer asks has been the ability to upgrade existing Cosmos accounts configured with a single writable region (single-master) to multiple writable regions (multi-master). We are happy to announce that starting today, you will be able to make your existing accounts writable from all regions. You can do so using the Azure portal or Azure CLI. The upgrade is completely seamless and is performed without any downtime. To learn more about how to perform this upgrade, visit our documentation.
Automatic upgrade of fixed containers to unlimited containers
All existing fixed Azure Cosmos containers (collections, tables, graphs) in the Azure Cosmos DB service are now automatically upgraded to enjoy unlimited scale and storage. Please refer to this documentation for an in-depth overview of how to scale your existing fixed containers to unlimited containers.
Azure Cosmos Explorer now with Azure AD support
Enjoy a flexible Cosmos Explorer experience to work with data within the Azure portal, as part of the Azure Cosmos DB emulator and Azure Storage Explorer. We’ve also made it available “full-screen”, for when developers do not have access to the Azure portal or need a full screen experience. Today, we are adding support for Azure Active Directory to https://cosmos.azure.com, so that developers can authenticate directly with their Azure credentials, and take advantage of the full screen experience.
Azure portal and tools enhancements
To help customers correctly provision capacity for apps and optimize costs on Azure Cosmos DB, we have added built-in cost recommendations to the Azure portal and Azure Advisor, along with updates to the Azure pricing calculator.
We look forward to seeing what you will build with Azure Cosmos DB!
Network security solutions can be delivered as appliances on premises, as network virtual appliances (NVAs) that run in the cloud or as a cloud native offering (known as firewall-as-a-service).
Customers often ask us how Azure Firewall is different from Network Virtual Appliances, whether it can coexist with these solutions, where it excels, what’s missing, and the TCO benefits expected. We answer these questions in this blog post.
Network virtual appliances (NVAs)
Third-party networking offerings play a critical role in Azure, allowing you to use brands and solutions you already know, trust, and have the skills to manage. Most third-party networking offerings are delivered as NVAs today and provide a diverse set of capabilities such as firewalls, WAN optimizers, application delivery controllers, routers, load balancers, proxies, and more. These third-party capabilities enable many hybrid solutions and are generally available through the Azure Marketplace. For best practices to consider before deploying an NVA, see Best practices to consider before deploying a network virtual appliance.
Cloud native network security
A cloud native network security service (known as firewall-as-a-service) is highly available by design. It auto scales with usage, and you pay as you use it. Support is included at some level, and it has a published and committed SLA. It fits into the DevOps model for deployment and uses cloud native monitoring tools.
What is Azure Firewall?
Azure Firewall is a cloud native network security service. It offers fully stateful network and application level traffic filtering for VNet resources, with built-in high availability and cloud scalability delivered as a service. You can protect your VNets by filtering outbound, inbound, spoke-to-spoke, VPN, and ExpressRoute traffic. Connectivity policy enforcement is supported across multiple VNets and Azure subscriptions. You can use Azure Monitor to centrally log all events. You can archive the logs to a storage account, stream events to your Event Hub, or send them to Log Analytics or the security information and event management (SIEM) product of your choice.
Is Azure Firewall a good fit for your organization’s security architecture?
Organizations have diverse security needs. In certain cases, even the same organization may have different security requirements for different environments. As mentioned above, third-party offerings play a critical role in Azure. Today, most next-generation firewalls are offered as Network Virtual Appliances (NVAs), and they provide a richer next-generation firewall feature set, which is a must-have for specific environments and organizations. In the future, we intend to enable chaining scenarios to allow you to use Azure Firewall for specific traffic types, with an option to send all or some traffic to a third-party offering for further inspection. This third-party offering can be either an NVA or a cloud native solution.
Many Azure customers find the Azure Firewall feature set is a good fit and it provides some key advantages as a cloud native managed service:
DevOps integration – easily deployed using Azure Portal, Templates, PowerShell, CLI, or REST.
Built-in HA with cloud scale.
Zero maintenance service model - no updates or upgrades.
Azure Firewall pricing includes a fixed hourly cost ($1.25/firewall/hour) and a variable per GB processed cost to support auto scaling. Based on our observation, most customers save 30 percent – 50 percent in comparison to an NVA deployment model. We are announcing a price reduction, effective May 1, 2019, for the firewall per GB cost to $0.016/GB (-46.6 percent) to ensure that high throughput customers maintain cost effectiveness. There is no change to the fixed hourly cost. For the most up-to-date pricing information, please go to the Azure Firewall pricing page.
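As a quick worked example of the pricing model above, the sketch below estimates a monthly bill from the two published rates. The 730-hour month and the traffic volume are assumptions for illustration; check the Azure Firewall pricing page for current figures.

```python
FIXED_PER_FIREWALL_HOUR = 1.25   # USD, from the text above
PER_GB_PROCESSED = 0.016         # USD, rate effective May 1, 2019
HOURS_PER_MONTH = 730            # assumption: average month length


def monthly_firewall_cost(gb_processed, firewalls=1, hours=HOURS_PER_MONTH):
    """Fixed hourly charge per firewall plus the per-GB processing charge."""
    fixed = firewalls * hours * FIXED_PER_FIREWALL_HOUR
    variable = gb_processed * PER_GB_PROCESSED
    return round(fixed + variable, 2)


# Hypothetical workload: one firewall processing 10 TB (10,000 GB) a month.
# Fixed part: 730 * 1.25 = 912.50; variable part: 10,000 * 0.016 = 160.00.
print(monthly_firewall_cost(10_000))
```

At this volume the fixed hourly charge dominates, so the per-GB reduction matters most for high-throughput deployments, which is the group the price change is aimed at.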
The following table provides a conceptual TCO view for an NVA with full HA (active/active) deployment:
(30%-50% cost saving)
Two plus VMs to meet peak requirements
Per NVA vendor billing model
Standard Public Load Balancer
First five rules: $0.025/hour
Additional rules: $0.01/rule/hour
$0.005 per GB processed
Standard Internal Load Balancer
First five rules: $0.025/hour
Additional rules: $0.01/rule/hour
$0.005 per GB processed
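The load balancer line items above can be turned into a rough monthly estimate. This sketch reads "first five rules: $0.025/hour" as a flat charge covering up to five rules, which is an interpretation rather than something the table states, and it assumes a 730-hour month. VM and NVA licensing costs are left out because the table gives no figures for them.

```python
FIRST_FIVE_RULES_PER_HOUR = 0.025   # USD, assumed flat for up to five rules
ADDITIONAL_RULE_PER_HOUR = 0.01     # USD per rule beyond the first five
LB_PER_GB_PROCESSED = 0.005         # USD
HOURS_PER_MONTH = 730               # assumption: average month length


def monthly_load_balancer_cost(rules, gb_processed, hours=HOURS_PER_MONTH):
    """Estimate one Standard Load Balancer's monthly cost from the rates above."""
    base = FIRST_FIVE_RULES_PER_HOUR if rules > 0 else 0.0
    extra = max(0, rules - 5) * ADDITIONAL_RULE_PER_HOUR
    return round(hours * (base + extra) + gb_processed * LB_PER_GB_PROCESSED, 2)


# Hypothetical NVA deployment: a public and an internal load balancer,
# each with 7 rules and 10,000 GB processed per month.
public_lb = monthly_load_balancer_cost(rules=7, gb_processed=10_000)
internal_lb = monthly_load_balancer_cost(rules=7, gb_processed=10_000)
print(public_lb + internal_lb)
```

These load balancer charges come on top of the NVA's own compute and licensing, which is why they appear as separate rows in the comparison.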
Customers across industries including healthcare, legal, media, and manufacturing are looking for new solutions to solve business challenges with AI, including knowledge mining with Azure Search.
Azure Search enables developers to quickly apply AI across their content to unlock untapped information. Custom or prebuilt cognitive skills like facial recognition, key phrase extraction, and sentiment analysis can be applied to content using the cognitive search capability to extract knowledge that’s then organized within a search index. Let’s take a closer look at how one company, Howden, applies the cognitive search capability to reduce time and risk to their business.
Howden, a global engineering company, focuses on providing quality solutions for air and gas handling. With over a century of engineering experience, Howden creates industrial products that help multiple sectors improve their everyday processes, from mine ventilation and waste water treatment to heating and cooling.
Too many details, not enough time
Every new project requires the creation of a bid proposal. A typical customer bid can span thousands of pages in differing formats such as Word and PDF. The team has to scour through detailed customer requirements to identify key areas of design and specialized components in order to produce accurate bids. If they miss key or critical details, they can bid too low and lose money, or bid too high and lose the customer opportunity. The manual process is time consuming, labor intensive, and creates multiple opportunities for human error. To learn more about knowledge mining with Azure Search and see how Howden built their solution, check out the Microsoft Mechanics show linked below.