Cutting costs while using Microsoft Azure

In March 2020, I started using Cosmos DB to store COVID-19 data from the Johns Hopkins University CSSE dataset. Initially, I created five containers in a database and treated them as though they were SQL tables. I built the website backend using .NET Core 3 along with the Azure Cosmos DB SQL API. For the front end, I used Angular 8 with PrimeNG components, Tweetinvi for pulling data from Twitter, and NewsAPI for pulling news articles.

At the time I created the Cosmos DB containers, I hadn't fully researched, and therefore didn't understand, exactly how Azure billing works for Cosmos DB. For the first two months, the vast majority of my $150 Azure credit was going toward Cosmos DB charges. At first I thought this was due to the size of the database, since I was storing every daily report going back to late January. Since I wasn't using data beyond the current day, I deleted everything but the most recent day to reduce the container size. I expected the charges to decrease, but they didn't: Cosmos DB bills primarily for the throughput (RU/s) provisioned on each container, not for the amount of data stored, so shrinking the containers barely moved the bill. In mid-April, the dataset changed and one of the containers I had created was no longer necessary. I deleted it and noticed the estimated costs decreased about 20%, from approximately $130 to just under $100.
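The numbers line up with a back-of-the-envelope calculation. Assuming Cosmos DB's list price at the time of roughly $0.008 per 100 RU/s per hour, and each container provisioned at the 400 RU/s minimum, the throughput charge scales directly with the container count:

```typescript
// Rough Cosmos DB throughput cost estimate. The $0.008 per 100 RU/s
// per hour rate and the 730-hour billing month are assumptions based
// on the provisioned-throughput pricing of the period, not exact
// figures from my invoice.
const pricePer100RuHour = 0.008;
const hoursPerMonth = 730;

function monthlyCost(containers: number, ruPerContainer = 400): number {
  return (containers * ruPerContainer / 100) * pricePer100RuHour * hoursPerMonth;
}

monthlyCost(5); // five containers: ≈ $116.80/month
monthlyCost(1); // one container:  ≈ $23.36/month
```

Storage is billed separately but is comparatively cheap, which is why deleting one of five containers cut the estimate by about a fifth while deleting data did almost nothing.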

One of the main advantages of a NoSQL database is that many different types of documents can be stored in the same container, something that can't be done in a SQL table. I knew this, but speed seemed more important, which is why I gave each document type its own container. What I didn't know was how much more expensive that would turn out to be. Lesson learned: I changed my data-loading application to load all data into a single container, with each document identified by a record type. This modification dropped my estimated costs from just under $100 to $23, a savings of approximately $100 from my original setup.
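To sketch what that single-container layout looks like, each document carries a discriminator field and queries filter on it. The interfaces and record-type names below are illustrative, not the site's actual schema:

```typescript
// Hypothetical document shapes sharing one Cosmos DB container,
// distinguished by a "recordType" discriminator field.
interface CosmosDocument {
  id: string;
  recordType: "dailyReport" | "newsArticle";
}

interface DailyReport extends CosmosDocument {
  recordType: "dailyReport";
  country: string;
  confirmed: number;
  deaths: number;
}

interface NewsArticle extends CosmosDocument {
  recordType: "newsArticle";
  title: string;
  url: string;
}

// In Cosmos SQL this becomes a WHERE clause, e.g.:
//   SELECT * FROM c WHERE c.recordType = "dailyReport"
function filterByType(
  docs: CosmosDocument[],
  recordType: string
): CosmosDocument[] {
  return docs.filter((d) => d.recordType === recordType);
}
```

One container means one provisioned-throughput charge instead of five, at the cost of slightly more selective queries.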

While refactoring the controller code to account for the new way data was retrieved, I implemented in-memory caching using IMemoryCache (from the Microsoft.Extensions.Caching.Memory namespace). This minimizes the request units (RUs) consumed, since I had constrained myself to 400 RU/s. (See this article from Microsoft about RUs.) Because the data is updated only once daily, caching it minimizes calls to the SQL API, which both keeps throughput under the cap and speeds up page loads.

When moving to Azure, or any other cloud provider, make sure you understand how your architectural design will affect costs so that you don't end up with a surprise on your next invoice.
