Well-managed applications begin with well-architected deployments. In this blog, Ganapathy Pullera, Sr. Cloud Engineer at MontyCloud, shares his first-hand experience of how a luxury jewelry retailer brought their deployment times to under 1 hour and now runs their Amazon ECS application with just 2 cloud engineers. The MontyCloud DAY2™ Blueprints Library is built on the blueprints that Gana built for this retailer, and much more. Give it a try and please leave your feedback.
– Sabrinath S. Rao
Recently, I led the MontyCloud team that worked with a luxury jewelry retailer to streamline their Enterprise Data Hub (EDH) in Amazon Web Services. I want to share the case because there’s a lot to learn for the retail sector and really for cloud engineering teams in any industry. The customer’s policies do not allow me to disclose their name.
The customer uses EDH to centralize data from the point-of-sale systems at their flagship stores worldwide and to analyze that data to generate actionable business insights. The reports they generate support critical business decisions including inventory, pricing, and promotions. EDH runs natively on AWS, bringing together over 10 AWS services.
The customer’s IT department initially built EDH by manually deploying and integrating each AWS service. As EDH grew organically, it became increasingly difficult to maintain. Even the most common changes, such as scaling the EMR clusters, were a multi-team, multi-week effort and were error-prone.
In this blog, I will outline how we enabled this customer to reduce their change request response from 40 hours to less than one hour for routine updates and from 4 months to less than 2 weeks for more complex changes with Infrastructure as Code Templates using AWS CloudFormation (CFN) and no-code CloudOps.
The manually deployed environment was error-prone and expensive to maintain
The customer manually built an EDH platform deploying and integrating several AWS services including:
As their business needs changed, they had to make changes. However, any change required code changes and integration testing across multiple AWS services. Every change and every report request was a ticket submitted to a 15-person team. This team had to implement the changes manually across multiple AWS services, coordinate the changes with the individual service teams, and run integration testing. Smaller updates and reports took as much as 40 hours, while more complex changes such as adding new parameters or scaling the EMR cluster took as much as four months.
Even as the customer moved to AWS for its rich set of services and elasticity, they were unable to drive an agile system, and their costs were rising exponentially because they needed to grow the EDH team to keep it running. EDH operations were error-prone, inefficient and expensive.
Well-Architected Modular Design with Infrastructure as Code
My team worked with AWS and the customer to convert every part of EDH into Infrastructure as Code using AWS CloudFormation (CFN). The following diagram details the interoperability of the system that we automated into a one-click deployment.
First step: Consistent and compliant deployments
First, we developed naming conventions and tagging to appropriately identify the resources. Next, we developed CFN templates using AWS Well-Architected standards to independently deploy each resource. Finally, we used pseudo parameters and AWS-specific parameter types such as AWS::EC2::VPC::Id, AWS::EC2::KeyPair::KeyName, and list types such as List<AWS::EC2::SecurityGroup::Id>.
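To make the parameter approach concrete, here is a minimal sketch, not the customer's actual template, that builds a CloudFormation template as a Python dictionary. The naming convention (project-environment-component), the tag keys, and the security group resource are assumptions chosen for illustration; the parameter types and the pseudo parameters AWS::StackName and AWS::Region are standard CloudFormation features.

```python
import json


def resource_name(project: str, environment: str, component: str) -> str:
    """Hypothetical naming convention: <project>-<environment>-<component>."""
    return f"{project}-{environment}-{component}".lower()


def build_template(project: str, environment: str) -> dict:
    """Return a CloudFormation template as a JSON-serializable dict."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Illustrative security group deployed from typed parameters",
        "Parameters": {
            # AWS-specific parameter types are validated by CloudFormation
            # against resources that exist in the target account and region.
            "VpcId": {"Type": "AWS::EC2::VPC::Id"},
            "KeyName": {"Type": "AWS::EC2::KeyPair::KeyName"},
            "SecurityGroupIds": {"Type": "List<AWS::EC2::SecurityGroup::Id>"},
        },
        "Resources": {
            "AppSecurityGroup": {
                "Type": "AWS::EC2::SecurityGroup",
                "Properties": {
                    # Pseudo parameters are resolved by CloudFormation at deploy time.
                    "GroupDescription": {
                        "Fn::Sub": "Managed by ${AWS::StackName} in ${AWS::Region}"
                    },
                    "VpcId": {"Ref": "VpcId"},
                    # Assumed tag keys; the real tagging scheme is not in the post.
                    "Tags": [
                        {"Key": "Name", "Value": resource_name(project, environment, "app-sg")},
                        {"Key": "Project", "Value": project},
                        {"Key": "Environment", "Value": environment},
                    ],
                },
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_template("EDH", "prod"), indent=2))
```

KeyName and SecurityGroupIds are declared but unused here; in a fuller template they would feed compute resources such as EC2 instances.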
With this approach, the customer has the flexibility to let individual AWS service operators make changes within well-defined boundaries set by the EDH architects. Most of the changes involved updating the parameters.
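One way to picture "well-defined boundaries" is the constraints a CloudFormation parameter can declare: AllowedValues, MinValue, MaxValue, and Default. The sketch below is a hypothetical pre-flight check that mirrors those constraints locally; the parameter names and limits are invented for illustration, and CloudFormation enforces the same constraints itself at deploy time.

```python
# Hypothetical parameter declarations an architect might set for an EMR template.
EMR_PARAMETERS = {
    "CoreInstanceCount": {"Type": "Number", "MinValue": 2, "MaxValue": 20, "Default": 4},
    "CoreInstanceType": {
        "Type": "String",
        "AllowedValues": ["m5.xlarge", "m5.2xlarge"],
        "Default": "m5.xlarge",
    },
}


def resolve_parameters(declared: dict, requested: dict) -> dict:
    """Merge requested values with defaults and reject out-of-bounds requests."""
    unknown = sorted(set(requested) - set(declared))
    if unknown:
        raise ValueError(f"unknown parameters: {unknown}")

    resolved = {}
    for name, spec in declared.items():
        if name in requested:
            value = requested[name]
        elif "Default" in spec:
            value = spec["Default"]
        else:
            raise ValueError(f"missing required parameter: {name}")

        allowed = spec.get("AllowedValues")
        if allowed is not None and value not in allowed:
            raise ValueError(f"{name}={value!r} is not one of {allowed}")

        if spec.get("Type") == "Number":
            number = float(value)
            if "MinValue" in spec and number < spec["MinValue"]:
                raise ValueError(f"{name}={value} is below {spec['MinValue']}")
            if "MaxValue" in spec and number > spec["MaxValue"]:
                raise ValueError(f"{name}={value} is above {spec['MaxValue']}")

        # CloudFormation parameter values are passed as strings.
        resolved[name] = str(value)
    return resolved
```

With these declarations, an operator can scale the cluster by requesting CoreInstanceCount of 8, while a request for 50 nodes or an unlisted instance type is rejected before any stack update starts.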
Consistency matters for even the simplest of Networks
We developed a separate CloudFormation template to set up the entire networking stack including peering across different accounts where the shared services are running. We also set up VPC connectivity with an on-premises environment using Customer Gateway ID.
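As a rough sketch of what such a networking template could contain, the Python below builds a template with a VPC, a cross-account peering connection, and a site-to-site VPN that references an existing customer gateway by ID. The logical names, the default CIDR range, and the choice of static routing are assumptions; the actual template layout is not described in this post.

```python
def build_network_template() -> dict:
    """Return an illustrative networking template as a JSON-serializable dict."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Parameters": {
            "VpcCidr": {"Type": "String", "Default": "10.0.0.0/16"},  # assumed range
            "PeerVpcId": {"Type": "String"},
            "PeerOwnerId": {"Type": "String"},   # account that runs shared services
            "PeerRoleArn": {"Type": "String"},   # role allowed to accept the peering
            "CustomerGatewayId": {"Type": "String"},  # existing on-premises gateway
        },
        "Resources": {
            "Vpc": {
                "Type": "AWS::EC2::VPC",
                "Properties": {
                    "CidrBlock": {"Ref": "VpcCidr"},
                    "EnableDnsSupport": True,
                    "EnableDnsHostnames": True,
                },
            },
            "SharedServicesPeering": {
                "Type": "AWS::EC2::VPCPeeringConnection",
                "Properties": {
                    "VpcId": {"Ref": "Vpc"},
                    "PeerVpcId": {"Ref": "PeerVpcId"},
                    "PeerOwnerId": {"Ref": "PeerOwnerId"},
                    "PeerRoleArn": {"Ref": "PeerRoleArn"},
                },
            },
            "VpnGateway": {
                "Type": "AWS::EC2::VPNGateway",
                "Properties": {"Type": "ipsec.1"},
            },
            "VpnGatewayAttachment": {
                "Type": "AWS::EC2::VPCGatewayAttachment",
                "Properties": {
                    "VpcId": {"Ref": "Vpc"},
                    "VpnGatewayId": {"Ref": "VpnGateway"},
                },
            },
            "OnPremVpn": {
                "Type": "AWS::EC2::VPNConnection",
                # The gateway must be attached before the connection is created.
                "DependsOn": "VpnGatewayAttachment",
                "Properties": {
                    "Type": "ipsec.1",
                    "CustomerGatewayId": {"Ref": "CustomerGatewayId"},
                    "VpnGatewayId": {"Ref": "VpnGateway"},
                    "StaticRoutesOnly": True,  # assumption; BGP is also possible
                },
            },
        },
    }
```

Subnets, route tables, and the routes that send traffic over the peering and VPN connections are omitted to keep the sketch short.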
Any system with multiple builders and users needs robust security
This entire solution is supported by IAM for restricting access and by RBAC through a user interface, as opposed to editing JSON files. We used AWS Key Management Service (KMS) for encryption at rest for data in EBS volumes, RDS instances, EMR clusters, and S3 buckets. Finally, we implemented AWS Config to keep track of changes and to send notifications.
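For a sense of what KMS encryption at rest looks like inside a template, here is a minimal sketch covering an S3 bucket and an EBS volume, plus a small hypothetical helper that flags volumes left unencrypted. The KmsKeyArn parameter, the logical names, and the volume size are assumptions; RDS and EMR are omitted for brevity.

```python
def encrypted_storage_template() -> dict:
    """Return an illustrative template whose storage resources use a KMS key."""
    key_ref = {"Ref": "KmsKeyArn"}
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Parameters": {
            "KmsKeyArn": {"Type": "String"},  # assumed: key is created elsewhere
            "AvailabilityZone": {"Type": "AWS::EC2::AvailabilityZone::Name"},
        },
        "Resources": {
            "DataBucket": {
                "Type": "AWS::S3::Bucket",
                "Properties": {
                    "BucketEncryption": {
                        "ServerSideEncryptionConfiguration": [
                            {
                                "ServerSideEncryptionByDefault": {
                                    "SSEAlgorithm": "aws:kms",
                                    "KMSMasterKeyID": key_ref,
                                }
                            }
                        ]
                    }
                },
            },
            "DataVolume": {
                "Type": "AWS::EC2::Volume",
                "Properties": {
                    "AvailabilityZone": {"Ref": "AvailabilityZone"},
                    "Size": 100,  # GiB, illustrative
                    "Encrypted": True,
                    "KmsKeyId": key_ref,
                },
            },
        },
    }


def unencrypted_volumes(template: dict) -> list:
    """List logical IDs of EBS volumes that do not set Encrypted to True."""
    return [
        logical_id
        for logical_id, resource in template.get("Resources", {}).items()
        if resource.get("Type") == "AWS::EC2::Volume"
        and resource.get("Properties", {}).get("Encrypted") is not True
    ]
```

A check like unencrypted_volumes could run in a build job before deployment; it complements, rather than replaces, the change tracking that AWS Config provides.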
Maturing to a Well-Managed Application
Once we had automated deployment, we had to tackle ongoing management and maintenance.
As the resources are deployed, the CFN templates also set up AWS CloudTrail monitoring for them.
On-Going No-Code Maintenance
Since we built the deployment templates with input parameters, the required changes are minimal. Most requests are handled by the requester passing the required parameters into the template. If any changes are required to the underlying resources, the changes are made to the CFN template. This microservice-like, no-code approach significantly reduces errors.
This solution is now part of a CI/CD process. All the CloudFormation templates are stored in a version-controlled repository in Azure DevOps. Any change triggers a build job that deploys or updates the solution stack.
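The post does not say which tool the build job uses to deploy. As one possibility, the sketch below assembles an `aws cloudformation deploy` command, which creates the stack if it is missing and updates it otherwise. The stack name, template path, and parameter names are hypothetical.

```python
import shlex


def deploy_command(stack_name: str, template_file: str, parameters: dict) -> list:
    """Build the AWS CLI argument list for a create-or-update stack deployment."""
    command = [
        "aws", "cloudformation", "deploy",
        "--stack-name", stack_name,
        "--template-file", template_file,
        # Do not fail the build when the template and parameters are unchanged.
        "--no-fail-on-empty-changeset",
    ]
    if parameters:
        command.append("--parameter-overrides")
        # Sorted for a stable, reviewable command line.
        command.extend(f"{key}={value}" for key, value in sorted(parameters.items()))
    return command


if __name__ == "__main__":
    cmd = deploy_command(
        "edh-prod-emr",                 # hypothetical stack name
        "templates/emr.json",           # hypothetical path in the repository
        {"CoreInstanceCount": 8, "CoreInstanceType": "m5.xlarge"},
    )
    print(shlex.join(cmd))
```

A pipeline step could pass the returned list to subprocess.run with check=True so that a failed deployment fails the build.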
As a Well-Managed Application, EDH is now more efficient and costs significantly less to run
By implementing the above solution the customer’s EDH platform is now more agile, robust, and cost-effective.
- The deployment is now consistent across releases through approved well-architected blueprints.
- Maintenance and updates are now faster and less error-prone because of modular architecture, with individual deployment templates for each resource.
- The customer now does one-click deployments through a well-understood CI/CD pipeline.
With this solution, change requests are now handled in less than an hour, as opposed to over 40 hours spread over two weeks or more in the past. Since each resource is independent, the templates can be managed independently by the resource teams and tested frequently. The net result is that EDH is now managed by a team of 2-3, as opposed to a team of 15 and growing.
Bringing the same No-Code Deployment and CloudOps excellence to you
We worked with the customer for 3 months to develop these best practices and make EDH a well-managed application. We have packaged these blueprints and many others into individual deployment applications, developed to well-architected standards, and made them available as part of the MontyCloud DAY2™ Blueprints Library. The image below shows one of the extensible blueprints, which helps you deploy a highly available, fault-tolerant Elasticsearch cluster in just a few clicks and immediately start monitoring it.
We replaced NAT Gateways with VPC endpoints, which use AWS PrivateLink, and their networking costs are now 75% lower. The environment is also more secure because all communication is over a private, encrypted connection. Users get better performance because there is no internet gateway, NAT device, VPN connection, or AWS Direct Connect connection in the path. Furthermore, because we use Private DNS, traffic is automatically routed through the VPC endpoints without any application changes.
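To illustrate the endpoint setup, the following sketch generates interface VPC endpoint resources with Private DNS enabled, one per AWS service. The list of services and the parameter names are assumptions; the post does not say which services the customer reached through endpoints.

```python
import re


def interface_endpoint(service: str) -> dict:
    """Return a CloudFormation resource for one interface (PrivateLink) endpoint."""
    return {
        "Type": "AWS::EC2::VPCEndpoint",
        "Properties": {
            "VpcEndpointType": "Interface",
            # Resolves to, for example, com.amazonaws.us-east-1.sqs at deploy time.
            "ServiceName": {"Fn::Sub": f"com.amazonaws.${{AWS::Region}}.{service}"},
            "VpcId": {"Ref": "VpcId"},
            "SubnetIds": {"Ref": "SubnetIds"},
            "SecurityGroupIds": {"Ref": "SecurityGroupIds"},
            # With Private DNS, the service's default hostname resolves to the
            # endpoint, so applications need no configuration changes.
            "PrivateDnsEnabled": True,
        },
    }


def endpoints_template(services: list) -> dict:
    """Build a template with one interface endpoint per service name."""
    resources = {}
    for service in services:
        # Logical IDs must be alphanumeric, e.g. "ecr.api" -> "EcrApiEndpoint".
        parts = re.split(r"[^A-Za-z0-9]+", service)
        logical_id = "".join(part.capitalize() for part in parts) + "Endpoint"
        resources[logical_id] = interface_endpoint(service)
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Parameters": {
            "VpcId": {"Type": "AWS::EC2::VPC::Id"},
            "SubnetIds": {"Type": "List<AWS::EC2::Subnet::Id>"},
            "SecurityGroupIds": {"Type": "List<AWS::EC2::SecurityGroup::Id>"},
        },
        "Resources": resources,
    }
```

Calling endpoints_template(["sqs", "ecr.api"]) yields resources named SqsEndpoint and EcrApiEndpoint. Services such as S3 can instead use gateway endpoints, which are attached to route tables rather than subnets.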
MontyCloud helped reduce the AWS bills by $300,000 annually
The online pharmacy reduced their annual AWS bill by $300,000 through cost consolidation and better insights into their applications. In the process, they enabled compliant deployments without compromising agility, through well-architected infrastructure as code blueprints. They now also run more efficient and secure operations. The online pharmacy estimates their aggregate savings in people, process, and time to exceed $500,000. They anticipate using these savings to further automate their environment through no-code DevOps.
Bringing the same No-Code Deployment and CloudOps excellence to you
This online pharmacy is one of our earliest customers. The partnership with them helps inform the capabilities we codify and bring to you in MontyCloud DAY2™. For example, the DAY2™ VPC Blueprint showcased below was influenced by this customer. Now, with a few simple clicks and a few parameters, you can deploy your own VPC endpoints instead of spending weeks, if not months, on coding and testing.
In addition to what we did with this customer, MontyCloud DAY2™ automatically discovers all deployed resources, helps you tag them and organize them by application, automatically monitors them, and helps you remediate issues in the application context. You can get started with just a few clicks.