re:Invent 2024

Modernizing Online Educational Platforms on AWS: Enabling Reliable Student Experiences

Infrastructure & DevOps Modernization
Application Modernization

Learn how we helped an education technology company with a seamless transition to AWS, delivering high availability, disaster recovery, cost savings, compliance, and improved visibility for the customer's network infrastructure.

As educational institutions increasingly shift online, it has become crucial for them to adopt scalable, secure, and cost-effective technology solutions. By moving to AWS and adopting modern, automated architectures, education providers can, with the help of Caylent, significantly improve their infrastructure’s scalability, performance, and cost-efficiency. This transformation not only supports the growing demand for digital education but also helps protect sensitive student data and keep it compliant, ultimately empowering the education sector to innovate and adapt to evolving educational trends.

For education technology companies, adopting managed AWS database services, right-sizing instances, and refactoring infrastructure is therefore critical to staying competitive and offering new features. Let’s take a look at a case study to see how one customer is taking advantage of AWS for its online educational platform.

Global Online Education Platform Transforms Student Experiences by Migrating to AWS

The customer is a leading education technology company that provides online learning programs, assessments, and educational resources designed to support K-12 students, educators, and administrators. Their solutions focus on personalized learning, curriculum development, and academic support, helping schools and districts achieve student success in both traditional and virtual learning environments.

Caylent helped the customer with a seamless transition to AWS, delivering high availability, disaster recovery, cost savings, compliance, and improved visibility for the customer's network infrastructure. 

The customer was looking for help migrating from on-premises infrastructure to the cloud to enhance resiliency. They faced several problems across networking, database availability, single-AZ dependency, DDoS attacks, and low observability, which increased troubleshooting time. These problems affected the student experience, which is critical for an online educational platform.

On the network side, they wanted to move from F5 to AWS-native networking services, using an automated solution that provided secure, reliable, and optimized load balancing.

On the database side, the customer was looking for a solution that would provide high availability and disaster recovery while also meeting compliance and regulatory requirements. Part of the problem involved the customer’s use of Microsoft SQL Server. In addition to carrying excessive licensing costs, their SQL Server deployment lacked many of the reliability features available in Amazon RDS, including better scalability, automated maintenance, and better DR support.

The customer used a separate database per tenant, so each database server instance held hundreds of databases, which led to performance and reliability issues. Additionally, they have a Databricks Enterprise Data Lake that requires a database backup and restore to ingest data, a process that can result in up to two days of downtime.

So the database side had three problems to solve: the lack of automatic database scaling, slow backups, and connection exhaustion.

In addition, the customer’s application was a single monolithic .NET Framework application that was fragile and difficult to maintain or update. Upgrades were also difficult to deploy because every release was an all-or-nothing situation.

The single point of failure (AZ) and the vulnerability to DDoS attacks were both symptoms of running the application on a single machine on premises.

Caylent’s Solution

For the network issues, Caylent built the VPC, created public and private load balancers, wrote custom rules that distribute load between target groups according to the customer’s needs, and implemented Lambda functions to automate complex tasks. Caylent also helped the customer achieve cost savings by right-sizing instances and reducing the licensing costs associated with F5. This matters to an edtech company because efficient and scalable cloud solutions are essential for managing high-traffic educational platforms, ensuring reliable access to resources, and optimizing costs for sustainable growth.

For the database resilience issues, the new architecture includes Amazon RDS Proxy as an optional component to improve resource utilization on the PostgreSQL instances and reduce failover time. PostgreSQL uses a process-per-connection model, and each connection consumes significant server resources (especially memory) even when idle. RDS Proxy also makes the application more resilient to unpredictable usage peaks, when many simultaneous connections opened in a short period of time could otherwise overload the PostgreSQL server. This is critical for maintaining smooth performance and minimizing downtime during peak usage periods on educational platforms, ensuring a reliable and seamless learning experience for users.
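To illustrate why pooling helps, here is a minimal Python sketch of connection multiplexing, the general idea behind a proxy that lets many client sessions share a small set of database connections. It is a toy simulation only, not how RDS Proxy is implemented: the class and function names are hypothetical, integers stand in for real PostgreSQL connections, and a short sleep stands in for a query.

```python
import queue
import threading
import time
from typing import Callable, Tuple


class MultiplexingPool:
    """Toy pool: many client sessions borrow from a small, fixed set of
    backend connections. Integers stand in for real database connections."""

    def __init__(self, backend_connections: int) -> None:
        self._idle: "queue.Queue[int]" = queue.Queue()
        for conn_id in range(backend_connections):
            self._idle.put(conn_id)
        self._lock = threading.Lock()
        self._in_use = 0
        self.peak_in_use = 0  # highest number of backend connections busy at once

    def run(self, work: Callable[[int], None]) -> None:
        # Blocks while every backend connection is busy, so a burst of
        # clients queues up here instead of opening new server connections.
        conn_id = self._idle.get()
        with self._lock:
            self._in_use += 1
            self.peak_in_use = max(self.peak_in_use, self._in_use)
        try:
            work(conn_id)
        finally:
            with self._lock:
                self._in_use -= 1
            self._idle.put(conn_id)  # hand the connection to the next client


def simulate(clients: int, backend_connections: int) -> Tuple[int, int]:
    """Run `clients` concurrent sessions through the pool.

    Returns (sessions completed, peak backend connections in use)."""
    pool = MultiplexingPool(backend_connections)
    completed = 0
    counter_lock = threading.Lock()

    def client_session() -> None:
        nonlocal completed
        pool.run(lambda conn_id: time.sleep(0.001))  # stand-in for a short query
        with counter_lock:
            completed += 1

    threads = [threading.Thread(target=client_session) for _ in range(clients)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return completed, pool.peak_in_use


if __name__ == "__main__":
    done, peak = simulate(clients=200, backend_connections=5)
    print(f"sessions completed: {done}, peak backend connections: {peak}")
```

In this simulation all 200 client sessions complete, yet the simulated database never sees more than 5 connections at once. That is the property that protects a process-per-connection server during a sudden spike in logins.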

The proposed architecture’s Synchronization Layer is based on AWS Database Migration Service (DMS). DMS tasks can replicate a source database’s changes to a target database or data storage system in near real time by monitoring the source database’s transaction logs for changes. This is crucial for keeping educational content, user data, and analytics up to date across platforms, which enhances the overall learning experience and supports data-driven decision-making. In this layer, ELF’s Amazon Aurora clusters are the source, and an Amazon S3 bucket is the target. DMS replicates all changes to this bucket for consumption by Databricks Delta Live Tables.

The major difference from the current architecture is that the new architecture allows for incremental updates in near real time. This greatly optimizes the ETL procedures by avoiding the expensive reimport of all data, and it reduces the synchronization delay from up to two days to minutes.
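The difference between a full reimport and an incremental update can be sketched in a few lines of Python. The example assumes change records arrive as CSV rows whose first field is an operation code (I for insert, U for update, D for delete), which is the convention AWS DMS uses for CDC files written to S3 in CSV format. The column names, the sample data, and the `apply_changes` helper are hypothetical. In the architecture described here, Databricks Delta Live Tables performs this step rather than hand-written code.

```python
import csv
import io
from typing import Dict, List

Row = Dict[str, str]


def apply_changes(
    snapshot: Dict[str, Row],
    cdc_csv: str,
    columns: List[str],
    key: str = "id",
) -> Dict[str, Row]:
    """Apply change records to a table snapshot keyed by primary key.

    Assumes each CSV row is `<op>,<col1>,<col2>,...` where op is I, U, or D.
    Returns a new dict; the input snapshot is left untouched."""
    table = dict(snapshot)
    for row in csv.reader(io.StringIO(cdc_csv)):
        if not row:
            continue  # skip blank lines
        op, values = row[0], row[1:]
        record = dict(zip(columns, values))
        if op in ("I", "U"):
            table[record[key]] = record  # insert, or overwrite with the new version
        elif op == "D":
            table.pop(record[key], None)  # delete; ignore keys already absent
        else:
            raise ValueError(f"unknown operation code: {op!r}")
    return table


if __name__ == "__main__":
    # Hypothetical student table and a small batch of changes.
    students = {"1": {"id": "1", "name": "Ada", "grade": "5"}}
    changes = "I,2,Grace,6\nU,1,Ada,6\nD,2,Grace,6\n"
    print(apply_changes(students, changes, ["id", "name", "grade"]))
```

Only the three changed rows are processed, however large the table is. This is why the synchronization delay drops from a backup-and-restore cycle to the time it takes to ship and apply a small batch of changes.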

Caylent's solution also gave the customer high availability and disaster recovery capabilities. By implementing multiple Auto Scaling groups and a DR environment with a new load balancer and a separate Lambda function, the customer was able to store data in DynamoDB, fail over quickly in the event of a Region outage, and quickly scale down the failover environment when it was no longer needed. The ability to respond to changing load and even system failures has greatly improved the customer’s overall performance and resiliency.

Caylent introduced the team to Terraform (TF) and to automating Auto Scaling groups and other key infrastructure functions. Caylent also enabled the customer to gain visibility and observability with AWS-native tools that monitor application and network performance, which helped improve efficiency and reduce downtime. Overall, the Caylent solution allowed the customer to transition to a cloud-native environment and achieve a more cost-effective, scalable, and reliable network infrastructure.

On the application side, the monolithic application was converted to a set of Linux-based microservices running on Amazon ECS with AWS Fargate. This moves container resiliency to the AWS side of the shared responsibility model. In addition, because the application is now a set of microservices, the customer can deploy application updates far more efficiently and frequently, with much lower risk of downtime due to regressions.

For the observability issues, the team used a combination of Amazon CloudWatch and OpenSearch. While exact numbers are not available, the team has described this change as making troubleshooting significantly easier and faster.

The new architecture is cost-effective, more scalable, and highly resilient. It can cost up to 82% less than an equivalent SQL Server solution with AWS licensing, requires less effort to operate, and fixes or alleviates all reported pain points. It’s a solid foundation for the ELF Platform’s application evolution to a highly scalable solution based on a microservices architecture. This report also includes guidance on evolution opportunities for further optimizing the performance, cost-effectiveness, resilience, and scalability of the ELF Platform’s data tier using Amazon’s purpose-built data services.

Brian Tarbox


Brian is an AWS Community Hero and Alexa Champion, runs the Boston AWS User Group, and has ten US patents and a bunch of certifications. He's also part of the New Voices mentorship program, where Heroes teach traditionally underrepresented engineers how to give presentations. He is a private pilot and a rescue scuba diver, and he got his Masters in Cognitive Psychology working with bottlenose dolphins.
