RetailMeNot, part of Ziff Davis, Inc, makes everyday life more affordable for shoppers. They are the leading savings destination providing online and in-store coupons and cashback offers. RetailMeNot serves millions of monthly active users from their desktop, mobile web, native (iOS & Android) apps, and browser extension (Deal Finder™) experiences.
In 2019, RetailMeNot began a project to modernize its technology stack to be better prepared to serve future experiences and grow company revenue. Kartik Kumar Gujarati is a Senior Software Engineer at RetailMeNot, and he describes this evolution on RMN’s engineering blog:
“For the last 10 years, RetailMeNot’s engineering teams have built several highly performant, scalable, and efficient systems to bring savings data to our users through our experiences. And like many companies, RetailMeNot used REST APIs to serve the data. However, as the company and the systems grew, we started to experience problems like versioning, over-fetching, and under-fetching with REST APIs. Ultimately, this translated to performance limitations.”
RetailMeNot began with a monolithic GraphQL API as a solution to power their web and native app experiences, along with a new browser extension, Deal Finder™. They had a single team responsible for governance and standards for this graph. Client teams were given access to contribute to the monolithic graph, and the central graph team would review their contributions.
However, over time the monolithic graph began to be a bottleneck for RetailMeNot. The API team found themselves getting overwhelmed with all of the contributions. Hannah Shin, a Senior Software Engineer on the API team, describes the challenge:
“My team was in charge of maintaining that monolith. But we had many different client teams working on it. We started to become a bottleneck trying to coordinate different release cadences, conflicting features, and the desire to ensure that features were tested properly.”
Hannah’s team was also responsible for maintaining core data sources and their event-driven architecture. However, her team was bogged down with code reviews.
“Our team of three engineers spent around 75% of our time reviewing code changes to our GraphQL monolith, which left us very little time to innovate on our backend platform.”
RetailMeNot realized their monolith was no longer scaling, and they wanted a solution that could work well for the growing number of teams building on top of their graph. They chose Apollo Federation because it allowed them to empower each subgraph team to build and maintain their portion of the unified supergraph schema. As Hannah puts it:
“We wanted a way for the teams to not be tightly coupled. Having maintained our monolith, we saw so many inconsistencies and inefficiencies in our data structures. For example, we had a web offer card, then an app offer card, and another type of offer card. We believed that having the shared graph and consolidated ownership of shared types, would help us as an organization better understand and model our data.”
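The idea of several teams contributing to one shared type can be sketched in a few lines. This is only a toy illustration, not Apollo Federation's composition algorithm: the `compose` function, the `offers` and `cashback` subgraph names, and the `OfferCard` fields are all hypothetical, and real composition rules are considerably richer.

```python
from typing import Dict

# Hypothetical representation: type name -> {field name: field type}.
SubgraphSchema = Dict[str, Dict[str, str]]


def compose(subgraphs: Dict[str, SubgraphSchema]) -> Dict[str, Dict[str, str]]:
    """Merge per-team type definitions into one unified schema.

    Toy stand-in for supergraph composition: fields from all subgraphs are
    unioned per type, and a field declared with two different types is
    rejected as a composition error.
    """
    supergraph: Dict[str, Dict[str, str]] = {}
    for subgraph_name, schema in subgraphs.items():
        for type_name, fields in schema.items():
            merged = supergraph.setdefault(type_name, {})
            for field_name, field_type in fields.items():
                existing = merged.get(field_name)
                if existing is not None and existing != field_type:
                    raise ValueError(
                        f"{type_name}.{field_name}: {subgraph_name} declares "
                        f"{field_type}, but another subgraph declares {existing}"
                    )
                merged[field_name] = field_type
    return supergraph


# Two hypothetical teams each own part of a single shared OfferCard type,
# instead of maintaining separate web and app offer card types.
subgraphs = {
    "offers": {"OfferCard": {"id": "ID!", "title": "String!"}},
    "cashback": {"OfferCard": {"id": "ID!", "cashbackPercent": "Float"}},
}
unified = compose(subgraphs)
print(unified)
```

In this sketch, each team edits only its own entry, and a conflicting definition of a shared field fails at composition time rather than surfacing later in a client.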
The process to migrate from their monolith was incremental. Kartik describes this process in depth in his blog post:
“Here are the incremental steps that we took for this migration:
After adopting Apollo Federation and Apollo Studio, RetailMeNot began to see immediate benefits. By automating their schema reviews and deployments with Apollo Studio, the API team at RetailMeNot no longer spent the majority of their time reviewing code.
Hannah Shin said, “After adopting managed federation and schema checks, we went from three engineers spending 75% of their time reviewing code to less than 10%.”
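To make the idea of an automated schema check concrete, here is a minimal sketch that compares a current schema with a proposed one and flags removed types, removed fields, and changed field types. It is an assumption-laden toy, not how Apollo Studio implements schema checks: the `find_breaking_changes` function and the example schemas are hypothetical, and treating every type change as breaking is stricter than a real check needs to be.

```python
from typing import Dict, List

# Hypothetical representation: type name -> {field name: field type}.
Schema = Dict[str, Dict[str, str]]


def find_breaking_changes(current: Schema, proposed: Schema) -> List[str]:
    """List changes in `proposed` that could break clients of `current`."""
    problems: List[str] = []
    for type_name, fields in current.items():
        if type_name not in proposed:
            problems.append(f"type removed: {type_name}")
            continue
        for field_name, field_type in fields.items():
            new_type = proposed[type_name].get(field_name)
            if new_type is None:
                problems.append(f"field removed: {type_name}.{field_name}")
            elif new_type != field_type:
                problems.append(
                    f"field type changed: {type_name}.{field_name} "
                    f"({field_type} -> {new_type})"
                )
    return problems


# Renaming a field removes the old name, which existing clients may still query.
current = {"OfferCard": {"id": "ID!", "title": "String!"}}
proposed = {"OfferCard": {"id": "ID!", "headline": "String!"}}
print(find_breaking_changes(current, proposed))
# prints ['field removed: OfferCard.title']
```

Running a check like this in a deployment pipeline is what lets reviewers stop inspecting every change by hand: additive changes pass automatically, and only flagged changes need human attention.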
Their platform team could focus on innovating. Instead of doing code reviews, they focused on establishing best practices for contributing to the graph. They began regular education sessions for developers, engineering leaders, and the product team on GraphQL benefits. Over time, the discussions pivoted to focus on improving their supergraph. They invested more in observability along with templatizing experiences and content distribution.
RetailMeNot has also seen a significant increase in reliability after migrating to their supergraph. Hannah says, “It’s been over a year since we’ve had any breaking changes. Prior to adopting Apollo, we had breaking changes as frequently as every month. We once took down our mobile home page for six hours.”
The days of painful rollbacks and war rooms have been replaced with much more confidence in their GraphQL release process. As a result, the team saw a significant improvement in developer velocity after moving from their GraphQL monolith to their supergraph. Kartik says, “Working in monolith means that you have to be very careful about your changes, moving to a subgraph architecture allows you to move much faster. We are able to get features out of the door 40% faster since we migrated to Apollo Federation.”
Adopting a supergraph has empowered RetailMeNot to continuously onboard new teams and services. RetailMeNot plans to keep innovating by further modularizing its architecture. Currently, they are building a templating system that is integrated into their content management system. Creating new pages has typically taken their operations team up to one month because it required writing custom feature code; a template-based approach will replace that work. Soon their operations team will be able to create new pages and experiences on a self-service basis without having to request new capabilities. Longer term, this will let them change a template once and have the change propagate to all of their different use cases, which should make experimentation much easier.
As the RetailMeNot engineering team continues to scale their supergraph, their focus is on helping educate and onboard new teams. Summing up the move to the supergraph, Kartik calls out the following benefits in his blog post:
Want to learn more about how RetailMeNot made the switch to the supergraph? Watch this webinar discussing their journey.
RetailMeNot, part of Ziff Davis, Inc, makes everyday life more affordable for shoppers by providing online and in-store coupons and cashback offers.
RetailMeNot’s monolithic graph was causing production outages and resulting in slow product velocity.
RetailMeNot adopted a federated supergraph with Apollo Studio that allowed them to iterate faster, deliver more consistent experiences, and collaborate more effectively.