Analyzing Real-Life Examples of Polyglot Database Architectures (And The Hidden Costs of a Monolithic Approach)

Polyglot persistence allows you to build more performant and scalable applications. Take a look at these real-life examples for inspiration.


by Matias Emiliano Alvarez Duran


Twenty years ago, a relational database management system (RDBMS) was the go-to choice for building applications. The problem: to handle more load, you had to scale up, meaning you moved to a bigger and more expensive server.

Luckily for us, NoSQL databases came in to solve the problem. While these are less standardized, NoSQL databases scale out across multiple servers and are generally designed for higher availability.

However, SQL isn’t dead. We still prefer RDBMS for transactions, user validation, and business intelligence. So, in an attempt to mix these two, the polyglot database architecture model was born.

In this article, we analyze four real-life examples of applications with polyglot persistence. Plus, the hidden costs of sticking to a monolithic model. 

Looking to revamp your software architecture? Hire NaNLABS, your technical sidekick, to handle the migration with your team.

Real-life examples of applications with a polyglot database architecture

In a polyglot database architecture, you choose a purpose-specific database for each of your services, typically within a microservices architecture. For example, Cassandra for transactional processing and Neo4j for product recommendations.
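To make the idea concrete, here’s a minimal sketch of services being routed to their own stores. Everything in it is hypothetical: the service names, the InMemoryStore class (a stand-in for real database clients), and the store_for helper are ours, and the engine names are only labels.

```python
from typing import Dict


class InMemoryStore:
    """Hypothetical stand-in for a real database client."""

    def __init__(self, engine: str) -> None:
        self.engine = engine  # label only; no real database is contacted
        self._rows: Dict[str, dict] = {}

    def save(self, key: str, value: dict) -> None:
        self._rows[key] = value

    def load(self, key: str) -> dict:
        return self._rows[key]


# Assumption for illustration: each service owns exactly one store,
# chosen for that service's access pattern.
STORES: Dict[str, InMemoryStore] = {
    "activity_feed": InMemoryStore("cassandra"),
    "recommendations": InMemoryStore("neo4j"),
}


def store_for(service: str) -> InMemoryStore:
    """Return the store owned by a service, or fail loudly if none is registered."""
    if service not in STORES:
        raise KeyError(f"no store registered for service '{service}'")
    return STORES[service]


store_for("activity_feed").save("event-1", {"type": "like"})
print(store_for("recommendations").engine)
```

The point of the sketch is the ownership rule: no service reaches into another service’s store, so each database can be swapped or scaled on its own.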

Let’s analyze the database architecture of these popular applications: 

1. Bank of America mobile app—Unified business services

Bank of America is one of the world’s biggest financial institutions with over 69 million customers. After four years and over 400,000 lines of code, Bank of America now offers all of its services in one app that combines the functionality of five previous platforms:

  1. Bank of America for banking

  2. Merrill Edge (formerly Merrill Lynch) for brokerage

  3. MyMerrill for investments

  4. Bank of America Private Bank for private banking

  5. Benefits OnLine for employees to access corporate benefits

The bank also developed and included “Erica,” an AI assistant. Erica helps users make transfers, pay outstanding bills, and request or activate new cards. 

The bank doesn’t publicly share its tech stack. However, an app that merges five platforms and an AI assistant strongly suggests a polyglot setup and, we’d guess, a fair amount of Python.

If we were Bank of America, we’d have broken down each application into microservices and determined which database to use for each function. For brevity, we’ll focus on the core services of each app. 

Since we don’t know exactly what Bank of America’s architecture looks like, we’ve taken a broad guess:

Bank of America Database Architecture Design Example

If we were to build an application similar to Bank of America’s mobile app, we’d probably choose three main databases. We’d use PostgreSQL for transactional workloads, since it’s ACID-compliant and offers mature access controls, which makes it a good choice for banking processes. For example, to send and receive transfers or buy and sell stock.

To build something like Erica, we’d use Elasticsearch due to its speed, search functionality, and scalability. Lastly, for the brokerage and investment modules, we’d mostly use Neo4j for handling the relationships between available bonds or investment funds and potential investors.
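Here’s a minimal sketch of why a transactional database matters for something like a transfer. It’s not Bank of America’s code: we use Python’s built-in sqlite3 module as a stand-in for PostgreSQL, and the accounts table and transfer function are hypothetical. Balances are stored in cents.

```python
import sqlite3


def transfer(conn: sqlite3.Connection, src: str, dst: str, amount: int) -> bool:
    """Move `amount` cents from src to dst atomically; return False if rejected."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            # If this debit violates the CHECK constraint, the credit above
            # is rolled back too, so money is never created.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "id TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def balance(conn: sqlite3.Connection, account_id: str) -> int:
    row = conn.execute(
        "SELECT balance FROM accounts WHERE id = ?", (account_id,)
    ).fetchone()
    return row[0]
```

Calling transfer(conn, "alice", "bob", 500) is rejected and leaves both balances untouched, while transfer(conn, "alice", "bob", 30) moves the money in a single step.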

2. Facebook—Multi-tenant application

Multi-tenant SaaS applications are usually built on top of a microservice architecture. This allows for better customization, ensures isolation, and is more flexible for integrations.

In contrast, a monolithic architecture creates a single point of failure, i.e., an issue affecting one tenant could bring down the entire system. Plus, tenant-specific customizations within a monolith often become unmanageable.

Facebook is one of the clearest examples of a multi-tenant application. Initially, Facebook was built on top of a LAMP stack: Linux, Apache, MySQL, and PHP. However, when it became popular, it needed a more scalable architecture. That’s when it turned into a multi-tenant polyglot application.

Facebook is a very complex platform. It’s a social network that’s also a marketplace, game center, business engine, and community hosting site.

Let’s simplify it into a couple of services for the sake of this analysis. Here’s a partial view of Facebook’s database architecture:

Facebook Simplified Database Architecture Design Example

  • User login. MySQL, as it was Facebook’s primary database. Over time, Facebook has enhanced MySQL's capabilities with custom modifications to improve scalability and performance.

  • Friend recommendations. TAO (The Associations and Objects). This is a custom geographically distributed data store. Facebook developed it specifically for managing the social graph and serving friend recommendations with low-latency reads and writes.

  • Messaging. MyRocks is Facebook’s open-source database project that integrates RocksDB as a MySQL storage engine. 

  • Storing pictures. Haystack is a custom-built photo storage system that efficiently stores and serves billions of photos. It’s designed to handle large-scale image storage with low latency and high reliability.

  • User interactions. Cassandra, thanks to it being a wide-column database that supports horizontal scalability, fault tolerance, and large amounts of data. On top of this, Facebook uses Ganglia as a monitoring tool to keep track of nodes and stay on top of potential failures.

  • Page analytics. Scuba is Facebook's in-memory data store for real-time analytics. It allows Facebook to analyze log data and monitor system performance, providing near-instantaneous insights and metrics related to user interactions and page analytics.

3. Netflix—Data streaming 

Netflix shared its architecture design a couple of years ago on Medium. There, Netflix data engineers Xavier Amatriain (a former employee) and Justin Basilico explain how the architecture works. “Our algorithmic results can be computed either online in real-time, offline in batch, or nearline in between,” they say.

Netflix uses a hybrid approach, specifically for the recommendations feature: it precomputes parts of the results offline and handles the context-sensitive parts online. Large-scale training is done offline, while lighter user-specific updates are done online, which balances performance, responsiveness, and resource efficiency.
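The split between offline and online work can be sketched in a few lines. This is a toy illustration, not Netflix’s algorithm: the catalog, the train_offline and recommend_online functions, and the context boost are all assumptions of ours.

```python
from typing import Dict, List, Optional

# Hypothetical catalog: title -> genre.
CATALOG: Dict[str, str] = {"Title A": "drama", "Title B": "comedy", "Title C": "kids"}


def train_offline(view_counts: Dict[str, Dict[str, int]]) -> Dict[str, Dict[str, float]]:
    """Batch step: turn each user's per-genre view counts into affinities."""
    model: Dict[str, Dict[str, float]] = {}
    for user, counts in view_counts.items():
        total = sum(counts.values())
        model[user] = {g: n / total for g, n in counts.items()} if total else {}
    return model


def recommend_online(
    model: Dict[str, Dict[str, float]],
    user: str,
    context_genre: Optional[str] = None,
    boost: float = 1.0,
    top_n: int = 2,
) -> List[str]:
    """Request-time step: reuse precomputed affinities, then adjust for context."""
    affinities = model.get(user, {})
    scores: Dict[str, float] = {}
    for title, genre in CATALOG.items():
        score = affinities.get(genre, 0.0)
        if genre == context_genre:
            score += boost  # cheap, context-sensitive adjustment
        scores[title] = score
    # Highest score first; ties broken alphabetically.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]


# The expensive part runs once, ahead of time.
model = train_offline({"u1": {"drama": 6, "comedy": 3, "kids": 1}})
print(recommend_online(model, "u1"))
print(recommend_online(model, "u1", context_genre="kids"))
```

The heavy computation lives in train_offline, so the online call only does a dictionary lookup and a small re-rank per request.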

The architecture runs mostly on top of AWS (Amazon Web Services) and is easier to explore when broken down into subsystems. Here’s a summary of Netflix’s database architecture:

Netflix Simplified Database Architecture Design Example

  • Computing. AWS EC2 for scalable computing capacity for running multiple tasks and applications in the cloud.

  • Data storage. Amazon S3 for scalable object storage for large volumes of raw and processed data.

  • Data Processing. 

    • Spark for ETL (extract, transform, load) processes and accessing data within the petabyte-scale data warehouse.

    • Presto/Trino to enable querying and analysis of data within the data warehouse.

    • Druid for sub-second latency for certain queries.

  • Metadata.

    • Metacat, as an operational metadata store critical for big data processing and computing.

    • Netflix-wide schema registry that manages the lifecycle of schemas in different data stores.

  • Real-time data streaming. Kafka, Flink, and Mantis for supporting analytical, operational, and event-driven use cases.

  • Search and analytics. Elasticsearch for reverse search. This means searching for queries that match a document instead of the other way around. 
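Reverse search is easier to grasp with a small example. Elasticsearch calls this feature percolation, but the sketch below doesn’t use the Elasticsearch API: the ReverseSearch class and its all-terms-must-appear matching rule are simplifying assumptions of ours.

```python
from typing import Dict, List, Set


class ReverseSearch:
    """Store queries first, then find which stored queries match a new document."""

    def __init__(self) -> None:
        self._queries: Dict[str, Set[str]] = {}

    def register(self, query_id: str, terms: str) -> None:
        # A query is reduced to a set of lowercase terms that must all appear.
        self._queries[query_id] = set(terms.lower().split())

    def match(self, document: str) -> List[str]:
        words = set(document.lower().split())
        return sorted(
            query_id
            for query_id, terms in self._queries.items()
            if terms <= words  # every query term is present in the document
        )


alerts = ReverseSearch()
alerts.register("q1", "space documentary")
alerts.register("q2", "cooking show")
print(alerts.match("New space documentary released today"))
```

In a regular search, the documents are indexed and the query arrives later. Here the queries are indexed and each incoming document is checked against them, which suits alerts and notifications.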

The hidden costs of building a platform on top of a monolithic architecture

Diagram explaining the key difference between a monolithic and a microservices architecture

While some people choose to build monolithic applications when developing an MVP or enterprise applications, there are hidden costs to this choice. Building a monolithic application makes it:

  • Harder to scale. Since the components are tightly coupled, it’s difficult to scale each part individually. Plus, scaling one component can create bottlenecks in another. For instance, storing more data might also require more CPU power to process it, so you end up scaling the whole application.

  • More complex to develop certain features. Depending on your database choice, adding new features to your monolithic application could become a herculean task. Imagine adding graph-based traversal of information on top of a relational (SQL) database.

  • Harder to write and edit the code. Coming up with workarounds to add new features or services into one monolith impacts your code directly. This could eventually affect your app’s performance. Also, if you don’t have the right process to handle this complexity, any changes will be expensive.
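To show what the graph traversal workaround from the list above can look like, here’s a minimal sketch that walks a graph stored in a relational table with a recursive query. It uses Python’s built-in sqlite3 module, and the follows table and reachable function are hypothetical.

```python
import sqlite3
from typing import List

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE follows (src TEXT NOT NULL, dst TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO follows VALUES (?, ?)",
    [("a", "b"), ("b", "c"), ("c", "d"), ("x", "y")],
)
conn.commit()

# Recursive common table expression: start from the direct edges of one
# node, then keep joining the edge table against the nodes found so far.
# UNION (not UNION ALL) drops duplicates, so cycles can't loop forever.
REACHABLE_SQL = """
WITH RECURSIVE reach(node) AS (
    SELECT dst FROM follows WHERE src = ?
    UNION
    SELECT f.dst FROM follows AS f JOIN reach AS r ON f.src = r.node
)
SELECT node FROM reach ORDER BY node
"""


def reachable(conn: sqlite3.Connection, start: str) -> List[str]:
    """Return every node reachable from `start` by following edges."""
    return [row[0] for row in conn.execute(REACHABLE_SQL, (start,))]


print(reachable(conn, "a"))
```

It works, but every new traversal rule means another hand-written recursive query, which is the kind of workaround that a graph database avoids.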

Downsides of polyglot database architectures

We prefer a polyglot approach over a monolith model built on top of RDBMS because it allows you to scale more easily and support your users better. However, this approach could impact your agility as:

  • Developers need to have specific skills. This could potentially complicate your hiring process.

  • You need to build CI/CD pipelines for each service. This can be time-consuming and complicated.

  • You must automate testing for each database. This adds a step to data engineering development.

  • It increases the complexity of the architecture and operations. This could cause you to pay fees late or overlook certain database providers.

How we helped HyreCar move away from legacy architecture

Legacy architecture can cause systems to crash often as their user base grows. This was the case with HyreCar, a peer-to-peer car-sharing platform that connects car owners with rideshare drivers. 

This growing business struggled to scale its platform as fast as its user base grew. Since HyreCar’s architecture didn’t support database scalability, the team reached out to us for help.

First, the NaNLABS team migrated the monolithic application to a cloud-based Node.js platform. This allowed HyreCar to grow its user base by 5x. We also introduced TypeScript and GraphQL into the tech stack.

In terms of data stores, the app was mostly built on top of relational databases. Since database selection needs to be a very conscious process, we stuck to Amazon RDS for most of the services. However, we added Amazon DynamoDB to handle one of the services: messaging car brokers.

We know this doesn’t look like a polyglot application at first sight. However, polyglot persistence doesn’t mean using ten different databases. Instead, it means choosing purpose-specific databases, even if that means using a NoSQL database for a single service.

Also, we like to use this example because we believe it shows our team’s expertise. Every time they work as an augmented team, they analyze the trade-offs of potential technology before settling on a stack, while still following the client’s intentions. 

Here’s how HyreCar’s web app database architecture turned out:

HyreCar Database Architecture Design Example

With NaNLABS on board, HyreCar migrated to a microservice architecture. This resulted in improved velocity and business growth without impacting day-to-day business capabilities.

Need to revamp your software architecture?

In summary, what can we learn from the database architectures of top-performing companies?

As seen, big enterprises like Bank of America, Facebook, and Netflix have adopted a polyglot architecture to be more scalable, resilient, and performant.

While going for a polyglot database architecture can cause you to lose agility at the beginning, this approach future-proofs your software. And, when it comes to database selection, you need to determine which trade-offs make sense for your business. 

Taking the mentioned companies as examples, we can agree that designing applications with a polyglot database architecture is a data engineering best practice. The more flexible your design is, the more resilient your platform will be in the future. 

If your architecture doesn’t support your user base and you need an extra pair of hands to migrate to a polyglot design, augment your team with NaNLABS. 

We care deeply about each of our clients’ success. That’s why we come up with ideas, flag potential issues, and work alongside your team to solve your architectural problems. 

