Fauna with Brecht De Rooms

Episode Description

A deep conversation about database fundamentals, the trade-offs between SQL and NoSQL, and how Fauna aims to combine scalability with developer experience.

Episode Summary

This episode features Brecht De Rooms, a developer advocate at Fauna, joining hosts Christopher Burns and Anthony Campolo for a wide-ranging discussion about databases. The conversation begins with the basics—what a database actually is and how SQL and NoSQL differ—before moving into the nuances of relational versus document models and why the boundaries between database types are increasingly blurring. Brecht explains how Fauna occupies a unique space by offering document storage with relational capabilities, strong consistency, and automatic scaling, all accessible as an API rather than a traditional connection-based system. A recurring theme is the tension between developer experience and operational reliability: Christopher admits he prioritizes speed and simplicity as a solo developer, while Brecht argues that choosing the wrong database early can lead to painful consequences at scale. The group explores managed versus self-hosted databases, pricing models, and the real cost of maintaining infrastructure. The episode closes with a look at Fauna's roadmap—including streaming, improved GraphQL integration, and plans for higher-level query abstractions—and a discussion of how FQL compares to tools like Prisma as a way to interact with data.

Chapters

00:00:00 - What Is Fauna and What Is a Database?

The episode opens with introductions before Brecht explains what Fauna is: a distributed, multi-region cloud database that blends SQL and NoSQL concepts, offering document storage alongside relational capabilities and strong consistency. He describes it as something that tries to be an API first, prioritizing developer experience while maintaining the reliability guarantees developers expect from traditional relational databases.

Christopher then poses a fundamental question—how would you define a database?—which leads to a discussion about abstraction. Christopher shares his preference for treating databases as simple data-in, data-out systems, while Brecht pushes back gently, noting that what you don't care about in a database can come back to bite you. The conversation naturally flows into ORMs and the mismatch between object traversal and SQL joins.

00:05:07 - Developer Experience vs. Reliability Trade-offs

Christopher raises the question of what matters most when choosing a database as a solo developer: speed of development or long-term performance and reliability. Brecht argues this isn't just a solo-developer concern—teams want to move fast too—and explains how serverless architectures benefit from databases that function as APIs, eliminating connection-limit problems. He notes that database categories are converging, with Postgres adding JSON support and NoSQL databases adopting consistency features.

The group then tackles the SQL versus NoSQL terminology debate. Anthony clarifies the distinction between SQL as a query language and relational as a data-modeling concept, while Brecht shares Martin Fowler's anecdote about "NoSQL" originating as nothing more than a Twitter hashtag for a meetup. Christopher offers his own mental model—SQL is like Excel, NoSQL is like JSON—and Brecht refines this analogy by explaining how document databases with relations effectively give developers the best of both worlds.

00:12:08 - Why Your Database Choice Matters Early

Christopher simplifies the concept of a database down to files and an API, and Brecht validates this while cautioning that beneath the surface, the differences between databases are significant. He highlights the risks that emerge at scale—concurrent writes, indexing failures, and replication challenges—and recommends developers consult resources like Jepsen, which stress-tests databases for reliability, before committing to a choice.

Brecht shares an analogy from his blog series: everything seems fine at first, but when an application suddenly attracts heavy traffic, hidden weaknesses in the database can bring everything crashing down. He explains Fauna's philosophy of enforcing good practices by default, such as requiring indexes before queries, ensuring developers can't accidentally write inefficient code. Anthony adds that the intimidation many developers feel around databases is part of the problem, and that even the most basic expectation—not losing data—isn't met by every database.

00:18:16 - Migration, Prisma, and Moving Between Databases

The conversation shifts to database migration. Christopher asks whether it's ever too late to switch databases, and Brecht confirms that migration is always painful, even between two SQL databases. Christopher shares his own experience migrating from MongoDB to Postgres when Prisma 2 dropped MongoDB support, describing a scrappy ETL process involving export scripts and seed files. Brecht notes this kind of approach is common and rarely elegant.

The group discusses Prisma 2's growing popularity and its role as a universal database abstraction layer. Christopher raises a thought-provoking question: if Prisma eventually supports many databases behind one API, can developers be trusted to choose the right one? Brecht argues this would actually improve decision-making by removing the query API as a differentiator, forcing developers to evaluate databases on their actual guarantees—consistency, scalability, and reliability—rather than surface-level ergonomics.

00:23:00 - Hosting Models and the True Cost of Databases

Christopher opens a discussion about self-hosted versus managed versus fully-managed database services. Brecht outlines four tiers of hosting, from running a machine in your basement to Fauna's fully managed multi-tenant model where no specific resources are provisioned per user. He explains that Fauna's founders, who came from Twitter, built the service precisely because they experienced firsthand how painful it is to manage distributed databases at scale.

The pricing conversation follows naturally. Brecht explains that Fauna's multi-tenant architecture enables competitive pricing and describes the free tier as generous for development but not intended for production. Christopher compares his own Postgres costs and Brecht points out the hidden expenses: no replication, no multi-region failover, and no automatic scaling. The discussion underscores the broader theme that the cheapest-looking option often carries the highest risk when things go wrong.

00:36:58 - Real-World Consequences and Fauna's Guarantees

Christopher admits the conversation is making him nervous about his own database choices, especially given that his application operates in fintech. Brecht explains that financial applications are exactly the kind of domain where Fauna's strong consistency guarantees matter most, and that running a single Postgres instance without replication introduces real risk if that machine goes down. He frames this as a paradox: optimizing for developer experience now can set you up for the worst experience later.

Anthony pivots to new features Fauna has been rolling out, including team management, third-party authentication via JWT tokens from providers like Auth0, and push-based streaming built on HTTP/2. Brecht explains the difference between push-based and pull-based streaming, noting that document-level streaming is available now while collection-level streaming is on the roadmap. These features reflect Fauna's broader goal of reducing the operational burden on developers.

00:43:41 - FQL, GraphQL, and Fauna's Developer Roadmap

Anthony and Brecht discuss the relationship between FQL and GraphQL—two very different query paradigms that Fauna supports side by side. Brecht explains that GraphQL queries are translated into FQL behind the scenes, preserving consistency guarantees, and envisions a future where developers can seamlessly embed custom FQL logic within GraphQL queries for maximum flexibility and power.

Christopher asks how developers actually interact with Fauna in JavaScript, and Anthony clarifies that while there's a JavaScript driver, you're still writing FQL through functional composition rather than the chained method style common in tools like Prisma. Brecht acknowledges that many developers prefer the more familiar chaining pattern and hints that Fauna may move in that direction. The episode wraps with Brecht mentioning the potential for a Prisma driver once a Rust driver is ready, which visibly excites Christopher as a possible path to adopting Fauna for his own production application.

Transcript

00:00:00 - Christopher Burns

Just so I know, how do you pronounce your name?

00:00:03 - Brecht De Rooms

My name is Brecht in my language, but it's terribly hard in another language. My girlfriend is French, and even for her parents it was too complicated.

00:00:25 - Christopher Burns

Welcome to the show. You're a developer advocate at Fauna, so why don't you start by telling us a little bit about Fauna and the role you do there?

00:00:35 - Brecht De Rooms

Fauna is a cloud database, a distributed multi-region cloud database. It tries to be similar to Spanner or Firestore, yet with a quite different focus. It focuses on being an API and offering the best developer experience possible. It's something in between SQL and NoSQL—essentially scalability with developer experience features, and at the same time you get to keep the relations and strong consistency that you like from, say, a Postgres database.

It's also multi-modal. It's essentially a document database, yet you get to keep the relations. It's a bit confusing for many people, but it borrows from a lot of different kinds of databases and brings all these good things together in one.

00:01:30 - Christopher Burns

Without talking about Fauna first, how would you describe a database?

00:01:35 - Brecht De Rooms

That's a tricky question because we say the database is becoming an API. I would say a database is something where you want to store your application data, and it should be generic enough that you can decide how that application model looks.

That's where I think it differentiates from a CMS, where a CMS is typically opinionated and provides certain data types. There might be custom data types, but a database is good at storing all those very small details and gluing them together so you can query them as flexibly as possible.

00:02:21 - Christopher Burns

The way I look at a database—I'm a generalist. I would like my database to be as abstract as possible. I just feed data to it and read data out of it. I don't care about everything else, and I know that's sacrilege to some people, but for me personally, you have to generalize. That's where, if you're looking at a Postgres database, you bring in an ORM, an object...

00:02:48 - Brecht De Rooms

Relational model.

00:02:50 - Anthony Campolo

Object relational mapper.

00:02:52 - Brecht De Rooms

Yeah, indeed. It's a mapper.

00:02:54 - Anthony Campolo

It maps your objects to your relations.

00:02:56 - Christopher Burns

Yeah, exactly. When we talk about your typical MySQL database, you normally have an ORM to make it easy to talk to and receive information from the database. Fauna is something slightly different. Like you said, it's more of an API, from what I understand, and I admit I keep getting told to look at it. But I've not got round to it yet. It's more of an API you can just send and receive data with.

00:03:27 - Brecht De Rooms

I'd actually like to hook into a few things you said. You want to store data and get data out of it, and you said, "I don't care about everything else." I'm super interested in what parts you don't care about, because there are so many things that might go wrong. Maybe that API is just storing your data in a CSV, which will obviously go wrong at a certain point. I'm wondering what you don't care about and what you do care about.

The other thing is, you mentioned typically using an ORM, and that's interesting because a lot of people actually run into problems using ORMs. The way an object works is you have an object linked to multiple objects. When you get this data, you actually traverse a tree, which is completely different from how SQL joins work. They just join sets. They don't traverse the tree. The tree traversal problem is extremely difficult to solve in a regular SQL database, which is why they invented graph databases.

[00:04:26] The cool thing is, Fauna is actually super good at that problem as well. It borrows a bit from graph databases too. If you do a join in Fauna, you get an object, then start mapping over all references that link to that object, get those, then start mapping over these, get those—which is very similar to GraphQL.

It might be a good idea to abstract away from the database because essentially what we want is productivity. It's too complex and we just want to go fast. But maybe an ORM is not the perfect tool to do that, because first the database has to map onto that problem.
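To make the tree-versus-set distinction concrete, here is a minimal sketch using plain in-memory Python lists. It is not Fauna's API or any ORM; the data, field names, and the `join_rows` and `traverse` helpers are all invented for illustration. The join produces flat rows, while the traversal fetches an object and then maps over whatever references it, which is the shape an application object usually wants.

```python
# Hypothetical in-memory data: one author, two posts, three comments.
authors = [{"id": 1, "name": "Ada"}]
posts = [
    {"id": 10, "author_id": 1, "title": "Joins"},
    {"id": 11, "author_id": 1, "title": "Trees"},
]
comments = [
    {"id": 100, "post_id": 10, "text": "Nice"},
    {"id": 101, "post_id": 10, "text": "Thanks"},
    {"id": 102, "post_id": 11, "text": "Agreed"},
]


def join_rows():
    """Set-style join: one flat row per (author, post, comment) match."""
    return [
        (a["name"], p["title"], c["text"])
        for a in authors
        for p in posts
        if p["author_id"] == a["id"]
        for c in comments
        if c["post_id"] == p["id"]
    ]


def traverse(author_id):
    """Tree-style traversal: fetch a node, then map over what references it."""
    author = next(a for a in authors if a["id"] == author_id)
    return {
        "name": author["name"],
        "posts": [
            {
                "title": p["title"],
                "comments": [c["text"] for c in comments if c["post_id"] == p["id"]],
            }
            for p in posts
            if p["author_id"] == author_id
        ],
    }


flat = join_rows()   # flat rows, with the author name repeated in each one
tree = traverse(1)   # nested structure, one level per relationship
```

An ORM has to turn the flat rows back into the nested structure, which is where the mismatch Brecht describes tends to show up.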

00:05:07 - Christopher Burns

To hook into what you said about what I do or don't care about—no matter what you pick, it does store the data at the end of the day. So why focus on speed, complexity, MySQL or NoSQL? I'm the only developer in my startup at this stage. When you're one person, you start to think, what can I do by myself? What can I let a computer do for me? Prisma 2 is a very big thing that's building, but it only works with MySQL databases. Or is it just SQL databases?

00:05:46 - Anthony Campolo

SQL databases: MySQL, PostgreSQL, SQLite, and then stuff like Aurora, which is like a flavor of MySQL.

00:05:54 - Christopher Burns

When we talk about developing, what comes first? Is it developer experience or is it how well it performs? The bigger the database gets, how easy is it to back up? How easy is it to export the data when you are just one person and you need to move fast? Most FSJam projects already include Prisma 2. You tend not to look anywhere else.

00:06:21 - Brecht De Rooms

There are multiple things in here. First, it's not only if you're working alone that you want to move fast. A lot of teams want to move fast as well, and I would honestly be stoked if there were a Prisma driver that works on top of Fauna.

I think Fauna would be an ideal match if you get the choice between "I run Prisma, so the way I interact with the database is the same," and then you choose between a scalable database that scales as you go and a database where you have to set up a lot of things and it doesn't scale. Then the choice is probably easy. And I think in the future we will see a Prisma driver.

What you said about developer experience—I think databases have been ignoring developer experience for a long, long time, and that has to change with how application architectures are evolving.

[00:07:17] For example, if you start building serverless applications, you might want to interact with these databases right from the serverless application without connection problems, because there's a limit on the number of connections. That's when we go back to the database as an API. Then you don't have that problem because we don't open a specific connection. You're accessing the database as if it's an API, so there's no limit on connections.

We're seeing that the difference between different types of databases is fading out. Postgres is starting to provide JSON types. Other databases that used to be purely NoSQL and were strong advocates of eventual consistency are now providing consistency features and transactions.

I think in the future a database will be something where you have the choice, and that's something you already see in Fauna. It's a document database. You can have nested documents in there. You can have arrays. But we don't tell you, "That's the way you have to do it," and force you to do complex workarounds for not having relations.

We also provide relations. If you want to use relations and normalize your data, great. If you want to put everything in a document because you're going to fetch it as a whole, you can do that too.

00:08:41 - Christopher Burns

For now it's strongly in the NoSQL.

00:08:45 - Brecht De Rooms

It depends on the definition, to be honest, because many people say NoSQL is basically no consistency, no relations, scalability, and schemaless. I think only the schemaless is true there. So it depends on who you ask.

00:09:01 - Anthony Campolo

We have to distinguish between what it means for something to be SQL and what it means for something to be relational. I think this is what most people confuse. SQL is a specific query language used to query data in a relational form. You can have relational data and query it in a different way. This is something commonly talked about with DynamoDB. DynamoDB is not a SQL database, but you can still model it in a relational way, because for something to be relational, it has to do with relational algebra. It's a mathematical constraint.

00:09:38 - Brecht De Rooms

It's actually funny because I recently saw a talk from Martin Fowler on exactly this. He said the NoSQL term came from a meeting about databases that were not SQL, and the only thing it meant at that point was a Twitter hashtag to promote that meeting. Afterward, people took that hashtag as the new breed of databases, which was not the intention. When he's asked, "Can you define NoSQL?" he says no, but we can say typically a database might have this or that. But for example, graph databases have relations. So it already doesn't make sense to say it's non-relational.

00:10:15 - Christopher Burns

I'm going to maybe regret this statement, but in my head, how I imagine SQL: SQL is like Excel, as you can have multiple tables that have a very defined structure. NoSQL is kind of like a JSON object. I know it's a lot more than that, but I don't get how referencing works and relations work between SQL and NoSQL. For example, do you duplicate the content or do you just point it in a different place?

00:10:50 - Brecht De Rooms

That depends on the database and what features it provides. You're right that a SQL database is more like a spreadsheet with multiple tables that have links between them. The other alternative would be JSONs, which don't have links to each other—you could call those document databases. It's a good analogy.

Now, put those JSON documents into Excel. Then you have a document database that has relations. It really depends on how you do it. In the beginning, a lot of document databases didn't provide relations because it actually defeats their purpose. They wanted to provide the document exactly in the format you're going to query it, which becomes a problem if you change that idea and suddenly want to query it differently.

If you do provide relations, then you get the best of both worlds and the option to choose.

[00:11:46] Am I going to store this document exactly the way the UI needs it, or am I going to normalize and link this data together? That's essentially a choice and a trade-off. Am I going to do compute at the moment I query, or am I going to give up my flexibility of storing or retrieving the object in a different manner?
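The trade-off between storing a document exactly as the UI needs it and normalizing it can be sketched in a few lines. This is an illustrative toy with plain Python dictionaries, not how Fauna or any specific database stores data; the order and customer shapes and all function names are assumptions made up for the example.

```python
import copy

# Option A: embedded. Each order carries a copy of the customer, shaped for the UI.
embedded_orders = [
    {"id": 1, "customer": {"name": "Ada", "city": "Ghent"}, "total": 30},
    {"id": 2, "customer": {"name": "Ada", "city": "Ghent"}, "total": 12},
]

# Option B: normalized. Orders hold a reference; the customer is stored once.
customers = {"c1": {"name": "Ada", "city": "Ghent"}}
normalized_orders = [
    {"id": 1, "customer_ref": "c1", "total": 30},
    {"id": 2, "customer_ref": "c1", "total": 12},
]


def read_embedded(order_id):
    """No extra work at query time: the document is already in its final shape."""
    return next(o for o in embedded_orders if o["id"] == order_id)


def read_normalized(order_id):
    """Extra compute at query time: resolve the reference before returning."""
    order = next(o for o in normalized_orders if o["id"] == order_id)
    resolved = {k: v for k, v in order.items() if k != "customer_ref"}
    resolved["customer"] = copy.deepcopy(customers[order["customer_ref"]])
    return resolved


def move_customer_embedded(name, city):
    """Every copy must be found and rewritten; returns how many were touched."""
    touched = 0
    for order in embedded_orders:
        if order["customer"]["name"] == name:
            order["customer"]["city"] = city
            touched += 1
    return touched


def move_customer_normalized(ref, city):
    """One write, visible through every order that references the customer."""
    customers[ref]["city"] = city
    return 1
```

Reads are cheaper in the embedded form and writes to shared data are cheaper in the normalized form, which is the choice being described.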

00:12:08 - Christopher Burns

Databases are such a hard concept for some people to understand. I'm one of them. I admit my brain just doesn't think that way. But you could say you could build a database with an API that would basically make four files, each one being a different JSON array. You push to the API, the API goes, "I need to write that to file one." You call the API and it goes, "I need to go to file one, find the object, and return it." Those are the core principles of a NoSQL database. Very oversimplified.
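Christopher's files-plus-API idea can be written down almost literally. The sketch below is a deliberately naive toy, assuming one JSON array per collection file; the `FileStore` class and its method names are hypothetical and do not correspond to any real database.

```python
import json
import tempfile
from pathlib import Path


class FileStore:
    """Toy store: one JSON array per collection file. Not safe for real use."""

    def __init__(self, directory):
        self.directory = Path(directory)

    def _path(self, collection):
        return self.directory / f"{collection}.json"

    def _load(self, collection):
        path = self._path(collection)
        return json.loads(path.read_text()) if path.exists() else []

    def write(self, collection, document):
        """Append a document by rewriting the whole file."""
        documents = self._load(collection)
        documents.append(document)
        self._path(collection).write_text(json.dumps(documents))

    def read(self, collection, doc_id):
        """Scan the whole file for a matching id; None if absent."""
        for document in self._load(collection):
            if document.get("id") == doc_id:
                return document
        return None


with tempfile.TemporaryDirectory() as tmp:
    store = FileStore(tmp)
    store.write("users", {"id": 1, "name": "Chris"})
    store.write("users", {"id": 2, "name": "Anthony"})
    found = store.read("users", 2)
    missing = store.read("users", 99)
```

Every write rewrites the whole file and every read scans it. Nothing here handles two writers at once, indexing, or data that outgrows a single machine.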

00:12:49 - Brecht De Rooms

If you make a slightly more advanced version of that simplified concept, there might be APIs out there that actually store your data that way and present themselves as a database. You're right that it's really hard to know what a database is doing behind the scenes and what the difference is between database A and database B.

Of course there are lots of optimizations you can do there. That makes the difference between whether you're going to fail when you write something, or in the long run when you have too many files and maybe your file system breaks or you have to incorporate multiple servers. How are you going to index everything? You have to cut up everything in small bits so you can index it.

I actually wrote a complete series about why it's important to choose the database at the beginning of your project and not just take any database. A while back you said, "All of these databases work. What does it matter which one I pick?"

[00:13:53] I actually heard exactly the same thing on a very famous podcast, Syntax, which you probably know. They had the same opinion. For me, I don't get that, of course, because I work for a database company. But I think the problem comes from the fact that when you actually run into these problems, you're already quite far in your application development. Typically that's when a lot of users are starting to use your application.

There are risks at the same time. In your simple example, when you just store things in files, what happens if multiple people write at the same time? What is it going to do if you index things, which is a separate file, and you write at the same time, and that index has to change as well? You have to deal with that, and how you deal with that matters.

[00:14:48] That determines the quality of your database. I would say if you're picking a database and you don't know a lot about databases, take a look at Jepsen, which is someone who breaks databases for a living, and look around for horror stories about databases.

Typically what happens—and I have an analogy in the blog series I write about this—is you start, everything goes fine. You have a lot of users, everything still goes fine. Suddenly you get a massive amount of attention, your application explodes, and then you have bottlenecks, errors, you don't know where they came from. At that point, what are you going to do with all these new users? Are you going to go through the extremely difficult exercise of keeping your application running while you change the database? At that point it's often too late.

That's why it's very important to know what your database is, how it behaves, and what guarantees it provides. Fauna takes a very strong stance there and says you shouldn't worry about consistency.

[00:15:50] So we provide the strongest consistency for you. You shouldn't worry about replicating your data, so we do that for you. We try to make it as easy as possible so you don't shoot yourself in the foot.

It's the same with indexes in Fauna. You require an index before you can query. Although people sometimes find that hard and not developer-friendly, you can't write an inefficient query by doing that. That's the idea behind Fauna: make sure you don't run into these problems in production.
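The idea of requiring an index before a query can be illustrated with a small toy collection. This is a sketch of the principle only, not FQL and not Fauna's implementation; `IndexedCollection`, `MissingIndexError`, and the method names are invented for the example.

```python
class MissingIndexError(Exception):
    """Raised when a query names a field that has no index."""


class IndexedCollection:
    """Toy collection that only answers queries through a declared index."""

    def __init__(self):
        self.documents = []
        self.indexes = {}  # field name -> {value -> [positions]}

    def create_index(self, field):
        """Build an index over the existing documents for one field."""
        index = {}
        for position, document in enumerate(self.documents):
            index.setdefault(document.get(field), []).append(position)
        self.indexes[field] = index

    def insert(self, document):
        """Store the document and keep every existing index up to date."""
        position = len(self.documents)
        self.documents.append(document)
        for field, index in self.indexes.items():
            index.setdefault(document.get(field), []).append(position)

    def find(self, field, value):
        """Look up by index only; there is deliberately no full-scan fallback."""
        if field not in self.indexes:
            raise MissingIndexError(f"create an index on {field!r} first")
        return [self.documents[p] for p in self.indexes[field].get(value, [])]


users = IndexedCollection()
users.insert({"name": "Ada", "city": "Ghent"})
users.create_index("city")
users.insert({"name": "Bob", "city": "Ghent"})
ghent_names = [u["name"] for u in users.find("city", "Ghent")]
```

Because `find` has no fallback to scanning every document, a query on an unindexed field fails immediately during development instead of slowing down later under load.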

00:16:20 - Anthony Campolo

When I was first learning to code, when people would talk about databases, it was always this huge, complicated thing that you shouldn't touch or mess with or even think about too much. "Oh, it's the database." I think that connotation is what leads to a lot of developers not feeling comfortable with databases. You're right that you have to get into the actual internals to understand how they work, because you're going to have different ways of querying it and different guarantees of how reliable it's going to be.

You mentioned Jepsen. I love the whole series of Jepsen posts. It's really fascinating because he shows that we expect all of our databases to save our data. You'd think that would be the first thing your database should do for you, and just by setting that criterion, not every database meets it. I think about it as data loss being the first level, and then the next level being, how do you query it? How do you get the data out of it?

[00:17:20] And then the bigger level being, how nice is it to actually work with the database, the whole developer experience thing.

00:17:26 - Brecht De Rooms

The analogy I used was a real story. I lent my bike to my best friend, but I'd worked on the bike just before I lent it. My friend tried to lift the wheel and I hadn't bolted it properly. The wheel came out and he fell extremely hard. If I'd had that bike tested by a technician, that wouldn't have happened. If I knew something about bikes and tested it, that wouldn't have happened.

The worst thing is, if you compare it to the bike, you have hundreds of clients relying on you riding your bike as it should be. So if you fail with your database because you don't realize there's a problem with it and you fail at a moment when it's too late, all these clients go down with you.

00:18:16 - Christopher Burns

A lot of these problems happen when you start using it and start having more customers, and it grows. You can make a decision at the start, but is it the right decision later? One of the questions I want to ask, and I've had some experience with this lately myself: is it too late? Could you easily move from something like Postgres to Fauna?

00:18:44 - Brecht De Rooms

I think changing a database in an application is always, always hard. I don't think it's ever easy. If you had used Prisma, it might have been easier. But even changing between two SQL databases, like going from Postgres to MySQL, is hard. I worked for a company called Cumul.io, which is an analytics company with plugins on different databases.

Writing a plugin for two different SQL databases, there are often caveats you have to take into account. It's never easy. Their guarantees are often not the same. They work slightly differently, just differently enough that you have to change quite a few things.

Changing from a SQL database to a NoSQL database will obviously be a bit more work. But if you choose a database that also provides relations and provides the guarantees that you actually need from that SQL database, then you're going to have a much easier time going to a scalable database like Fauna than going to something else that doesn't provide these same guarantees.

00:19:52 - Christopher Burns

The reason I bring up importing and exporting data is that I built an MVP of Everfund, and that used Prisma 1 backed by MongoDB. Everyone just says MongoDB is really good, so I picked that. Then when Prisma 2 came out, they were like, "Yeah, we've got no MongoDB support." So I was like, "I guess I'll use Postgres then, because Prisma is pretty good recently." We were like, "We need to move all that data to the new database. This is going to be hard."

This is probably not the right way, but the fastest way? I exported the whole collection from Mongo, fed it all into a CLI tool that sorted it, merged all the things into objects, and then wrote a massive seed file that just spat it all into the new database through Prisma 2. Probably not the best way to do it, but it worked.
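The export, sort, merge, and seed steps Christopher describes could look roughly like the sketch below. It uses only the Python standard library rather than MongoDB or Prisma tooling, and the document shape, field names, and the `to_seed` helper are hypothetical stand-ins for whatever the real data looked like.

```python
import json

# Hypothetical export: Mongo-style documents with nested arrays and "_id" keys.
exported = [
    {"_id": "b2", "name": "Bea", "donations": [{"amount": 5}]},
    {"_id": "a1", "name": "Ada", "donations": [{"amount": 10}, {"amount": 20}]},
]


def to_seed(documents):
    """Flatten nested documents into per-table rows linked by foreign keys."""
    seed = {"users": [], "donations": []}
    for document in sorted(documents, key=lambda d: d["_id"]):
        seed["users"].append({"id": document["_id"], "name": document["name"]})
        for donation in document.get("donations", []):
            seed["donations"].append(
                {"user_id": document["_id"], "amount": donation["amount"]}
            )
    return seed


seed = to_seed(exported)
seed_file = json.dumps(seed, indent=2)  # what a seed script would then load
```

The nested arrays become rows in their own table, tied back to the parent by a foreign key, which is the main structural change when moving from a document store to a relational one.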

00:21:02 - Brecht De Rooms

I've seen a lot of ETL scripts and they're typically like that. It's never the best way.

00:21:07 - Christopher Burns

I'm happy to say I do not understand databases, but they're so important. Prisma 2 is gaining so much popularity because they're making it so easy to spin up a database, put the correct syntax and structure together, and then read and write data.

00:21:28 - Brecht De Rooms

And not only that—as a framework developer, as Anthony can probably say as well, having Prisma, having one API and suddenly supporting many databases is quite awesome. That's like cutting the work by a factor of ten and not having to maintain all these different database connectors.

00:21:50 - Christopher Burns

Here's the problem I see with Prisma 2. Let's take the perfect world. Fauna gets added, MongoDB gets added, CockroachDB, whoever. Then you're going to have a table saying which one to pick. Can the developer be trusted to make the right choice for the right use case?

00:22:12 - Brecht De Rooms

How is it different from what's already happening now? Imagine the only difference is that all of them have the same query API. Right now they don't, and they still have to pick between database A and database B. I think a big factor in what developers pick is the query API. I would argue that's the wrong motivation to choose a database, because the query API says nothing about the reliability of the database.

I think that with Prisma in front of them, what's left to choose on is basically: what guarantees do they provide? Do they scale? Do you have built-in security? Are they consistent? If you write something, are you going to read the same thing you just wrote? I would argue that's a good motivation to choose a database, not a query API.
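The question "if you write something, are you going to read the same thing you just wrote?" can be demonstrated with a toy primary and a replica that lags behind it. This is a simplified sketch under assumed names (`Primary`, `LaggingReplica`, `replicate`); real replication protocols are far more involved, and nothing here reflects how any particular database works.

```python
class Primary:
    """Accepts writes and keeps an ordered log for replicas to apply."""

    def __init__(self):
        self.data = {}
        self.log = []

    def write(self, key, value):
        self.data[key] = value
        self.log.append((key, value))

    def read(self, key):
        return self.data.get(key)


class LaggingReplica:
    """Applies the primary's log only when replicate() is called."""

    def __init__(self, primary):
        self.primary = primary
        self.data = {}
        self.applied = 0

    def replicate(self):
        for key, value in self.primary.log[self.applied:]:
            self.data[key] = value
        self.applied = len(self.primary.log)

    def read(self, key):
        return self.data.get(key)


primary = Primary()
replica = LaggingReplica(primary)

primary.write("balance", 100)
stale = replica.read("balance")    # replica has not caught up yet
replica.replicate()
fresh = replica.read("balance")    # now matches what was written
```

A system that routes reads to such a replica without any coordination can return the stale value right after a successful write, and that is the kind of guarantee worth checking before choosing a database.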

00:23:00 - Christopher Burns

My next big question: devil's advocate here. Hosted, managed, self-hosted, and self-managed.

00:23:09 - Brecht De Rooms

I think Fauna is a service. The reason we chose completely managed is because people are just not good at keeping a scalable, distributed, multi-region database online. That's the service we're providing to our customers. We guarantee that your service will stay online and that we will scale it for you.

In that sense, it doesn't really make sense to manage such a service yourself. We do provide a Docker image, and you could use that for development purposes, but you shouldn't use that for production because you can't put multiple nodes in there.

A lot of users ask, "Can I run this myself? Because one day maybe Fauna might go away, and then I can continue running it." But I would argue that all the advantages you had with Fauna just go away, and it will be extremely difficult to manage that yourself. I think many people still see the benefit of managing things themselves, but that will go away eventually.

[00:24:19] I don't think these are problems we want to solve over and over again. I've been in this situation where I was helping to keep up different kinds of databases that many clients use. It's something that takes a lot of time and is extremely stressful and difficult because you need deep knowledge of your database.

If you're going to run Cassandra yourself and make sure it always stays up, make sure you understand every detail of how Cassandra works, because at a certain point you're going to have a problem. That is exactly why the people who started Fauna, Evan and Matt, who originally came from Twitter, started it—at least in part because they had personal experience doing that.

They thought at a certain point these things would get better. It would get either easier to manage or someone would manage it for us. They hoped we would get consistent databases that offer relations.

[00:25:22] That also didn't happen. At least the scalable, distributed kind didn't provide these things. So they decided to build their own. They're providing that service to you so you don't have to do it yourself.

00:25:35 - Christopher Burns

Actually, my question was wrong. There's technically three tiers. There's self-hosted, hosted, and hosting as a service. Fauna would fall into hosting as a service.

00:25:48 - Brecht De Rooms

Exactly.

00:25:49 - Christopher Burns

Hosted would be I'm going to run an image on DigitalOcean, and then self-hosted is the Wild West.

00:25:59 - Brecht De Rooms

I would actually say there are four. Self-hosted is like you have your machine in your basement and you host your database on that. I would argue almost nobody still does that. Then you have hosted, which is what you described, but then there's another kind of hosted where, say, DynamoDB will spin up resources specifically for you at the moment you create something.

A few databases do that. There are databases as a service, but they do spin up specific resources and you can see specific things about that. Some databases work more like Heroku would work. You see that there's one instance of something that's automatically spawned for you, and you can say, "I want three instances." Sometimes it translates into read operations where you say, "I want 1,000 read operations capability," or something like that.

[00:26:58] Fauna is the furthest you can go in that direction in the sense that we don't limit you. We don't ask you to specify, "I want this many servers. I want this many read operations." It's just a service and you start working with it. Create a database and it starts scaling for you. We don't even create specific resources behind your back when you sign up for cloud. It's a multi-tenant system from the ground up, which also allows us to use resources more efficiently.

00:27:31 - Christopher Burns

The biggest reason to self-host, host, or go higher is probably money. So where does Fauna come in?

00:27:40 - Brecht De Rooms

How does Fauna what?

00:27:42 - Christopher Burns

Free tier to get started?

00:27:44 - Brecht De Rooms

I didn't understand the question at all. Sorry.

00:27:48 - Anthony Campolo

Because it's not really a question he's asking. Is Fauna expensive or not? Which, of course, you're going to say, "Yeah, Fauna is ridiculously expensive and you should never spend money on it."

00:27:57 - Christopher Burns

I'll rephrase that. Sorry. Self-hosting in your basement is free. You know what could go wrong. But then going up a level, you're paying DigitalOcean to manage it. You're giving your man hours to maintain it, and that costs money.

Where does Fauna sit on the scale for man hours and cost as it scales?

00:28:24 - Brecht De Rooms

Yeah, you're right. It's always a trade-off between investing the man hours, which a startup usually does to save costs on infrastructure, and then as they grow, they start realizing they can't maintain it. Then they move to services, which is more or less where Fauna sits.

It takes away all your operational needs, and that includes replicating your data, which is extremely hard, especially if you want to keep your data consistent. You mentioned Postgres a while back, and that's funny because I just had an article sent to me by someone who uses Postgres, and it lists the ten most common pains with Postgres. One of them is that when you start replicating data, it becomes extremely complex to get it right. It's also no longer consistent. If you want it consistent, there are tricks for that. People go to extreme lengths to run their SQL database distributed across multiple regions and still keep it consistent. Some people invest a lot of work to implement those things themselves.

[00:29:32] All that work is going to take a lot of time. Are you going to get that right? What validation do you have that it's actually working properly, especially at scale? That is the work we take away from you.

It's also work, as I mentioned, that is tested by Jepsen. They test these kinds of systems for a living. Fauna sits on the highest tier you can get as a service, so it takes away a lot of time. In terms of price, I definitely don't think Fauna is one of the most expensive databases. The fact that it's built as a multi-tenant system from the ground up means we can optimize much more than most databases can, because most databases spin up specific resources for specific clients, and that creates a lot of overhead.

It also means that if you start to scale up, maybe those resources aren't scaled up in time.

[00:30:28] If you can amortize that by having many users, you can offer a better price. In terms of price, there's actually a comparison with DynamoDB that was just released. One of our founders wrote it himself and tested out a lot of different approaches. DynamoDB is cheaper when you start, but once your application gets more complex and you need more features, Fauna takes over and becomes cheaper, which is typically what happens in many comparisons we see.

00:30:59 - Christopher Burns

It's really interesting that man hours and time spent managing are really expensive and people don't think about that. You'll talk to your friend, say, "I'm looking at getting a database." They'll go, "Just go with Postgres. Facebook uses it." And you go, "One of the biggest tech companies ever uses it. Problem solved."

But then you have to think, how much money are they spending maintaining it, doing all these replications? It's a real hard one.

00:31:33 - Brecht De Rooms

It's also a funny argument because a company like that doesn't use one database. They typically use many databases. I had a similar story where someone said we should use this technology because a famous company uses it. And then I said, actually, no. They used to use it, and now apparently they're extremely negative about it.

That's the problem of our time. Technology is accelerating so fast that we can't keep up. It's impossible to know everything that's out there. On top of that, you have well-done marketing—and I'm saying well done because it's their job to promote the positive points of a product—and many are doing everything they can to hide the negative points. At Fauna, and that's why I love working there, we don't hide the negative points. We're always pretty honest about the things that aren't working well or what we're not suited for.

If you're missing that information, then the only thing you can do is look at other people who are famous and have done something and see whether they failed or succeeded.

[00:32:43] What you don't know, of course, is whether they have a vested interest in that specific technology. We will always make the wrong choices unless we research things or look at independent people. As I mentioned, Jepsen is a perfect example—look at what they're saying about a technology and read up before you commit.

00:33:04 - Christopher Burns

You may want to commit to using it in a large-scale application. You may want to build your own side project with it. What's really good about Fauna is that it has a free tier. You can get going without paying a penny.

00:33:16 - Brecht De Rooms

How feasible is it to write something that runs on the free tier?

00:33:22 - Christopher Burns

Exactly. Not just feasible. I mean, generous. How generous is the Fauna free tier? Could you build a whole side app with it?

00:33:33 - Brecht De Rooms

I don't think this question is a coincidence because we just recently changed our pricing. You're probably aware of that. We used to be extremely generous, but the free tier is essentially meant for development, not production.

It's priced so you can easily develop, and I've never run over the free tier during my time at Fauna. Developing example applications, I didn't even get close, but it's not meant to run in production. If you run in production, you will easily go over the free tier, and then we'd advise you to take the smallest plan available to get started. If it's purely about development, you can also use a Docker image.

00:34:17 - Christopher Burns

Just to give a comparison: you have three main tiers and custom tiers for enterprise. You have individual, team, and business. Looking at them versus how much I pay for my Postgres database, I sit somewhere between individual and team. Do you feel that it's a gradual process upwards as your application grows?

00:34:40 - Brecht De Rooms

I'm not sure whether I understand what you mean by, "It's a gradual process upwards."

00:34:45 - Christopher Burns

As you go from individual to team to business, for example.

00:34:50 - Brecht De Rooms

In your case, you'll never go to a team license because you're not a team. The main feature that's different there is team management and extra features like third-party authentication. The only reason for you to go up would be, "I want support. I want my support queries prioritized. I want to start a team. Other people are coming in and I'm not comfortable sharing my accounts."

I'd assume you'll stay a solo developer for a while on your personal application, so it doesn't make sense to go up unless you need a feature. You'll just be pay-as-you-go from the moment you go over your budget. So it's a gradual increase—metered usage from the moment you go over the $25 budget.

00:35:44 - Christopher Burns

Price is one of the things you think about. Postgres databases are priced per minute, I think. Is it second or minute?

00:35:52 - Brecht De Rooms

Per minute per machine, I assume, depending on where you host.

00:35:56 - Christopher Burns

Per minute per machine. Somewhere like DigitalOcean, I think I spend about 50 pounds in UK money. The main reason for that is we had to host our data inside the UK. Three tiers on things like Postgres are limited to three regions of DigitalOcean.

00:36:15 - Brecht De Rooms

Yeah, and tiny machines probably.

00:36:17 - Christopher Burns

Yes, I'm sure Anthony has loads of other questions that he wanted to ask.

00:36:23 - Brecht De Rooms

I would say one more thing though: you're sitting at about $50 a month for your Postgres database. The thing is, you probably don't have replication for that. You don't have multi-region. If you put a load on the database that it can't handle, you have a problem. These are the things we provide as a service, so you can't really compare it easily.

If you're going to max out your Postgres database and use it as efficiently as possible, Postgres will be cheaper. But you're not prepared for all these disaster scenarios. That's the trade-off.

00:36:58 - Christopher Burns

This conversation is making me sweat. It's making me think. Have I made a bad choice? Is it too late to change?

00:37:06 - Brecht De Rooms

It depends on whether you're going to hit that moment where suddenly a lot of users start querying, and on what your query patterns look like. It all depends.

00:37:16 - Christopher Burns

My application is a fintech application. I don't know if that's a good thing or bad thing.

00:37:22 - Brecht De Rooms

I'd assume that's the kind of domain where you would value the guarantees Fauna provides very highly. So that's a good idea to look into.

Postgres is a great database, and it provides solid guarantees, but you're responsible for making sure that when things go wrong, you restore the database to the state it was in. It depends on what guarantees you want. If you can accept that nobody can write to your application while your one database machine is down, then I assume you're fine.

00:38:04 - Christopher Burns

My final point goes back to what I said at the beginning: what do I care about and not care about? Where I get scared very fast is when something goes wrong. That's why I say I'm sweating after having this conversation. It's kind of like, why aren't you using Fauna?

00:38:25 - Brecht De Rooms

It's a paradox, actually. You're looking for the best developer experience, but without doing that, you're actually setting yourself up for the worst developer experience after X time. So it's actually funny how we work.

I mentioned that I wrote this series. If you look for "consistent backends and UX" on CSS-Tricks, then you'll find those.

00:38:47 - Anthony Campolo

We'll put all these links in the show notes as well.

00:38:49 - Brecht De Rooms

Awesome, cool.

00:38:50 - Anthony Campolo

I'd like to talk about some of the new things Fauna has been working on. I know some stuff you've recently been rolling out has been around authentication and authorization, and then some stuff around real-time streaming, which is a bit of a weaselly term. So we'll define that more clearly. That's what I'm curious to hear about.

00:39:14 - Brecht De Rooms

Always happy when people try to define fuzzy terms, because we have way too many fuzzy terms in computer science. We've recently delivered a number of features. One you didn't mention is teams, so you can work in a team on your application and share your database with multiple people.

Third-party authentication is basically what an API should do—integrate easily with other APIs. You can use a third-party JWT token, say from Auth0 or Okta, to directly access the database, which is pretty cool because normally you have to take the token in your backend and transform it to a database token. You can now write your security rules in Fauna based on the contents of that token, which eliminates a lot of work.
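To make the idea of security rules "based on the contents of that token" more concrete, here is a minimal TypeScript sketch. It is not Fauna's API: the `TokenClaims` shape, the `roles` claim, the `Invoice` type, and the `canReadInvoice` rule are all hypothetical names chosen for illustration, and the sketch assumes the token's signature has already been verified and its payload decoded.

```typescript
// Hypothetical shapes for illustration only; this is not Fauna's API.
// Assumes the JWT signature was already verified and the payload decoded.
interface TokenClaims {
  iss: string;     // issuer, e.g. the identity provider's URL
  sub: string;     // subject: the user's id at the identity provider
  roles: string[]; // custom claim; the name "roles" is an assumption
}

interface Invoice {
  ownerId: string;
  amount: number;
}

// A rule is a predicate over the token's claims and the document being accessed.
type Rule<T> = (claims: TokenClaims, doc: T) => boolean;

// Admins may read every invoice; everyone else only their own.
const canReadInvoice: Rule<Invoice> = (claims, doc) =>
  claims.roles.some(role => role === "admin") || claims.sub === doc.ownerId;

const claims: TokenClaims = {
  iss: "https://example-idp.invalid/",
  sub: "user-123",
  roles: ["customer"],
};

const ownInvoice: Invoice = { ownerId: "user-123", amount: 40 };
const otherInvoice: Invoice = { ownerId: "user-456", amount: 90 };

const ownAllowed = canReadInvoice(claims, ownInvoice);     // true
const otherAllowed = canReadInvoice(claims, otherInvoice); // false
```

The point of the sketch is that the rule needs nothing but the token's claims and the document, so no backend step has to exchange the token for a separate database credential first.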

The streaming we provided is push-based streaming. From the moment you change something in the database, you get a notification of that change in the thing that is listening, for example your client. It's great for UI redraws and things like that.

What it's not doing in the backend is polling all the time—asking the database, "Was there a change? Was there a change?" Because at a certain point, that becomes very expensive. If you want super low-latency streaming, polling isn't going to work because you're wasting resources all the time. It's really a subscription-based thing.

You subscribe to something. It uses HTTP/2 behind the scenes to stream the data directly to you. Currently, that only works on documents, so you can open multiple streams on multiple documents and get updates and creates. But you can't say, "Give me everything that is added to this collection," which is an extra level of usefulness, and that is coming next.
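The difference between polling and a push-based subscription can be sketched with a small in-memory stand-in for a database. This is only an illustration under stated assumptions: `DocumentStore`, `subscribe`, and the `reads` counter are hypothetical names, nothing here uses HTTP/2 or the Fauna driver, and delivery is synchronous to keep the example deterministic.

```typescript
// In-memory stand-in for a database; not Fauna's streaming API.
type Listener<T> = (doc: T) => void;

class DocumentStore<T> {
  private docs: { [id: string]: T } = {};
  private listeners: { [id: string]: Listener<T>[] } = {};
  reads = 0; // counts how often a client had to ask for a document

  get(id: string): T | undefined {
    this.reads++;
    return this.docs[id];
  }

  // Writing a document pushes the new version to every subscriber of that document.
  set(id: string, doc: T): void {
    this.docs[id] = doc;
    for (const listener of this.listeners[id] || []) {
      listener(doc);
    }
  }

  // Subscribe to one document; the returned function ends the subscription.
  subscribe(id: string, listener: Listener<T>): () => void {
    this.listeners[id] = (this.listeners[id] || []).concat([listener]);
    return () => {
      this.listeners[id] = (this.listeners[id] || []).filter(l => l !== listener);
    };
  }
}

const store = new DocumentStore<{ balance: number }>();
store.set("account-1", { balance: 10 });

// Polling: ten reads, even though nothing changed in between.
for (let i = 0; i < 10; i++) {
  store.get("account-1");
}
const pollingReads = store.reads;

// Push: the client is only contacted when a write actually happens.
const received: number[] = [];
const unsubscribe = store.subscribe("account-1", doc => {
  received.push(doc.balance);
});
store.set("account-1", { balance: 25 });
store.set("account-1", { balance: 30 });
unsubscribe();
store.set("account-1", { balance: 99 }); // no longer delivered
```

In the polling loop every read costs something whether or not the document changed, while the subscriber does no work between writes and receives exactly the two updates made while it was listening.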

[00:41:22] We're going to continue on streaming, meaning we'll offer collection streaming, or streaming the result of an index. You query something, get the initial results, and then get updates of things that arrive later—updates or new data created on the fly by other users—which significantly simplifies developing a UI.

Our main goal for 2021 is to focus on developer experience. Our idea was to provide all the building blocks and power you need as a developer, and developers will compose these things and put them together. Although that works for some developers, others really want higher-level features. Some queries, like "select all" or "select all where," are the kind of queries you do immediately when you pick a database and start testing it.

Those kinds of queries should probably become higher-level features so the language for common patterns becomes easier. You should still get access to the low-level power of our query language, but you also get those higher-level functions.

We also hope to integrate more with GraphQL, make our GraphQL better, and make sure you can say, "This specific part of a GraphQL resolver process, I want to override with an FQL function and completely customize this," and then continue GraphQL on the result set of that.

We also have other things that often come back as feature requests. I'm not certain yet whether we'll focus on those, but they include webhooks, full-text searching, and bringing subscriptions to GraphQL. And of course, we intend to continue adding new regions, focusing on scalability, and improving the overall system—making users more productive and making it less painful to work with scalable data.

00:43:41 - Anthony Campolo

I find the differences between FQL and GraphQL really interesting because they're both generic languages for querying and writing data, but they're so different. It's interesting to see how you can mix and match them.

What you were saying about extending FQL—I feel like the language is already super set up to do that, being really functional and with user-defined functions. It basically sounds like Fauna has to write a couple more really good user-defined functions that everyone can use and that are conventions we can align on.

00:44:23 - Brecht De Rooms

Yeah, indeed. We could just write a few libraries, which some users already did, with higher-level functions. But we want to make sure these are baked into the system so everyone can use them easily and get them by default without installing a library. You're right, it's already set up to do that.

00:44:43 - Christopher Burns

That was literally going to be my last question—this is the selling point for me. How do you talk to Fauna? From what I understand, it's currently a GraphQL API and you don't have a JavaScript library like Prisma does.

00:45:00 - Brecht De Rooms

That's not entirely correct. I would say it's not a GraphQL API—it's a database that happens to have a GraphQL API as well. The primary way to access Fauna is using the Fauna native query language, FQL, the Fauna Query Language.

What's cool about FQL, as Anthony mentioned, is it's super functional, and that allowed us to easily take a GraphQL query and translate it behind the scenes to an FQL query. That's super hard to do in SQL or in another query language. In FQL it was relatively easy, exactly because we're super good at the tree-walking problem.

FQL was first and then we provided a GraphQL interface, which generates a bunch of things for you and automatically translates GraphQL queries to FQL. That means you get the same guarantees for the database, which is pretty cool because most GraphQL providers do completely different things. They do multiple queries to the database where you already lose the consistency advantage.
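As a rough illustration of why a functional language makes this translation a tree-walking problem, the TypeScript sketch below turns a GraphQL-like selection tree into a single nested expression. It is a toy, not Fauna's translator: the `Selection` and `Expr` types and the `translate` and `render` functions are hypothetical, and the rendered `Map(...)` and `Select(...)` text only imitates the look of a functional query.

```typescript
// A GraphQL-like selection: a field name plus nested sub-selections.
interface Selection {
  field: string;
  children: Selection[];
}

// A tiny functional expression language; hypothetical, not real FQL.
type Expr =
  | { kind: "select"; field: string }
  | { kind: "mapOver"; field: string; body: Expr[] };

// Walk the selection tree once. Leaves become plain field reads; branches
// become a map over the related data whose body is built by recursion.
function translate(sel: Selection): Expr {
  if (sel.children.length === 0) {
    return { kind: "select", field: sel.field };
  }
  return { kind: "mapOver", field: sel.field, body: sel.children.map(c => translate(c)) };
}

// Render the expression as text so the nesting is visible.
function render(expr: Expr): string {
  if (expr.kind === "select") {
    return `Select("${expr.field}")`;
  }
  return `Map(${expr.field}, [${expr.body.map(e => render(e)).join(", ")}])`;
}

// Roughly: { user { name posts { title } } }
const graphqlQuery: Selection = {
  field: "user",
  children: [
    { field: "name", children: [] },
    { field: "posts", children: [{ field: "title", children: [] }] },
  ],
};

const rendered = render(translate(graphqlQuery));
// rendered === 'Map(user, [Select("name"), Map(posts, [Select("title")])])'
```

Because the whole selection collapses into one nested expression, it can be sent as one query instead of one query per resolver, which is the property the consistency argument above relies on.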

[00:46:11] But let's not go too much into detail there. The fact that you translate GraphQL to FQL makes things possible. You can override a specific part of that process. Right now you can override a complete GraphQL query with an FQL query and say, "This is not going to be automatically generated. I'm going to write this resolver in FQL." That's already possible.

What I'm imagining is you'd say, "I start writing a GraphQL query, and this part I'm going to custom-write in FQL and insert that." Since it's functional composition, you could insert an FQL query inside that GraphQL query, which would be the perfect hybrid between something super usable that everyone knows, like GraphQL, and an extremely powerful language to extend it however you want.

00:47:10 - Christopher Burns

So the FQL can be called in JavaScript.

00:47:15 - Brecht De Rooms

Yes.

00:47:16 - Christopher Burns

So it's kind of like a Prisma 2 client.

00:47:18 - Brecht De Rooms

I wouldn't say that.

00:47:20 - Anthony Campolo

The way it works, you import it and you can write FQL queries inside your JavaScript code, and the queries are exactly the same. You just stick a q in front of it—that's referencing the Fauna client. No matter what, you're still just writing FQL, even though you have a JavaScript library. You can put it in your JavaScript code, but it's really still just FQL.

00:47:46 - Brecht De Rooms

Every driver is different in the sense that we try to make the languages feel native. We provide some features of the host language to make it feel more natural, but in essence, FQL is a bunch of functions. These functions are exposed by a library—it could be the JavaScript library—and you do function composition to construct your query.

00:48:15 - Christopher Burns

I have so many thoughts to think about. I think I need to seriously go away and think about this.

00:48:24 - Brecht De Rooms

In most JavaScript drivers for a random database, you'd go db.collection.create.something. In Fauna, you specify a collection, then wrap another function around that, then wrap another function around that. It's just a different paradigm in programming.
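The two paradigms can be compared side by side in a minimal TypeScript sketch. This is a hypothetical mini-DSL, not the Fauna driver: `Collection`, `Documents`, and `Paginate` are named in the style of FQL functions, but their signatures, the `Query` type, and the `Chain` class are assumptions made for illustration.

```typescript
// Hypothetical mini-DSL for illustration; not the Fauna driver.
// A query is plain data describing nested function calls.
type Query =
  | { fn: "collection"; name: string }
  | { fn: "documents"; of: Query }
  | { fn: "paginate"; set: Query; size: number };

const Collection = (name: string): Query => ({ fn: "collection", name });
const Documents = (of: Query): Query => ({ fn: "documents", of });
const Paginate = (set: Query, size: number): Query => ({ fn: "paginate", set, size });

// Nested style: specify a collection, then wrap a function around it,
// then wrap another function around that.
const nested = Paginate(Documents(Collection("accounts")), 10);

// Chained style: the same functions behind a method-chaining facade.
class Chain {
  private readonly query: Query;

  constructor(query: Query) {
    this.query = query;
  }

  static collection(name: string): Chain {
    return new Chain(Collection(name));
  }

  documents(): Chain {
    return new Chain(Documents(this.query));
  }

  paginate(size: number): Chain {
    return new Chain(Paginate(this.query, size));
  }

  build(): Query {
    return this.query;
  }
}

const chained = Chain.collection("accounts").documents().paginate(10).build();

// Both styles produce the same query description.
const sameQuery = JSON.stringify(nested) === JSON.stringify(chained);
```

Under these assumptions the chained form is a thin layer over the composed functions, which suggests the two styles differ in how the code reads, not in what gets sent to the database.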

Does that mean we're going to stay that way? That's not certain because many people prefer the first way. We might end up going closer to that because many users are asking for that experience.

00:49:03 - Christopher Burns

One of the things I really like about Prisma 2 is how predictable it is. As soon as you pick your object, you can do find one, find many, delete, and that's never going to change. FQL seems a bit more fluid at this current stage. As someone who hasn't looked at the FQL library yet, I'm going to.

00:49:26 - Brecht De Rooms

From the moment we have a solid Rust driver, it should be trivial to provide a Prisma client. Maybe that will be the day you're convinced to use Fauna as your production database.

00:49:40 - Christopher Burns

I think Prisma is working on data migration as well. If there was a Rust driver and Prisma could do the migration, would it make my job easy?

00:49:52 - Brecht De Rooms

Absolutely.

00:49:54 - Christopher Burns

I need to think about this. I have so many thoughts now that are worrying. They're just very worrying.

00:50:04 - Brecht De Rooms

Isn't it a mix, like worrying and exciting at the same time?

00:50:08 - Anthony Campolo

Do you have anything else you wanted to say before we close out here? Any links you want to give or just kind of final words for the listeners?

00:50:16 - Brecht De Rooms

There's nothing specific I want to say. I want to thank you for inviting me here. This conversation has been super interesting for me as well. Thank you for having me.

00:50:28 - Christopher Burns

Where can our subscribers find you?

00:50:32 - Brecht De Rooms

I'm databrecht on Twitter, so that's the best way to find me. And if you want to contact me via one of the official Fauna channels, there's the Fauna community, where you can find our forums and our Slack channel.

00:50:48 - Christopher Burns

Thank you so much for joining us. Databases can be this really complex black box, so I'm glad this conversation took a different path and brought it back to basics to a certain extent.

00:51:03 - Anthony Campolo

Thanks a lot. Have a good one.