· 3 min read
Ziinc
👋 I'm a dev at Supabase

I work on logging and analytics, and manage Logflare, the underlying service that powers Supabase Logs. The service handles over a billion requests each day, with traffic constantly growing, and these devlog posts talk a bit about my day-to-day open source dev work.

It serves as some insight into what one can expect when working on high performance and high availability software, with real code snippets and PRs to boot. Enjoy! 😊

This week, I'm hunting down a few things that are plaguing Logflare, namely:

  1. High memory usage over time
  2. Sporadic memory spikes and system slowdowns

For the first one, the root cause was quite straightforward: garbage was accumulating in long-lived processes and never getting collected.

There were a few culprits:

  1. RecentLogsServer - This module is tasked with periodically updating a counter of total events ingested into the table. However, because each update changed very little state, very few minor GCs were triggered, and a major GC never ran.
  2. SearchQueryExecutor - This module is tasked with performing search queries as well as live tailing in the Logflare dashboard. Due to the amount of state that was kept and constantly updated, fullsweeps were not getting triggered, resulting in large buildups of garbage over time.

How the Erlang garbage collection works is really beyond the scope of this discussion, but a very detailed explanation is available in the official docs.
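
For long-lived processes like these, the :fullsweep_after spawn option caps how many minor GCs can happen before a full sweep is forced. Here is a minimal sketch of wiring it up (the module name and threshold are made up for illustration):

defmodule MyApp.CounterServer do
  use GenServer

  def start_link(arg) do
    GenServer.start_link(__MODULE__, arg,
      name: __MODULE__,
      # force a major GC after every 100 minor GCs to stop old-heap buildup
      spawn_opt: [fullsweep_after: 100]
    )
  end

  @impl true
  def init(arg), do: {:ok, arg}
end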

For the second issue, where the system would sporadically spike, the culprit was a sudden buildup in the scheduler run queue.

This run queue spike would cause the VM to "lock up", causing a few downstream effects:

  1. GenServer calls would start timing out and failing, as processes lock up and message queues build up, resulting in sudden spikes in memory.
  2. Incoming requests would be served slowly, resulting in a large slowdown and high latency. Incoming request payloads would also consume memory.
  3. Downstream API calls would slow down as well, even non-ingest ones.

However, run queue buildup is just a symptom of the true problem, which required further diagnosis and analysis.

Thankfully, we were able to narrow down the root cause of this run queue spike to the Cachex Courier.

The courier is responsible for handling much of the value retrieval behind the main Cachex.fetch/4 function, and ensures deduplication of value retrieval. However, it was possible for an error in the value retrieval to cause the process to lock up and stop responding to caller processes. This would then result in a flood of GenServer.call/3 failures, as calling processes would time out. Due to the throughput of requests and data that Logflare handles (multiple billions of events a day), sudden large slowdowns in the ingestion pipeline would snowball. This could be felt in more obvious downstream services, such as the Supabase dashboard, where certain heavily used API endpoints would fail sporadically.
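
For context, here is roughly how Cachex.fetch/4 gets used (the cache name and the load_plan_from_db/1 helper are hypothetical): on a cache miss the fallback runs, and the Courier makes concurrent callers for the same key wait on that single execution.

{:ok, _pid} = Cachex.start_link(:plans)

Cachex.fetch(:plans, "user:123", fn key ->
  # hypothetical expensive lookup; the Courier deduplicates, so this runs
  # once even when many processes fetch the same missing key concurrently
  {:commit, load_plan_from_db(key)}
end)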

It just so happened that this exact issue was patched in the latest Cachex v4.0.0 release, so upgrading to the latest version was sufficient.

The fix specifically involved adjusting the way that the value retrieval was performed such that it would spawn a linked process to perform the work instead of doing it within the Courier process, while also ensuring that exits for that process were trapped. By trapping the exit, the Courier could notify all callers that an error had occurred and let the error propagate upwards, instead of blocking each caller until a timeout occurred.
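
A minimal sketch of that general pattern (an illustration, not Cachex's actual code): trap exits, run the risky work in a linked child, and reply to waiting callers as soon as the child dies.

defmodule Courier do
  use GenServer

  @impl true
  def init(state) do
    # receive child exits as {:EXIT, pid, reason} messages instead of crashing
    Process.flag(:trap_exit, true)
    {:ok, state}
  end

  @impl true
  def handle_call({:fetch, fun}, from, state) do
    parent = self()
    # do the risky work in a linked child process
    pid = spawn_link(fn -> send(parent, {:result, self(), fun.()}) end)
    {:noreply, Map.put(state, pid, from)}
  end

  @impl true
  def handle_info({:result, pid, value}, state) do
    {from, state} = Map.pop(state, pid)
    if from, do: GenServer.reply(from, {:ok, value})
    {:noreply, state}
  end

  def handle_info({:EXIT, _pid, :normal}, state), do: {:noreply, state}

  def handle_info({:EXIT, pid, reason}, state) do
    # the child crashed: tell the waiting caller immediately instead of
    # leaving it blocked until its GenServer.call/3 times out
    case Map.pop(state, pid) do
      {nil, state} ->
        {:noreply, state}

      {from, state} ->
        GenServer.reply(from, {:error, reason})
        {:noreply, state}
    end
  end
end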

The final Logflare adjustments can be found in these changes, which resulted in a 3.5x memory reduction and a 5-7% CPU improvement at production workloads.

Impact on memory after tweaking

Impact on scheduler utilization

· 2 min read
Ziinc

Quite surprisingly, Supervisors do not have an exposed option for taking a spawn_opt. spawn_opt is a list of process options used to control process behaviour when it comes to memory management, and it can be incredibly useful when hunting down garbage build-up in processes.

The Backstory

This week in life at Supabase, we have some fun garbage collection optimization, and it mostly involves tweaking culprit processes so that they clear out their garbage in a timely manner.

Sometimes, garbage builds up for a myriad of reasons, and we gotta take our massive major GC hammer to knock some sense into these processes that are stuck in a minor GC loop!

The Problem

So, Supervisors don't actually take a spawn_opt, and after digging around, the only real option was to use the :erlang.process_flag/2 function, which is wrapped by Process.flag/2.

We can achieve the :fullsweep_after tweak like so:


def init(_arg) do
  # trigger a major GC after every 5,000 minor GCs
  Process.flag(:fullsweep_after, 5_000)
  ...
end

One would think that it would be accepted by Supervisor.start_link/2, but it seems like it isn't at all, and I had to dig into the Elixir source code to find that out.

A Word on Task.Supervisor

Although the base Supervisor module doesn't accept the :spawn_opt option in its start_link/2 options, the built-in Task.Supervisor module does accept it.

This can be seen here, where there is an explicit test case for this option passing.
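
So, a hedged sketch of what that looks like (MyApp.TaskSupervisor is a made-up name), assuming the option passes through the way that test case exercises:

{:ok, _pid} =
  Task.Supervisor.start_link(
    name: MyApp.TaskSupervisor,
    # accepted here, unlike Supervisor.start_link/2
    spawn_opt: [fullsweep_after: 20]
  )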

Quite an interesting tidbit 😄

· One min read
Ziinc

This is my handy dandy way to deploy lots of Supabase edge functions and sync my migrations all in one go:

In a makefile at project root:

diff:
	# see the migration protip below!
	supabase db diff -f $(f) -s public,extensions --local

deploy:
	@echo 'Deploying DB migrations now'
	@supabase db push
	@echo 'Deploying functions now'
	@find ./supabase/functions/* -type d ! -name '_*' | xargs -I {} basename {} | xargs -I {} supabase functions deploy {}

.PHONY: diff deploy

Just run make deploy and it will push the database migrations and deploy all edge functions in the supabase/functions folder.

The edge functions deploy will also ignore all folders that start with _, which usually hold shared code modules and are not actual edge functions that you would want to deploy.
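
For illustration, with a functions folder laid out like this (the function names are made up), only hello-world and send-email would get deployed:

supabase/functions/
├── _shared/       # starts with "_": skipped by the find filter
├── hello-world/   # deployed
└── send-email/    # deployed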

Migration Generation ProTip

You can also use the make diff f=my_migration_name target I added above to generate a database migration diff faster than you can say "Yes please!" (Actually the diff-ing is not very fast, so you might finish saying it before it completes. Try saying it letter by letter 😄)
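
In other words (the migration name here is made up), something like this, which should land as a timestamped file under supabase/migrations/:

make diff f=add_email_to_users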

· 2 min read
Ziinc

Wouter is amazingly lightweight, and if it can do so much work in a tiny package, so can you!

Here is my tried and tested way to track routes in a React app using Wouter as the main browser router:

Installation

This guide assumes that you already have a working router set up with wouter already installed. This is for v3 of wouter.

We'll use the react-ga4 package, because popping npm package pills is more fun than hand-rolling it.

npm i react-ga4

In the App.tsx, initialize the React GA4 script:

import ReactGA from "react-ga4";

ReactGA.initialize([
  {
    trackingId: "G-mytrackingnum",
    gaOptions: { anonymizeIp: true },
  },
]);

Create a <TrackedRoute /> Component

import { useEffect } from "react";
import ReactGA from "react-ga4";
import { Route, RouteProps, useLocation, useRoute } from "wouter";

const TrackedRoute = (props: RouteProps) => {
  const [location, _setLocation] = useLocation();
  const [match] = useRoute(props.path as string);

  useEffect(() => {
    if (match) {
      ReactGA.send({
        hitType: "pageview",
        page: props.path,
        title: document.title,
      });
    }
  }, [location]);

  return <Route {...props} />;
};

In this example, we trigger the effect every time the browser location changes. We then check if the route matches, and if it does, we will fire off the ReactGA pageview event.

Add it to the <Router> component

import { Router } from "wouter";

<Router base={import.meta.env.BASE_URL}>
  <TrackedRoute path="/test/:testing">some test page</TrackedRoute>
  <TrackedRoute path="/">some app code</TrackedRoute>
</Router>;

And now, if we navigate to /test/123, we will see that the pageview of /test/:testing will get logged.

Note that this example only tracks the path pattern that is matched, and not the actual location. This is because app routes are not really the same as public content routes, and the actual resource IDs are irrelevant to the web analytics.
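
If you did want the concrete URL (e.g. /test/123) instead of the pattern, a small tweak to the effect in <TrackedRoute /> would do it, since useLocation() already hands us the current path:

useEffect(() => {
  if (match) {
    ReactGA.send({
      hitType: "pageview",
      // send the concrete location (/test/123) instead of props.path (/test/:testing)
      page: location,
      title: document.title,
    });
  }
}, [location]);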

· 5 min read
Ziinc

EmailOctopus is a lovely newsletter site. I do enjoy the fact that their UI is quite well done and user friendly, and on the technical side they have EmailOctopus Connect, which allows for lower email costs. They bill by subscriber count though, so for any seriously large subscriber count it would make more sense to self-host, but I like using them for small projects (like this blog, for example) which will never ever see more than a handful of subscribers.

However, EmailOctopus could definitely up their game when it comes to their developer APIs and scripts. Their embeddable script is an absolute pain to work with when it comes to custom websites, especially if you're using a shadow DOM.

<script
  async
  src="https://eomail1.com/form/983899ac-29fb-11ef-9fcd-4756bf35ba80.js"
  data-form="983899ac-29fb-11ef-9fcd-4756bf35ba80"
  defer
  type="text/javascript"
></script>

Let me break this down for you:

  • It loads a script asynchronously. The script is custom generated for the specific form created, as can be seen from the UUID in the script URL.
  • The script will insert a form as well as load some additional Google reCAPTCHA scripts for spam protection. It will also load some Google Fonts and any assets related to the form.
  • By default, it does not come with the defer and type attributes. These were added in by me, and ensure that the browser executes it as JavaScript, and that execution is deferred until the DOM is fully loaded.
  • It finds a <script> tag with the data-form attribute set to that exact UUID and replaces it with the form. It then creates the required DOM nodes within the HTML page.

However, adding in the script directly to a React component would not work:

// 🚫 This will not work!
const MyComponent = () => (
  <div>
    <script
      async
      src="https://eomail1.com/form/983899ac-29fb-11ef-9fcd-4756bf35ba80.js"
      data-form="983899ac-29fb-11ef-9fcd-4756bf35ba80"
      defer
      type="text/javascript"
    ></script>
  </div>
);

Why wouldn't this work?

  • React works with a virtual DOM, and thus there would not be any script tag available in the HTML at page load. React will mount the component on client load.
  • Even with React server side rendering, the script tag would not be executed, because React protects against raw HTML being set inside components. One would need to use dangerouslySetInnerHTML for this to work.

Thus, we need to adjust our React code in Docusaurus to:

  1. execute the script; and then
  2. create the HTML tags at the <script> tag; but
  3. only do it client side.

Why do we want it to be only client side?

Docusaurus will generate both server and client code during the build step. Although pre-rendering the form would have some benefits, such as less JS running on the initial client load, there is added complexity in trying to wrangle with the Docusaurus SSR build step, so just leaving it client side is fine, especially since there are no SEO benefits to be gained here.

For any other React library, this would likely be irrelevant.

Step 1: Create the Form

Create the form inside EmailOctopus and obtain the embed script.

Example of form creation

Step 2: Add the wrapped component to your layout

Add in the <Newsletter /> tag wherever you want to slot your newsletter form. You can also swizzle one of the layout components, but how to do that is out of scope for this blog post.

import React from "react";
import Newsletter from "@site/src/components/Newsletter";

export default function MyComponent(props) {
  return (
    <>
      <Newsletter />
      ...
    </>
  );
}

Step 3: Install React Helmet

We'll need some way to load the script in the head of the HTML document. We'll reach for React Helmet in this walkthrough guide, so do the current variation du jour of npm install --save react-helmet-async.

Step 4: Add in the Newsletter component

For our component to work successfully, we need to create the file at /src/components/Newsletter.tsx and define the component as such:

// /src/components/Newsletter.tsx
import React from "react";
import { Helmet } from "react-helmet-async";
import BrowserOnly from "@docusaurus/BrowserOnly";

const Newsletter = () => (
  <div
    style={{
      marginLeft: "auto",
      marginRight: "auto",
    }}
  >
    <BrowserOnly>
      {() => (
        <>
          <Helmet>
            <script
              async
              defer
              src="https://eomail1.com/form/983899ac-29fb-11ef-9fcd-4756bf35ba80.js"
              type="text/javascript"
            ></script>
          </Helmet>
          <script
            type="text/javascript"
            data-form="983899ac-29fb-11ef-9fcd-4756bf35ba80"
          ></script>
        </>
      )}
    </BrowserOnly>
  </div>
);

export default Newsletter;

In this component, there are a few things going on:

  1. We set the script inside the <Helmet /> component, meaning that it will be placed in the <head> tag of the HTML document. Two additional attributes are added as well: defer, to load this after the main document loads, and type="text/javascript" for completeness.
  2. We also add in the extra <script> tag in the component, with the data-form attribute to let the script identify it as the parent node to insert the form elements.
  3. We also wrap all of this inside of the <BrowserOnly /> component that comes with Docusaurus, which allows us to only run this code when on the client. As these scripts do not affect SEO, it is not necessary to include it in the server side generation.

Step 5: Verify it all works

Now check that it all works on your localhost as well as on production, then pat yourself on the back!