Designing applications that keep users in their flow
Using Go, gRPC, grpc-web, and Remix to deliver blazing fast application performance.
Applications that load fast feel great.
There is a science behind it.
Google research suggests that web applications with page load times under 100 milliseconds (roughly how long the occipital lobe holds information in sensory memory) can create the illusion of an instantaneous response.
On the other hand, it takes only one second of delay for users to lose their focus!
And there is more research behind it:
“In 2011, CA Technologies commissioned Foviance, a customer experience consultancy, to conduct a series of lab experiments at Glasgow Caledonian University. The participants wore an EEG (electroencephalography) cap to monitor their brainwave activity while they performed routine online transactions. Participants completed tasks using either a 5 MB web connection or a connection that had been artificially slowed down to 2 MB. Brainwave analysis from the experiment revealed that participants had to concentrate up to 50% more when using websites via the slower connection. When asked what they liked most and least about the websites they used during the study, participants frequently cited speed as a top concern.”
This point of view certainly puts performance in a different perspective:
It's not about saving time. It's about ensuring that users can keep their focus.
We apply the same thinking when designing synq. We’re building tools for the workflows of busy people. We want to ensure they see no unnecessary loading screens, and we have made design choices across our entire system to make that happen.
Architecture for performant user experiences
There are multiple layers to our system; for the sake of the end-user experience, I will focus on the life cycle of a UI render, skipping all the data processing that happens behind the scenes.
Go gRPC APIs
Loading an application from scratch involves a variety of data from our APIs. We’ve split our API into gRPC services, implemented in Go, a language proven to deliver high performance without consuming excessive resources.
In our case, each service exposes a set of endpoints that together form a cohesive unit of functionality. For example, our workspace service:
service WorkspacesService {
rpc Read(ReadRequest) returns (ReadResponse) { ... }
rpc List(ListRequest) returns (ListResponse) { ... }
rpc Create(CreateRequest) returns (CreateResponse) { ... }
rpc Update(UpdateRequest) returns (UpdateResponse) { ... }
rpc Delete(DeleteRequest) returns (DeleteResponse) { ... }
}
These gRPC services are grouped into microservices, and most of our microservices host several of them. For example, our accounts microservice hosts WorkspacesService, UsersService, and IntegrationsService.
This separation scales well: we can tune individual microservices in isolation to ensure fast load times.
To squeeze maximum performance from individual endpoints, we further ensure that:
We avoid cascading service calls, where a client calls service A, which calls service B, which calls service C. That would be slow, not to mention that it would tightly couple our systems at runtime. If we need data from A, B, and C, we call all three directly from the client application. It's very composable.
Data loading requests have a small footprint on the service. This one is hard to solve generally, but we often optimize our API queries to have a low impact on the server—avoiding complex database joins or unnecessary runtime calculations. We precompute what we need to serve APIs fast.
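The fan-out pattern from the first point can be sketched as follows. The service clients here are hypothetical stand-ins for generated grpc-web clients; the point is that the client pays one round trip of latency instead of three chained ones:

```typescript
// Sketch of the fan-out pattern (service and client names are hypothetical).
// Instead of service A calling B, which calls C, the client calls all three directly.

type Asset = { id: string; name: string };

// Stand-ins for generated grpc-web client calls.
async function readAsset(id: string): Promise<Asset> {
  return { id, name: "orders" };
}
async function listUpstream(id: string): Promise<string[]> {
  return ["raw_orders"];
}
async function listDownstream(id: string): Promise<string[]> {
  return ["orders_daily"];
}

// Total latency is roughly one round trip, not three chained ones.
export async function loadAssetPage(id: string) {
  const [asset, upstream, downstream] = await Promise.all([
    readAsset(id),
    listUpstream(id),
    listDownstream(id),
  ]);
  return { asset, upstream, downstream };
}
```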
grpc-web
Go gRPC microservices are fast on their own, but we wanted to push the performance to the next level. We use grpc-web to communicate with our front-end clients. Compared with commonly used JSON, it is better at leveraging HTTP/2, has smaller binary payloads, faster serialization, and overall lightning-fast performance.
With buf, we can lint our .proto files and generate TypeScript code. We use ts-proto because it generates idiomatic TypeScript that serializes well.
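A buf.gen.yaml along these lines wires this up. The plugin reference and options below are illustrative, not our exact configuration; check them against the buf and ts-proto versions you use:

```yaml
version: v1
plugins:
  # Community ts-proto plugin hosted on the Buf Schema Registry
  - plugin: buf.build/community/stephenh-ts-proto
    out: src/gen
    opt:
      - esModuleInterop=true
      # Generate client implementations that speak grpc-web
      - outputClientImpl=grpc-web
```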
OK, we’ve got data from the microservices to the front end. What’s next?
Remix + Vercel
Remix is a full-stack web framework that lets you focus on the user interface and work back through web standards to deliver a fast, slick, and resilient user experience. People are gonna love using your stuff.
And it delivers.
Remix is a hybrid server/browser framework. It runs part of the code on the server and renders full HTML before sending data to the browser.
“Most web apps fetch inside of components, creating request waterfalls, slower loads, and jank. Remix loads data in parallel on the server and sends a fully formed HTML document. Way faster, jank free.”
As a result, we can describe user interfaces and data loaders in simple Remix code and elegantly plug in our gRPC + grpc-web infrastructure in Remix data loaders. We deploy our front-end to Vercel to let Remix loaders run close to the user on edge.
The following diagram describes the architecture:
Let’s walk through it end to end once more. The life cycle of a UI screen is the following:
Remix kicks off its loader, which runs on Vercel (on the edge), close to the user. The loader function calls one or more of our APIs in parallel.
export const loader: LoaderFunction = async ({ request, params }) => {
  …
  const [asset, upstream, downstream] = await Promise.all([
    loadAsset(path),
    loadUpstream(path),
    loadDownstream(path),
  ])
  …
}

All API calls go through a lightweight Envoy proxy that multiplexes multiple services under a single API host.
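The Envoy side is mostly stock configuration. A minimal sketch of the relevant HTTP filter chain (routes and upstream clusters omitted, so this fragment is not a complete config) might look like:

```yaml
http_filters:
  # Translates grpc-web requests from the browser into plain gRPC upstream
  - name: envoy.filters.http.grpc_web
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.grpc_web.v3.GrpcWeb
  # CORS so browsers can call the API host directly
  - name: envoy.filters.http.cors
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.cors.v3.Cors
  - name: envoy.filters.http.router
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
```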
Our APIs receive these calls via gRPC and load data from their storage with high-performance Go code.
Once the Remix loader has received data from all APIs (in parallel), it combines them and renders the full HTML, which is shipped to the browser to render.
It is pretty simple; it just took some time to get right initially as we had to find the right set of libraries and patterns to use. From now on, we just reuse them.
The bonus!
Remix comes with an elegant feature: prefetch. When declared, our app will prefetch data for the target screen as soon as the user hovers over a link that leads to it. And because our APIs are fast, it feels like magic. Why? We can often prefetch the entire next page before the user clicks.
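In Remix, this is declared on the link itself; with `prefetch="intent"` the target route's loader fires on hover or focus. A minimal sketch (the route path is illustrative):

```tsx
import { Link } from "@remix-run/react";

// "intent" prefetches the target route's data on hover/focus
export function LineageLink() {
  return (
    <Link to="/workspace/lineage" prefetch="intent">
      Lineage
    </Link>
  );
}
```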
The preliminary results
The initial load of the application takes ~500-600ms on average. Subsequent screens load in ~200-300ms on average, but we will often see “(prefetch cache)” and 1ms in the browser’s network tab, indicating that the new screen loaded instantly. Just click, and it's there.
I do not doubt that we have many complex challenges ahead, but with gRPC, grpc-web, Remix, and prefetch enabled in critical user flows, it's certainly a good start. We can minimize the overhead of getting data in front of the user, and our focus can stay on the performance of the underlying microservices, which ultimately determine how fast our experience will be.
If you’re equally passionate about building delightful user experiences with software, drop me a line at petr@synq.io. We are hiring great engineers.