Inside the Risk Engine's tech stack

A quick overview of our tech stack

Jian Yuan Lee

November 12, 2021

Hello world

The Risk Engine is a new product currently in development. It uses machine learning to generate automated risk assessments on property transactions by analysing all the legal documents a lawyer would normally review.

To support our plans for the product and its continued growth, we made key selections about our technology stack. This article covers some of them.

Infrastructure

Having started out in the cloud-native era, we’re fortunate to have high-quality tools at our disposal that are easy to set up and learn. We use Docker, Kubernetes, and Azure Kubernetes Service (AKS) extensively.

On the load balancing side, we use Nginx managed by the excellent ingress-nginx controller. We also use cert-manager to issue, renew, and rotate TLS certificates for our endpoints automatically, reducing our ongoing maintenance burden.

We use Terraform as our infrastructure-as-code solution for our non-Kubernetes infrastructure components, namely managed PostgreSQL and Redis instances, and even the initial setup of the Kubernetes control plane.

GitOps

As the size of our applications and infrastructure grew, the number of Kubernetes manifests increased. To manage Kubernetes effectively, we employ GitOps practices using Flux CD.

GitOps is a way of using Git as the single source of truth for our infrastructure and application declarations. We use Helm charts, Kustomize configurations, and Flux CD’s variable substitution feature to manage our Kubernetes manifests. This tooling has no doubt increased our adoption of Kubernetes.
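To make the variable substitution idea concrete, here is a minimal Python sketch of what happens conceptually when placeholders in a manifest are filled in at deploy time. It is an illustration only, not Flux CD's implementation: the `substitute` function, the manifest contents, and the variable names are all hypothetical, and the sketch handles only the `${name}` and `${name:=default}` forms.

```python
from __future__ import annotations

import re

# Matches ${name} or ${name:=default}. This is an illustrative subset only.
_VARIABLE = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::=([^}]*))?\}")


def substitute(manifest: str, variables: dict[str, str]) -> str:
    """Replace ${name} placeholders in a manifest with supplied values."""

    def replace(match: re.Match[str]) -> str:
        name, default = match.group(1), match.group(2)
        if name in variables:
            return variables[name]
        if default is not None:
            return default
        # Assumption: this sketch leaves unknown variables untouched,
        # which may differ from how the real tool behaves.
        return match.group(0)

    return _VARIABLE.sub(replace, manifest)


# Hypothetical manifest shared by several environments.
MANIFEST = """\
apiVersion: apps/v1
kind: Deployment
metadata:
  name: risk-engine-api
  namespace: ${namespace}
spec:
  replicas: ${replicas:=2}
"""

rendered = substitute(MANIFEST, {"namespace": "staging"})
print(rendered)
```

The appeal of this approach is that one manifest can serve several environments, with only the variable values differing between them.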

For continuous deployment, we use Flux’s image automation capabilities to update our manifests automatically when there is a new Docker image to deploy.
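As a rough illustration of the idea behind image automation, the Python sketch below picks the newest semantic version from a list of image tags and rewrites the image reference in a manifest. It is a simplified stand-in rather than how Flux works internally: Flux relies on its own image policies and marker comments in the manifests, whereas the regular expression, the registry name, and the function names here are assumptions made for the example.

```python
from __future__ import annotations

import re


def parse_semver(tag: str) -> tuple[int, ...] | None:
    """Return (major, minor, patch) for tags like 1.4.0 or v1.4.0, else None."""
    match = re.fullmatch(r"v?(\d+)\.(\d+)\.(\d+)", tag)
    return tuple(int(part) for part in match.groups()) if match else None


def latest_tag(tags: list[str]) -> str | None:
    """Pick the tag with the highest semantic version, ignoring other tags."""
    candidates = [(parse_semver(tag), tag) for tag in tags]
    versioned = [(version, tag) for version, tag in candidates if version is not None]
    if not versioned:
        return None
    return max(versioned, key=lambda item: item[0])[1]


def update_image(manifest: str, repository: str, new_tag: str) -> str:
    """Rewrite the tag of every 'image: <repository>:<tag>' line."""
    pattern = re.compile(r"(image:\s*" + re.escape(repository) + r"):\S+")
    return pattern.sub(lambda m: f"{m.group(1)}:{new_tag}", manifest)


# Hypothetical repository, tags, and manifest fragment.
REPOSITORY = "registry.example.com/risk-engine-api"
available_tags = ["1.4.0", "1.10.2", "1.9.7", "latest"]
manifest = """\
containers:
  - name: api
    image: registry.example.com/risk-engine-api:1.4.0
"""

newest = latest_tag(available_tags)
updated = update_image(manifest, REPOSITORY, newest)
print(updated)
```

Note that versions are compared numerically, so 1.10.2 ranks above 1.9.7, which a plain string comparison would get wrong.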

Backend stack

Our primary language is Python, used by both the data science and engineering teams. We find the syntax easy to read and the package ecosystem extensive, which makes knowledge transfer easier. We use automated tools like pre-commit, Make, mypy, and automatic code formatters to reduce the cognitive overhead of our work. We also run a small amount of Node.js via Next.js’s server-side rendering capabilities.
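To give a flavour of what a static type checker like mypy buys us, here is a small hypothetical example. The `RiskAssessment` class and `highest_risk` function are invented for illustration and are not taken from our codebase.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RiskAssessment:
    """Hypothetical result of assessing one document."""

    document_id: str
    score: float
    summary: Optional[str] = None


def highest_risk(assessments: List[RiskAssessment]) -> Optional[RiskAssessment]:
    """Return the assessment with the highest score, or None if there are none."""
    if not assessments:
        return None
    return max(assessments, key=lambda assessment: assessment.score)


results = [
    RiskAssessment(document_id="lease-001", score=0.2),
    RiskAssessment(document_id="title-042", score=0.8, summary="Restrictive covenant"),
]

top = highest_risk(results)
# The return type is Optional, so a type checker expects a None check
# before any attribute access such as top.score.
if top is not None:
    print(top.document_id, top.score)

# A type checker would reject a call like the one below before the code runs,
# because a str is not a List[RiskAssessment]:
# highest_risk("lease-001")
```

The annotations cost little to write, and they let the checker flag mistakes such as a missing `None` check at commit time rather than in production.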

Our database of choice is PostgreSQL. Not only is it a reliable, robust, and resilient object-relational database, but it also has excellent NoSQL-style capabilities, such as key-value storage (hstore) and JSON documents (JSON and JSONB).

Frontend stack

We use TypeScript. Thanks to its static typing, our code is easier to read and understand, and the compiler catches mistakes directly in our code editor. We find this to be a huge productivity boost, especially in large codebases.

We use the Next.js React framework for its “batteries-included” approach, which lets us focus our efforts on building what matters to us. We also take advantage of its server-side rendering capabilities to speed up some of our computationally heavy pages.

Data stack

Matt Westcott, our Lead Data Scientist, recently published a great blog post on Full-Stack Data Science at Orbital Witness, in which he explains some of the tools we use in detail.

What’s next?

Although we have basic health checks, such as checking whether the database is online and whether the web server is up, they do not cover what a real user might experience. We are exploring progressive delivery techniques (using Flagger), such as rolling out changes by gradually shifting traffic to a new version based on metrics like the HTTP request success rate.
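The gradual traffic shifting described above can be sketched as a simple decision loop. The Python below is a toy model of the general technique, not Flagger's implementation or configuration: the `CanaryConfig` fields are loosely inspired by the kind of settings such tools expose, but the names, defaults, and the `run_canary` function are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass
class CanaryConfig:
    """Hypothetical rollout settings for this sketch."""

    step_weight: int = 10            # traffic percentage added after each healthy check
    max_weight: int = 50             # canary traffic share at which we promote
    min_success_rate: float = 99.0   # minimum HTTP request success rate, in percent
    failure_threshold: int = 3       # failed checks tolerated before rolling back


def run_canary(
    success_rates: Iterable[float],
    config: Optional[CanaryConfig] = None,
) -> Tuple[str, int]:
    """Walk through one metric reading per check and return (status, traffic weight)."""
    config = config or CanaryConfig()
    weight = 0
    failures = 0
    for rate in success_rates:
        if rate < config.min_success_rate:
            failures += 1
            if failures >= config.failure_threshold:
                # Too many unhealthy checks: send all traffic back to the old version.
                return "rolled_back", 0
            continue  # hold the current weight and check again
        weight += config.step_weight
        if weight >= config.max_weight:
            # The canary stayed healthy up to the maximum weight: promote it.
            return "promoted", 100
    return "in_progress", weight


healthy = run_canary([99.5, 99.8, 99.9, 99.7, 99.6])
unhealthy = run_canary([99.5, 90.0, 91.0, 89.0])
partial = run_canary([99.5, 99.9])
print(healthy, unhealthy, partial)
```

The point of the loop is that a bad release only ever reaches a small share of users before it is rolled back, which is the gap our basic health checks leave open.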

We are also exploring serverless technologies (using Knative), especially for sporadic workloads like machine learning training jobs and document processing pipelines.

Stay tuned for future posts where we will dive into each area above in detail!

Like the sound of our tech stack? We are hiring.

Jian Yuan Lee

Tech Lead - Risk Engine