Testing FastHTML Dashboards
Building dashboards to visualize data or the results of experiments is the bread and butter of data people (read: data scientist, engineers, analysts, etc.). Often, these dashboards are hacked together in record time to meet a presentation deadline. Now imagine this: you built a dashboard for showcasing your latest model to your team. Instead of your go-to tool, Streamlit, you decided to try out FastHTML, a shiny new framework that promises better handling and scalability if your dashboard ever needs to go bigger. Your team lead is so impressed with your model that they want to show it to the whole company. That is your chance to shine! With FastHTML, you don’t have to worry about scaling to a bigger audience. But wait: are you sure your dashboard is really working as expected? How can you be certain nothing fails if the CEO happens to use it? Normally, you would go for automated testing, but after scouring the FastHTML documentation on how to do it, you found nothing. ...
Compose Datasets, Don't Inherit Them
In relatively young disciplines, like deep learning, people tend to leave behind old principles. Sometimes this is a good thing because times have changed and old truths, i.e. over-completeness being a bad thing, have to go. Other times, such old principles stick around for a reason and still people over-eagerly try to throw them out of the window. I am no exception in this regard so let me tell you how I “re-learned” the tried and true design pattern of “Composition over Inheritance”. ...
About Copying Blindly - Insights of the One-Eyed Person
Today I have no fancy project and no shiny GitHub repository to show because today I want to talk about my research. As some may read in my bio, I am doing my PhD in the field of predictive maintenance (PDM). This field is directly adjacent to machine learning and fell, like many others, into the grasp of the deep learning hype. Unfortunately, long series of sensor readings are not as interesting to look at as images or intuitively understood as natural language, so PDM is not as present in the mind of the general ML crowd. Maybe this post can shed a little light on this corner of the research world. ...
The Great Autoencoder Bake Off
“Another article comparing types of autoencoders?”, you may think. “There are already so many of them!”, you may think. “How does he know what I am thinking?!”, you may think. While the first two statements are certainly appropriate reactions - and the third a bit paranoid - let me explain my reasons for this article. There are indeed articles comparing some autoencoders to each other (e.g. [1], [2], [3]), I found them lacking something. Most only compare a hand full of types and/or only scratch the surface of what autoencoders can do. Often you see only reconstructed samples, generated samples, or latent space visualization but nothing about downstream tasks. I wanted to know if a stacked autoencoder is better than a sparse one for anomaly detection or if a variational autoencoder learns better features for classification than a vector-quantized one. Inspired by this repository I found, I took it into my own hands, and thus this blog post came into existence. ...
Make DL4J Readable Again
A while ago, I stumbled upon an article about the language Kotlin and how to use it for Data Science. I found it interesting, as some of Python’s quirks were starting to bother me and I wanted to try something new. A day later, I had completed the Kotlin tutorials using Kotlin Koans in IntelliJ IDEA (which is an excellent way to get started with Kotlin). Hungry to test out my new language skills, I looked around for a project idea. As I am a deep learning engineer, naturally I had a look at what DL frameworks Kotlin had to offer and arrived at DL4J. This is actually a Java framework, but as Kotlin is interoperable with Java, it can be used anyway. I had a look at some examples of how to build a network and found this (Source): ...
How to Trust Your Deep Learning Code
Deep learning is a discipline where the correctness of code is hard to assess. Random initialization, huge datasets and limited interpretability of weights mean that finding the exact issue of why your model is not training, is trial-and-error most times. In classical software development, automated unit tests are the bread and butter for determining if your code does what it is supposed to do. It helps the developer to trust their code and be confident when introducing changes. A breaking change would be detected by the unit tests. ...