Fifth week at Together
My mission is to bring brands together with the power of data, and here I am doing plumbing. That was the plan. A necessity, really.
It took me about three weeks of exploring brand data, then applying standard techniques, to get a first experimental model: a handful of lines of code that predict which pairs of brands are most likely to form a great partnership. It's far from perfect, of course, but it gives promising results, and it runs smoothly on my machine.
Sure, but... our clients don't have access to my machine. The raw results report is barely readable. I manually configured three quarters of my development environment. What I have isn't a product, not even a prototype. A proof of concept, maybe? To turn my method into an actual service, I need to integrate it into a proper infrastructure. It was time to put my Data Engineer hat on and start discussing specifications with Jonas, my CTO.
It turns out there are plenty of interesting subproblems to solve! First, putting my algorithm behind an API, an interface that the rest of our app can query at any time. What kind of machine, in the cloud, does this run on? Second, managing data updates. New brands appear frequently and their public image evolves over time, so I need to figure out how to rerun my process automatically, on a schedule. Third, ensuring a reasonable response time. My methods are simple and quick enough so far, but that won't last; I need to precompute as many results as I can, so that partnerships are suggested with minimal wait.
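To make the third point concrete, here is a minimal sketch of the precomputation idea, not my actual model. Everything in it is an assumption made for illustration: the brand names, the audience tags, the `score_pair` function (a plain Jaccard overlap standing in for the real scoring), and the helper names `precompute_top_partners` and `suggest`. The point is only the shape: score every pair once, offline, and keep a small lookup table that an API could answer from instantly.

```python
from itertools import combinations
import heapq

# Hypothetical sample data: each brand is described by a set of audience tags.
BRANDS = {
    "acme": {"outdoor", "sport", "youth"},
    "bolt": {"sport", "youth", "music"},
    "cora": {"luxury", "fashion"},
    "dune": {"outdoor", "sport"},
}


def score_pair(tags_a, tags_b):
    """Placeholder score: Jaccard overlap of two tag sets (stand-in for a real model)."""
    union = tags_a | tags_b
    if not union:
        return 0.0
    return len(tags_a & tags_b) / len(union)


def precompute_top_partners(brands, k=2):
    """Score every pair once and keep the k best partners for each brand."""
    scored = {name: [] for name in brands}
    for a, b in combinations(sorted(brands), 2):
        s = score_pair(brands[a], brands[b])
        scored[a].append((s, b))
        scored[b].append((s, a))
    return {
        name: [partner for _, partner in heapq.nlargest(k, pairs, key=lambda p: p[0])]
        for name, pairs in scored.items()
    }


def suggest(table, brand):
    """What an API endpoint would do at query time: a dictionary lookup."""
    return table.get(brand, [])


# Run offline (for example by a scheduled job), then serve `suggest` from the table.
TABLE = precompute_top_partners(BRANDS, k=2)
```

With this layout, the expensive part runs whenever the data is refreshed, and the query path does no scoring at all. Scoring all pairs grows quadratically with the number of brands, so a real version would likely need to restrict the candidate pairs first.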
All of this looks like plumbing: connecting pipelines so that data arrives at the right place, at the right time, in the right shape. Making databases and apps talk to each other properly, in sync. Setting everything up is rather tedious, but this infrastructure lays the groundwork for future improvements. I will be able to make that central algorithm better, while trusting its surrounding environment. I will be able to swap that specific component in place without having to rebuild everything else. I will save precious time, and spare the entire tech team some headaches.
Zooming out a little, what I end up with is a legible system. I can easily draw a diagram outlining all of my operations. I can set up usage and performance metrics that actually point to something. Those metrics will be my next step.