I don't know exactly yet, but I have some ideas
"Answering this question is the mission I've been given at Together. This is my third week as I write these words, so my direct answer would be: I don't know exactly yet, but I have some ideas.
If matching brands were a common sight, a staple of data marketing playbooks, I wouldn't need fresh ideas: I'd just apply the most robust methods I could find, like the engineer I am. That's not the case, I am told. Barely anyone does this, not at scale, not in a legible way. So I have to experiment. Let me show you how it goes.
It all starts with understanding the problem, how it's solved by humans. What do people look for in a partnership? Common values, a not-too-big size difference, and of course you won't strike deals with your direct competitors. That's not the whole picture, but it's enough to get me going. Where's the data for that? Can I infer good partnerships from signals I can reach?
Wait a minute... "good"? I'm trying to figure out the value of something. This is similar to those "good" recommendations apps give you: they're not good in any absolute sense, the app is just guessing that you'll like them better than anything else available. So a partnership doesn't have to be judged in absolute terms either, it can be compared with the alternatives. There's a whole toolset for that: ranking methods. I make a note to read up on this.
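To make the ranking framing concrete, here is a minimal sketch, not the method used at Together. It turns the three criteria mentioned above (shared values, a moderate size gap, no direct competitors) into a relative score and sorts candidates by it. The `Brand` fields, the use of `sector` as a stand-in for "direct competitor", and the scoring formula are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Brand:
    """Hypothetical brand record; every field is an illustrative assumption."""
    name: str
    audience: int       # audience size, must be > 0
    values: frozenset   # value tags, e.g. {"eco", "local"}
    sector: str         # crude proxy for "direct competitor"


def match_score(a: Brand, b: Brand) -> float:
    """Relative score in [0, 1]; only meaningful for comparing candidates."""
    if a.sector == b.sector:
        return 0.0  # assumption: same sector means direct competitors
    union = a.values | b.values
    shared = len(a.values & b.values) / len(union) if union else 0.0
    # Size gap measured in orders of magnitude, so 1k vs 10k counts like 1M vs 10M.
    gap = abs(math.log10(a.audience) - math.log10(b.audience))
    size_fit = 1.0 / (1.0 + gap)
    return shared * size_fit


def rank_partners(brand: Brand, candidates: list) -> list:
    """Return the other brands, best match first."""
    others = [c for c in candidates if c.name != brand.name]
    return sorted(others, key=lambda c: match_score(brand, c), reverse=True)
```

The score has no meaning on its own; like an app recommendation, it only says which candidate is likely to be a better fit than the others.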
Back to data sources
We can indeed fetch a lot of information about brands from the Internet. Some of it is scraped from their public-facing websites. Some of it comes pre-packaged from third-party APIs. I can work with that! Aggregating sources, I start making hypotheses about the data. I come in with assumptions that I relentlessly test.
For instance, the size of a brand's audience. I vaguely remember from my university courses, and from Barabási's Network Science, that audience sizes are supposed to follow a power-law distribution (like the size of cities: many small ones and a few big ones, at any level of zoom). But that's what I've been taught. What if I remember wrong? What if brands are an exception to the rule? I test it, and it turns out that they do follow a power-law distribution, as expected.
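The text does not say how the power-law check was done. As one possible sketch, the exponent of a continuous power law can be estimated by maximum likelihood from all values above a chosen cutoff, using alpha = 1 + n / sum(ln(x / x_min)). The function names, the cutoff `x_min`, and the synthetic data below are assumptions for illustration; no real audience figures are used.

```python
import math
import random


def power_law_alpha(sizes, x_min):
    """Maximum-likelihood exponent of a continuous power law for x >= x_min."""
    tail = [x for x in sizes if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    if len(tail) < 2 or log_sum == 0.0:
        raise ValueError("not enough spread above x_min to estimate alpha")
    return 1.0 + len(tail) / log_sum


def sample_power_law(n, alpha, x_min, seed=0):
    """Synthetic power-law samples via inverse-transform sampling."""
    rng = random.Random(seed)
    # 1 - rng.random() lies in (0, 1], which avoids raising zero to a negative power.
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


# Sanity check on synthetic data: the estimate should land near the true exponent.
synthetic = sample_power_law(20000, alpha=2.5, x_min=100.0)
estimate = power_law_alpha(synthetic, x_min=100.0)
```

An exponent estimate alone does not prove the data follow a power law. A fuller check would also compare the fit against alternatives such as a log-normal distribution.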
Other times my assumptions are wrong, and I learn something about the problem. Looking at the language brands use in their communication, some of it does reflect their values... but much of it is generic marketing speak, which is not useful for my analysis, so I must find another way to separate the signal from the noise...
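The approach eventually chosen is not described here. One common way to down-weight wording that every brand uses is inverse document frequency: a word that appears in all brand texts gets a weight of zero, while a word specific to one brand stands out. The sketch below is an assumption, not the author's method, and the function name and sample texts are made up for illustration.

```python
import math
from collections import Counter


def distinctive_terms(texts: dict, top_k: int = 3) -> dict:
    """For each brand, return the words weighted highest by TF-IDF.

    `texts` maps a brand name to a string of its communication.
    Words used by every brand score zero and are dropped.
    """
    tokens = {name: text.lower().split() for name, text in texts.items()}
    n_docs = len(tokens)
    # Number of brands whose text contains each word at least once.
    doc_freq = Counter(word for words in tokens.values() for word in set(words))

    result = {}
    for name, words in tokens.items():
        counts = Counter(words)
        scores = {w: c * math.log(n_docs / doc_freq[w]) for w, c in counts.items()}
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        result[name] = [w for w in ranked if scores[w] > 0][:top_k]
    return result


# Made-up sample texts: the shared marketing phrasing should disappear.
sample = {
    "brand_a": "innovative quality solutions for sustainable farming",
    "brand_b": "innovative quality solutions for urban cycling",
    "brand_c": "innovative quality solutions for handmade ceramics",
}
terms = distinctive_terms(sample)
```

With these sample texts, "innovative", "quality" and "solutions" carry no weight because all three brands use them, which is the behaviour wanted from a filter for generic marketing speak.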
Those are the first steps. I iterate, trying to reshape the information into visible patterns. I ask for feedback from my colleagues, who actually know more about marketing than I do and help me make sense of the strange results I get. In the end, I have a very rough idea of what makes a good match.
Not perfect. Good enough. I swap my data scientist hat for my engineer hat, because those numbers are pretty but they need to reach our customers somehow. I'll have plenty of time to make it better, try more ideas.
Maybe I'll never actually know how to match brands perfectly. But I can get close enough."
Read more about Jérémy's published research and thesis.