Hi guys,
So I learned about this challenge and AI Crowd in general only yesterday. But in a way I have been “virtually” participating in it already, because Luca Turin’s YouTube video series about the vibrational theory of smell gave me the idea of looking at functional groups from the SMILES representation to predict organoleptic tags. I kid you not, I started a thread on Basenotes not knowing that the Learning to Smell challenge was simultaneously taking place on AI Crowd. It’s so weird and a little frustrating for me that I could not participate in the first round. But here I am now to make the most of it.
My interests for this challenge are:
- augmentation of the SMILES representation with functional groups, GPCR data (if possible), stereochemistry and chirality data (when applicable), predicted or empirical NMR spectra from molecular structure, etc.
- organoleptic ontologies, taxonomies, tags and, more generally, NLP on textual descriptions of aromachemicals (ACs) and formulae. I think I would also like to exploit the fact that French is my native language to delve into the abundance of “legacy” perfumery texts in French.
So many nice ETL things to try upstream of the ML itself here!
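To give an idea of the kind of upstream feature extraction I mean, here is a minimal sketch that reads a few flags straight off a SMILES string: chirality markers (`@`), double-bond geometry markers (`/` and `\`), and aromatic atoms (lowercase symbols). The function name `smiles_flags` and the output keys are made up for illustration, and this is plain string tokenizing under simplifying assumptions, not real substructure matching; actual functional-group detection would need a cheminformatics toolkit.

```python
import re

# Bracket atoms first, then two-letter halogens, then single-letter atoms,
# then any other character (bonds, ring digits, parentheses).
TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|[BCNOPSFI]|[bcnops]|.")
AROMATIC = set("bcnops")


def smiles_flags(smiles):
    """Return a few string-level flags for one SMILES (hypothetical helper)."""
    tokens = TOKEN.findall(smiles)
    brackets = [t for t in tokens if t.startswith("[")]

    # Aromatic atoms written bare (c, n, o, ...) or in brackets such as [nH].
    # Assumption: isotope-labelled bracket atoms like [13cH] are not counted.
    aromatic = sum(1 for t in tokens if t in AROMATIC)
    aromatic += sum(1 for t in brackets if t[1] in AROMATIC)

    return {
        "aromatic_atoms": aromatic,
        # Tetrahedral chirality is written as @ or @@ inside a bracket atom.
        "stereocenters": sum(1 for t in brackets if "@" in t),
        # Cis/trans geometry is written with / and \ around a double bond.
        "double_bond_stereo": "/" in smiles or "\\" in smiles,
    }


vanillin = smiles_flags("COc1cc(C=O)ccc1O")
limonene_enantiomer = smiles_flags("CC1=CC[C@@H](CC1)C(=C)C")
print(vanillin)
print(limonene_enantiomer)
```

Flags like these only expose what the SMILES already encodes; the GPCR and NMR data would have to be joined in from other sources.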
But outside of this challenge, my main focus point right now is the construction of machine-learned olfactive metric spaces on which are defined:
- a “distance” between aromachemical “points” in the space
- an additive operator where any weighted sum of points in the space (the formula) is also a point in that same space.
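The two properties above can be sketched in a few lines, assuming the simplest possible case: each aromachemical is a plain vector in a Euclidean space. That assumption is mine for illustration only; the real topology is exactly what is still open. The names `distance` and `blend` and the coordinates are hypothetical placeholders, not learned embeddings or real data.

```python
import math


def distance(a, b):
    """Euclidean distance between two points of the (assumed) space."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def blend(points, weights):
    """Weighted sum of points: the formula, itself a point in the same space."""
    dim = len(points[0])
    return tuple(
        sum(w * p[i] for p, w in zip(points, weights)) for i in range(dim)
    )


# Placeholder coordinates for two imaginary aromachemicals.
material_a = (0.0, 0.0)
material_b = (2.0, 4.0)

formula = blend([material_a, material_b], [0.5, 0.5])
print(formula)                        # the blend is a point like any other
print(distance(formula, material_a))  # so it can be compared to its ingredients
```

With weights that are non-negative and sum to 1 (proportions of a formula), the blend stays between its ingredients. Whether real odour perception behaves this linearly is, of course, part of the question.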
I’m currently having fun speculating on those spaces’ topology/dimensionality with many people all over the Web. The purpose of all of this is, of course, automatic perfumery formula composition.
I also lead a perfumery data preservation project. Two weeks ago (again, before I even knew of Learning to Smell), I actually spoke on the phone with John Leffingwell (the source of the data for this challenge) about the SMILES representation of ACs in his database and the preservation of the data itself. I have also been scraping The Good Scents Company data from the web for a couple of years, and I am very familiar with perfumery data, both physical/chemical and organoleptic.
Enough about me already! Does anyone want to team up? (I’d rather not do this alone.) I think it’s important to mention that my intention is for everything that we do here to be open source from the get-go. The best way to reach me is through the AI Crowd Discord or via Chacha Sikes’ Aroma Discord, where I first learned about this challenge and where there is the coolest crowd of scent and flavoring technologists around, I think. I’m Contrebande#2840 on Discord.