Dividing and Conquering Cross-Modal Recipe Retrieval: from Nearest Neighbours Baselines to SoTA
Mikhail Fain, Andrey Ponikar, Ryan Fox and Danushka Bollegala
arXiv preprint arXiv:1911.12763
We learn how to cook by watching and following more experienced cooks. The final appearance of the meal affects our enjoyment of it.
At Cookpad, we use deep convolutional neural networks to classify ingredients, recipe steps and finished meals, and to suggest tasty-looking food photos to our users. We use image embeddings to match food photos with recipes, and action recognition in videos to understand what the cook is preparing. The multitude of ingredients and the complexity of cuisines and cooking styles are enough to deep-fry any GPU!
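As a rough illustration of matching a food photo to recipes through embeddings, the sketch below ranks recipes by cosine similarity to a photo embedding, in the spirit of a nearest-neighbour baseline. It is a minimal sketch and not the method from the paper or Cookpad's production system: the function names `cosine_similarity` and `nearest_recipes`, the recipe names, and the three-dimensional vectors are all made up for illustration, and real embeddings would come from trained image and text encoders.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_recipes(photo_embedding, recipe_embeddings, k=1):
    """Return the ids of the k recipes most similar to the photo embedding."""
    ranked = sorted(
        recipe_embeddings,
        key=lambda rid: cosine_similarity(photo_embedding, recipe_embeddings[rid]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical toy embeddings (assumed values, not real model outputs).
recipe_embeddings = {
    "pancakes": [0.9, 0.1, 0.0],
    "ramen": [0.0, 0.8, 0.6],
    "salad": [0.1, 0.2, 0.9],
}
photo_embedding = [0.8, 0.2, 0.1]

top = nearest_recipes(photo_embedding, recipe_embeddings, k=2)
print(top)
```

With these toy vectors the photo embedding points mostly along the same direction as "pancakes", so that recipe ranks first. An exhaustive scan like this is only practical for small collections; a large recipe index would normally use an approximate nearest-neighbour structure instead.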