Computer Vision

We learn how to cook by watching and following more experienced cooks, and the final appearance of a meal shapes how much we enjoy it.

At Cookpad, we use Deep Convolutional Neural Networks to classify ingredients, recipe steps, and finished meals. We suggest tasty-looking food photos to our users. We use image embeddings to match food photos with recipes, and action recognition in videos to understand what the cook is preparing. The multitude of ingredients and the complexity of cuisines and cooking styles are enough to deep-fry any GPU!
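
To give a flavour of the embedding-based matching described above, here is a minimal sketch of retrieving recipes for a food photo by cosine similarity in a shared embedding space. It assumes a pretrained ResNet-50 backbone from torchvision as the image encoder; the recipe-side embeddings and the example file name are hypothetical placeholders, not Cookpad's production models.

```python
# Sketch: match a food photo to recipes via image embeddings (illustrative only).
import torch
import torch.nn.functional as F
from torchvision import models, transforms
from PIL import Image

# Image encoder: pretrained ResNet-50 with its classification head removed.
backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()
backbone.eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

def embed_image(path: str) -> torch.Tensor:
    """Encode a food photo into an L2-normalised 2048-d feature vector."""
    image = Image.open(path).convert("RGB")
    with torch.no_grad():
        features = backbone(preprocess(image).unsqueeze(0))
    return F.normalize(features, dim=-1)

# Hypothetical recipe embeddings (e.g. produced by a text encoder trained
# jointly with the image encoder), one row per recipe, L2-normalised.
recipe_embeddings = F.normalize(torch.randn(10_000, 2048), dim=-1)

# Retrieval: cosine similarity between the photo and every recipe.
photo = embed_image("omurice.jpg")        # hypothetical example image
scores = photo @ recipe_embeddings.T
top_recipes = scores.topk(5).indices      # indices of the 5 best-matching recipes
```

In practice the image and recipe encoders would be trained jointly on paired photo–recipe data, as explored in the cross-modal retrieval work listed below.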

Publications

Dividing and Conquering Cross-Modal Recipe Retrieval: from Nearest Neighbours Baselines to SoTA
Mikhail Fain, Andrey Ponikar, Ryan Fox, and Danushka Bollegala
arXiv preprint arXiv:1911.12763

SRGAN for Super-Resolving Low-Resolution Food Images
Yudai Nagano and Yohei Kikuta
Proceedings of the 10th Workshop on Multimedia for Cooking and Eating Activities (CEA 2018)

Learning Food Appearance by a Supervision with Recipe Text
Atsushi Hashimoto, Takumi Fujino, Jun Harashima, Masaaki Iiyama, and Michihiko Minoh
Proceedings of the 9th Workshop on Multimedia for Cooking and Eating Activities (CEA 2017)

Cookpad Image Dataset: An Image Collection as Infrastructure for Food Research
Jun Harashima, Yuichiro Someya, and Yohei Kikuta
Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2017)

Approaches to Food/Non-food Image Classification Using Deep Learning in Cookpad
Yohei Kikuta, Yuichiro Someya, and Leszek Rybicki
Proceedings of the 9th Workshop on Multimedia for Cooking and Eating Activities (CEA 2017)