Computer Vision

We learn how to cook by watching and following more experienced cooks. The final appearance of the meal affects our enjoyment of it.

At Cookpad, we use deep convolutional neural networks to classify ingredients, recipe steps and finished meals, and to suggest tasty-looking food photos to our users. We use image embeddings to match food photos with recipes, and action recognition in videos to understand what the cook is preparing. The multitude of ingredients and the complexity of cuisines and cooking styles are enough to deep-fry any GPU!
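To make the idea of matching food photos with recipes through embeddings more concrete, here is a minimal sketch of nearest-neighbour retrieval by cosine similarity. It is an illustration only, not Cookpad's actual pipeline: the function names, the recipe identifiers and the three-dimensional vectors are all hypothetical, and in practice the embeddings would come from trained image and text encoders.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_recipes(photo_embedding, recipe_embeddings):
    """Return (recipe_id, score) pairs sorted from most to least similar."""
    scored = [
        (recipe_id, cosine_similarity(photo_embedding, embedding))
        for recipe_id, embedding in recipe_embeddings.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy embeddings; real ones would be produced by learned encoders
# and would have hundreds of dimensions.
recipe_embeddings = {
    "tomato_pasta": [0.9, 0.1, 0.0],
    "miso_soup": [0.1, 0.9, 0.2],
    "green_salad": [0.0, 0.2, 0.9],
}
photo_embedding = [0.8, 0.2, 0.1]

ranking = rank_recipes(photo_embedding, recipe_embeddings)
for recipe_id, score in ranking:
    print(f"{recipe_id}: {score:.3f}")
```

With these toy vectors the photo lands closest to "tomato_pasta". A brute-force scan like this is fine for a handful of recipes; a large collection would usually call for an approximate nearest-neighbour index instead.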

Our Activity

Publications

Dividing and Conquering Cross-Modal Recipe Retrieval: from Nearest Neighbours Baselines to SoTA

Mikhail Fain, Andrey Ponikar, Ryan Fox and Danushka Bollegala

arXiv preprint arXiv:1911.12763

Paper (PDF, 611K)

SRGAN for Super-Resolving Low-Resolution Food Images

Yudai Nagano and Yohei Kikuta

Proceedings of the 10th Workshop on Multimedia for Cooking and Eating Activities (CEA 2018)

Learning Food Appearance by a Supervision with Recipe Text

Atsushi Hashimoto, Takumi Fujino, Jun Harashima, Masaaki Iiyama, and Michihiko Minoh

Proceedings of the 9th Workshop on Multimedia for Cooking and Eating Activities (CEA 2017)

Cookpad Image Dataset: An Image Collection as Infrastructure for Food Research

Jun Harashima, Yuichiro Someya, and Yohei Kikuta

Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2017)

Paper (PDF, 888K)

Approaches to Food/Non-food Image Classification Using Deep Learning in Cookpad

Yohei Kikuta, Yuichiro Someya, and Leszek Rybicki

Proceedings of the 9th Workshop on Multimedia for Cooking and Eating Activities (CEA 2017)

Paper (PDF)