Abstract: A daily dietary assessment method, the 24-hour dietary recall, has commonly been used in nutritional epidemiology studies to capture detailed information about the food eaten by participants and to help understand their dietary behaviour. However, in this self-reporting technique, the food types and portion sizes reported depend heavily on the users' subjective judgement, which may lead to biased and inaccurate dietary analysis results. As a result, a variety of visual-based dietary assessment approaches have been …
“…A general criterion is to choose the state-of-the-art methods that report performance on the same dataset for a fair comparison. Unfortunately, there has not been a well-recognized dataset for food volume estimation according to a recently published comprehensive review [38]. Instead, most methods reported performance on their own collected data, which is usually not publicly available.…”
It is well known that many chronic diseases are associated with an unhealthy diet. Although improving diet is critical, adopting a healthy diet is difficult despite its benefits being well understood. Technology is needed to assess dietary intake accurately and easily in real-world settings so that effective interventions to manage overweight, obesity, and related chronic diseases can be developed. In recent years, new wearable imaging and computational technologies have emerged. These technologies are capable of performing objective and passive dietary assessments with a much simpler procedure than traditional questionnaires. However, a critical task is to estimate the portion size (in this case, the food volume) from a digital image. Currently, this task is very challenging because the volumetric information in two-dimensional images is incomplete, and the estimation involves a great deal of imagination, beyond the capacity of traditional image processing algorithms. In this work, we present a novel Artificial Intelligence (AI) system that mimics the thinking of dietitians, who use a set of common objects as gauges (e.g., a teaspoon, a golf ball, a cup, and so on) to estimate the portion size. Specifically, our human-mimetic system “mentally” gauges the volume of food using a set of internal reference volumes that have been learned previously. At the output, our system produces a vector of probabilities of the food with respect to the internal reference volumes. The estimation is then completed by an “intelligent guess”, implemented as an inner product between the probability vector and the reference volume vector. Our experiments using both virtual and real food datasets have shown accurate volume estimation results.
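The “intelligent guess” described above reduces to an expected value over the learned reference volumes. Below is a minimal sketch in Python; the reference volumes, variable names, and example probabilities are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

# Hedged sketch of the final estimation step: the network's probability
# vector over a set of internal reference volumes is combined with those
# volumes via an inner product, giving an expected (estimated) volume.
# The reference volumes below (in millilitres) are illustrative only.
reference_volumes_ml = np.array([5.0, 40.0, 240.0, 500.0])  # e.g. teaspoon, golf ball, cup, bottle

def estimate_volume(probabilities) -> float:
    """Estimated volume = sum_i p_i * v_i over the reference set."""
    p = np.asarray(probabilities, dtype=float)
    if p.shape != reference_volumes_ml.shape:
        raise ValueError("one probability per reference volume is required")
    if not np.isclose(p.sum(), 1.0):
        raise ValueError("probabilities must sum to 1")
    return float(p @ reference_volumes_ml)

# Example: a food judged mostly "cup-sized", with some "golf-ball" likelihood.
print(estimate_volume([0.05, 0.25, 0.65, 0.05]))  # ~191 ml
```

Framing the output as a probability vector rather than a single scalar lets the system express uncertainty across reference sizes, with the inner product acting as a soft, weighted average.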
“…Moreover, especially for volume estimation, automated image-analysis requires multiple photos, or even supplementary data input using a video, which might put an increased burden on the participant (149,150). Most importantly, however, even though increasingly sophisticated algorithms are employed, the identification of food in an image as well as its classification and the estimation of portion sizes remains a challenge (149,151). The correct identification of certain food groups such as starchy foods and beverages as well as dishes combining several food groups (e.g., lasagne) is especially difficult (152).…”
Section: Features To Collect Dietary Data
Smartphones have become popular for assessing eating behaviour in real life and in real time. This systematic review provides a comprehensive overview of smartphone-based dietary assessment tools, focusing on how dietary data are assessed and how their completeness is ensured. Seven databases from behavioural, social and computer science were searched in March 2020. All observational, experimental or intervention studies and study protocols using a smartphone-based assessment tool for dietary intake were included if they reported data collected by adults and were published in English. Out of 21,722 records initially screened, 117 publications using 129 tools were included. Five core assessment features were identified: photo-based assessment (48.8% of tools), assessed serving/portion sizes (48.8%), free-text descriptions of food intake (42.6%), food databases (30.2%), and classification systems (27.9%). On average, a tool used two features. The majority of studies did not implement any features to improve the completeness of the records. This review provides a comprehensive overview and classification scheme of smartphone-based dietary assessment tools to help researchers identify suitable assessment tools for their studies. Future research needs to address the potential impact of specific dietary assessment methods on data quality and participants' willingness to record, to ultimately improve the quality of smartphone-based dietary assessment for health research.
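The feature percentages and the "two features per tool" average reported above are simple tallies over the review's classification scheme. The sketch below shows one way such figures could be derived; the tool names and feature tags are hypothetical and serve only as an illustration of the arithmetic.

```python
from statistics import mean

# Hedged sketch: each tool is tagged with the subset of the five core
# assessment features it implements. The tools and tags are made up.
FEATURES = ["photo", "portion_size", "free_text", "food_database", "classification_system"]

tools = {
    "ToolA": {"photo", "portion_size"},
    "ToolB": {"free_text"},
    "ToolC": {"photo", "food_database", "classification_system"},
    "ToolD": {"portion_size", "free_text"},
}

# Share of tools implementing each feature, analogous to the percentages in the review.
for feature in FEATURES:
    share = sum(feature in tags for tags in tools.values()) / len(tools)
    print(f"{feature}: {share:.1%} of tools")

# Average number of features per tool ("on average, a tool used two features").
print("average features per tool:", mean(len(tags) for tags in tools.values()))
```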
“…In [54], clustering images sampled from egocentric videos into food and non-food classes has been attempted, but the types of food were not recognized. For more comprehensive reviews of image-based approaches, we refer readers to [55] and [56].…”
Section: A. Technological Approaches For Dietary Assessment
Dietary intake assessment in epidemiological studies is predominantly based on self-reports, which are subjective, inefficient, and prone to error. Technological approaches are therefore emerging to provide objective dietary assessments. Using only egocentric dietary intake videos, this work aims to provide accurate estimates of individual dietary intake by recognizing consumed food items and counting the number of bites taken. This differs from previous studies that rely on inertial sensing to count bites, and from previous studies that recognize only visible food items rather than consumed ones. As a subject may not consume all food items visible in a meal, recognizing the consumed food items is more valuable. A new dataset with 1,022 dietary intake video clips was constructed to validate our concept of bite counting and consumed-food-item recognition from egocentric videos. Twelve subjects participated and 52 meals were captured. A total of 66 unique food items, including food ingredients and drinks, were labelled in the dataset, along with a total of 2,039 labelled bites. Deep neural networks were used to perform bite counting and food item recognition in an end-to-end manner. Experiments have shown that counting bites directly from video clips can reach 74.15% top-1 accuracy (classifying between 0-4 bites in 20-second clips) and an MSE value of 0.312 (when using regression). Our experiments on video-based food recognition also show that recognizing consumed food items is indeed harder than recognizing visible ones, with a drop of 25% in F1 score.
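The two bite-counting evaluation settings above (classification over 0-4 bites per clip versus regression) correspond to the standard top-1 accuracy and mean squared error metrics. The following is a minimal sketch of how these metrics are computed; the toy predictions and labels are invented for illustration and do not come from the paper's dataset.

```python
import numpy as np

# Hedged sketch of the two evaluation settings, assuming per-clip bite
# labels in {0, ..., 4} for 20-second clips.

def top1_accuracy(pred_labels: np.ndarray, true_labels: np.ndarray) -> float:
    """Fraction of clips whose predicted bite count matches the label exactly."""
    return float(np.mean(pred_labels == true_labels))

def mse(pred_counts: np.ndarray, true_counts: np.ndarray) -> float:
    """Mean squared error when bite counting is treated as regression."""
    return float(np.mean((pred_counts - true_counts) ** 2))

# Toy example with 6 clips (not real data):
true_bites = np.array([0, 1, 2, 3, 1, 4])
pred_class = np.array([0, 1, 2, 2, 1, 4])               # classification output
pred_regr = np.array([0.2, 1.1, 1.8, 2.6, 0.9, 3.7])    # regression output

print(top1_accuracy(pred_class, true_bites))  # ~0.833
print(mse(pred_regr, true_bites))             # ~0.058
```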