The initial ontological curation identified a large number of relevant terms to consider. The terms were then either imported from existing ontologies, redefined from existing concepts, or annotated de novo. By merging 3334 terms imported from already existing ontologies and 100 newly defined terms, the ONS describes both intervention and observational studies in nutrition.
Central nutritional concepts
In the ONS, relevant nutritional concepts have been related to each other to offer a well-organized synopsis of the knowledge in health and nutrition sciences. The ONS harmonizes all pertinent concepts from different domains, defining appropriate relationships and improving and simplifying the process of conceptual organization of the many facets of real studies. Here, we present (Fig. 1) how diet, food, and food component concepts, which can be considered central for an ontology aimed at effectively assisting researchers in the standardized description of the nutritional study they are conducting, were included, defined, and connected in the ONS.
Diet is defined as the regular course of eating and drinking adopted by a person or animal (ONS_0000080). For the purpose of the nutritional community, we further detailed the diet concept into three sub-classes: (i) Usual diet is defined as the regular course of eating and drinking adopted by a population in a certain geographical area, or in a certain cultural setting, or following a certain common eating behavior. It is also intended as the diet a person would follow without further prescription or indications, e.g., a vegetarian diet (ONS_0000083). (ii) Prescribed diet is defined as a diet prescribed by a physician/nutritionist to meet specific nutritional needs of a person (ONS_0000082). (iii) Intervention diet is defined as the diet administered during an intervention study. It usually comprises the adoption of a certain nutritional intervention (ERO_0000347), intended as the prescription of consuming or not consuming certain foods, and follows a precise study design. Intervention studies usually compare at least two subgroups of a population, one control group receiving a null nutritional intervention and one or more test groups receiving the intervention (ONS_0000081).
Food component is defined as any substance that is distributed in foodstuffs. It includes materials derived from plants or animals, such as vitamins or minerals, as well as environmental contaminants (CHEBI_78295, ONS_0000073). Starting from this definition, we further detailed the food component concept into four sub-classes: (i) Nutrient (ONS_0000077): A nutrient is a food component used by the body for normal physiological functions that guarantee survival and growth. It must be supplied in adequate and defined amounts from foods consumed within a diet. Malnutrition occurs when the right amount of a nutrient is not provided. (ii) Food bioactive (ONS_0000076): A food bioactive is a food component other than those needed to meet basic human nutritional needs (nutrients). Food bioactives modulate one or more metabolic processes, possibly resulting in the promotion of better health. The daily required intake for food bioactives has not yet been established, and there is no demonstration that malnutrition occurs when the right amount is not provided. (iii) Contaminant: A contaminant is an unwanted food component that makes the food no longer suitable for use (ONS_0000075). (iv) Additive: An additive is a component added to food to improve or preserve it (ONS_0000074).
Multiple definitions can be found for the food concept. As an example, CHEBI (CHEBI:33290) defines food as “Any material that can be ingested by an organism” and MESH (MeSH D005502) defines it as “Any substances taken in by the body that provides nourishment.” For the purposes of the nutritional community, the concept was expanded: in the ONS, food is defined as a complex matrix that is consumed by a person through the process of eating or drinking (ONS_0000079). Foods are bearers of nutrients, bioactives, and, sometimes, other food components. Food consumption, through meal consumption, follows a certain dietary pattern, which defines the diet. Nutrients and bioactives contained in food can be exploited by the human organism thanks to the processes of digestion (ONS_0000101), absorption (ONS_0000102), and metabolization (ONS_0000103), or through the intervention of the gut microflora (OHMI_0000020). The concept of food can be split into the following: (i) Raw food: A raw food is an uncooked, unprocessed food that is consumed in its natural state (ONS_0000099); (ii) Processed food: A processed food is the result of the process of home or industrial food preparation (ONS_0000100).
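As an informal illustration of the sub-class structure described above, the following Python sketch stores the diet, food component, and food hierarchies as a child-to-parent mapping. The identifiers and labels are those quoted in the text; the dictionary layout and the helper names `subclasses` and `ancestors` are assumptions made for this example and are not part of the ONS itself.

```python
# Labels for the ONS identifiers quoted in the text.
LABELS = {
    "ONS_0000080": "Diet",
    "ONS_0000083": "Usual diet",
    "ONS_0000082": "Prescribed diet",
    "ONS_0000081": "Intervention diet",
    "ONS_0000073": "Food component",
    "ONS_0000077": "Nutrient",
    "ONS_0000076": "Food bioactive",
    "ONS_0000075": "Contaminant",
    "ONS_0000074": "Additive",
    "ONS_0000079": "Food",
    "ONS_0000099": "Raw food",
    "ONS_0000100": "Processed food",
}

# Child ID -> parent ID, transcribed from the sub-class lists above.
SUBCLASS_OF = {
    "ONS_0000083": "ONS_0000080",
    "ONS_0000082": "ONS_0000080",
    "ONS_0000081": "ONS_0000080",
    "ONS_0000077": "ONS_0000073",
    "ONS_0000076": "ONS_0000073",
    "ONS_0000075": "ONS_0000073",
    "ONS_0000074": "ONS_0000073",
    "ONS_0000099": "ONS_0000079",
    "ONS_0000100": "ONS_0000079",
}


def subclasses(term_id):
    """Return the direct sub-classes of a term, sorted by identifier."""
    return sorted(c for c, p in SUBCLASS_OF.items() if p == term_id)


def ancestors(term_id):
    """Return the chain of super-classes of a term, nearest first."""
    chain = []
    while term_id in SUBCLASS_OF:
        term_id = SUBCLASS_OF[term_id]
        chain.append(term_id)
    return chain


for child in subclasses("ONS_0000073"):
    print(child, LABELS[child])
```

A flat mapping is sufficient here because each class named in this section has a single parent; an actual OWL ontology permits multiple super-classes and would need a richer structure.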
In nutritional science, biomarkers are increasingly being used to provide objective results and to avoid biases (e.g., reporting bias and recall bias). Three groups of biomarkers were identified for use in nutrition science [30], along with the dietary biomarker development framework: “exposure biomarker” for dietary intake and nutrient status, “effect biomarker” for measuring biological effects of food components, and “susceptibility biomarker” for assessing the effects of diet on human health. In the ONS, we present the first formal ontology application for the biomarker class (ONS_0000095) and its sub-classes, using the definition from the commentary [30]. An ONTOBEE query for “biomarker” returned multiple results, mainly from the Experimental Factor Ontology (EFO), all having the class “Measurement” (EFO_0001444) as super-class (a measurement is an information entity that is a recording of the output of a measurement such as produced by an instrument). However, a similar class named “Measurement datum” can also be found in the Information Artifact Ontology (IAO) (IAO_0000109, a measurement datum is an information content entity that is a recording of the output of a measurement such as produced by a device). In the ONS, the biomarker class was defined as a sub-class of the “Measurement datum” class (IAO_0000109), in line with the OBI ontology, which uses the IAO class.
Integrated analysis of data and joint pooled analysis are strongly promoted in nutrition by research funders, but they raise concerns among scientists, because the scientific interest in open access to nutritional data often conflicts with the General Data Protection Regulation. When fully achieved, integrated analysis will lead to new discoveries and maximize the use of public funds. In ENPADASI, this problem was addressed in both its legal and technical aspects, and a recommendation on the minimal information to be added as metadata to studies to boost integration capacity has been developed [19]. The identification of minimal requirements, essential to connect existing and future study (meta) databases, facilitates data exchange and data interpretation, helping to increase the robustness of results from future joint data analysis in nutritional epidemiology [31]. In fact, joint data analysis has already started helping to achieve new discoveries [32]. In the ONS, we have included the minimal required study information in the growing conceptual/ontological framework. Each minimal required study term was placed at the appropriate hierarchical level in the ontology. To easily identify terms pertaining to the minimal study information, an annotation property (“in_minimal_requirements_subset”) was created.
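The role of the annotation property can be illustrated with a short Python sketch that selects the terms carrying it. Only the property name “in_minimal_requirements_subset” is taken from the text; the record layout, the boolean value of the annotation, and the choice of which terms are flagged are hypothetical and serve only to show the filtering step.

```python
# Annotation property name as given in the text.
MINIMAL_FLAG = "in_minimal_requirements_subset"

# Hypothetical term records: which terms carry the flag is an illustrative
# assumption and does not reflect the actual content of the ONS.
terms = [
    {"id": "ONS_0000080", "label": "Diet", "annotations": {MINIMAL_FLAG: True}},
    {"id": "ONS_0000079", "label": "Food", "annotations": {}},
    {"id": "ONS_0000081", "label": "Intervention diet",
     "annotations": {MINIMAL_FLAG: True}},
]


def minimal_requirement_terms(records):
    """Return the labels of all records annotated with the minimal-subset flag."""
    return [r["label"] for r in records if r["annotations"].get(MINIMAL_FLAG)]


selected = minimal_requirement_terms(terms)
print(selected)
```

Because the flag is an annotation rather than a class, the flagged terms can remain at their own hierarchical level and still be retrieved as a single subset.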
Application scenarios
The ONS is designed to enable the description of both intervention and observational studies in human nutrition. Here, we present two application scenarios based on published nutritional studies, one for the observational study design and one for the interventional study design. Figures 2 and 3 illustrate how the ONS was built to support the standardized annotation of most descriptors of a nutritional study, starting from the initial phases of a study (e.g., formalizing the definition of a population stratum) and finally connecting to the specific results and how they were obtained. Figures and descriptions are to be understood at the single instance level (i.e., specific to the study being described). For this reason, we introduced the use of individuals (and their connections) for highly study-specific elements, alongside concepts in classes. In the text below, the italic notation indicates properties, while the notation PREFIX:CLASS is used to indicate classes in the ontology; for example, the notation “ONS:Diet” indicates the class with label “Diet” in the ONS ontology. For the abbreviations of the ontologies, we refer the reader to the list of imported ontologies in the “Methods” section.
Observational studies
The first application scenario is represented by the CHANCE study [33]. Figure 2 illustrates how the ONS can be used to formalize information on how the study was conducted. This observational study aims at developing novel and affordable nutritious foods to optimize the diet and reduce the risk of diet-related diseases among groups at risk of poverty (ROP). The CHANCE study uses two different approaches to draw its final conclusion. The first is a literature search process (EDAM:Literature search), performed with a specific textual literature database query (i.e., an instance of the class ONS:Literature database query). The output of the literature search process is a number of scientific publications (IAO:Scientific publication), which are analyzed and reviewed to extract data (OBCS:data collection from literature), a process that ultimately results in an organized data matrix (OBCS:Data matrix). CHANCE also included an observational study approach. In this case, a population was first divided into sub-populations based on their economic income. This stratification (STATO:Population stratification prior to sampling) was carried out following a specific stratification rule (STATO:Stratification rule), based on the ROP of the subjects as assessed with a questionnaire (ONS:Income assessment). The stratified population was then challenged with (i.e., is specified input of) two nutritional questionnaires (ONS:Food frequency and ONS:Food diary) aimed at assessing the foods consumed by the subjects and producing results that were finally organized in a data matrix. In both cases, the data matrices (OBCS:Data matrix) specific for this study contain information about the nutrients and foods consumed by the population and represent the specified data object on which conclusions are drawn (OBI:drawing a conclusion based on data).
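The instance-level description above can be sketched as a small set of subject-property-object triples. In the Python example below, the class names and the property “is specified input of” follow the text; the counterpart property “has specified output”, the plain-string instance names, and the helper `reachable` are assumptions introduced for illustration and do not reproduce the exact content of Fig. 2.

```python
from collections import deque

IS_INPUT_OF = "is specified input of"   # property named in the text
HAS_OUTPUT = "has specified output"     # assumed counterpart property
CONCLUSION = "OBI:drawing a conclusion based on data"

# Hypothetical study-level statements, one tuple per triple.
triples = [
    # Literature-based approach
    ("ONS:Literature database query", IS_INPUT_OF, "EDAM:Literature search"),
    ("EDAM:Literature search", HAS_OUTPUT, "IAO:Scientific publication"),
    ("IAO:Scientific publication", IS_INPUT_OF,
     "OBCS:data collection from literature"),
    ("OBCS:data collection from literature", HAS_OUTPUT,
     "literature data matrix"),
    ("literature data matrix", IS_INPUT_OF, CONCLUSION),
    # Observational approach
    ("population", IS_INPUT_OF,
     "STATO:Population stratification prior to sampling"),
    ("STATO:Population stratification prior to sampling", HAS_OUTPUT,
     "stratified population"),
    ("stratified population", IS_INPUT_OF, "ONS:Food frequency"),
    ("stratified population", IS_INPUT_OF, "ONS:Food diary"),
    ("ONS:Food frequency", HAS_OUTPUT, "survey data matrix"),
    ("ONS:Food diary", HAS_OUTPUT, "survey data matrix"),
    ("survey data matrix", IS_INPUT_OF, CONCLUSION),
]


def reachable(start, edges):
    """Breadth-first search over triples, following subject -> object."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for subject, _, obj in edges:
            if subject == node and obj not in seen:
                seen.add(obj)
                queue.append(obj)
    return seen


# Both approaches should lead to the same conclusion-drawing step.
print(CONCLUSION in reachable("population", triples))
print(CONCLUSION in reachable("ONS:Literature database query", triples))
```

Following the triples from either starting point reaches the conclusion-drawing process, mirroring the statement that both data matrices serve as the data on which conclusions are drawn.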
Intervention studies
The second application scenario is represented by the FLAVURS (impact of increasing doses of flavonoid-rich and flavonoid-poor fruit and vegetables on cardiovascular risk factors in an ‘at risk’ group) study [34]. Figure 3 illustrates how the ONS can be used to formalize the information on how the study was conducted. This interventional study aimed to investigate the effects of high and low flavonoid diets on vascular function and other cardiovascular disease risk factors. In this study, a population, selected on the basis of the stratification rule (STATO:Stratification rule) of having a relative risk of developing cardiovascular disease higher than 1.5, was randomly divided (OBI:Group randomization and OBI:Randomized group participant role) into three groups: control group (CT), high flavonoid group (HF), and low flavonoid group (LF). Each of the groups was challenged with a different diet (ONS:Diet): CT followed the usual diet (ONS:Usual Diet), which is defined to have exactly 0 interventions (ERO:Intervention); in the HF and the LF groups, individuals were challenged with two different types of intervention diet (ONS:Intervention diet) encompassing two different intervention (ERO:Intervention) protocols. In the HF diet, the intervention consisted of the prescription to consume fruit and vegetables with high flavonoid content, while in the LF diet the intervention consisted of the prescription to consume fruit and vegetables with low flavonoid content.
Urine and blood (OBI:Urine specimen and OBI:Blood specimen) were collected from individuals (OBI:Collecting specimen from organism) and analyzed (i.e., they inherited the evaluant role OBI:Evaluant role) by an HPLC assay (HPLC class) including untargeted metabolomics [35]. The output of the analysis was a data item in the form of a matrix (OBCS:Transformed data item) that is used to draw specific FLAVURS conclusions (OBI:Drawing a conclusion based on data and OBI:conclusion based on data).
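The assignment of diets to the three FLAVURS groups, and the restriction that a usual diet has exactly 0 interventions, can be illustrated with the following Python sketch. The group labels and class names come from the text; the dictionary layout, the function `check_group`, and the requirement that an intervention diet carries at least one intervention are assumptions made for this example (the text states only that an intervention diet usually comprises an intervention).

```python
USUAL = "ONS:Usual Diet"
INTERVENTION = "ONS:Intervention diet"

# Hypothetical instance-level description of the three study groups.
groups = {
    "CT": {"diet_class": USUAL, "interventions": []},
    "HF": {
        "diet_class": INTERVENTION,
        "interventions": [
            "consume fruit and vegetables with high flavonoid content"
        ],
    },
    "LF": {
        "diet_class": INTERVENTION,
        "interventions": [
            "consume fruit and vegetables with low flavonoid content"
        ],
    },
}


def check_group(group):
    """Check the number of interventions against the diet class."""
    n = len(group["interventions"])
    if group["diet_class"] == USUAL:
        return n == 0   # stated in the text: exactly 0 interventions
    if group["diet_class"] == INTERVENTION:
        return n >= 1   # assumption made for this sketch
    return False        # diet classes not covered by this sketch


all_valid = all(check_group(g) for g in groups.values())
print(all_valid)
```

In an OWL ontology such a restriction would typically be expressed as a cardinality axiom on the class and verified by a reasoner; the explicit check here only mimics that behavior for the three groups.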