Management of diabetic patient profiles using ontologies

In 2019 the International Diabetes Federation estimated that 12.8 million Mexicans had diabetes. Diabetes ranked second among the causes of death in Mexico, a situation that was severely complicated during the second quarter of 2020 by the COVID-19 pandemic. Studies carried out by the Ministry of Public Health showed that the comorbidity of diabetes with COVID-19 has become a risk factor for serious complications, increasing the mortality rate. For this reason, it is necessary to develop personalized information management systems to support medical decision-making considering the specific characteristics of patients in Mexico. Information management of the diabetic patient profile begins with the investigation and registration of the relevant clinical data, which the physician uses to make the diagnosis and determine a personalized treatment. This article reports the development and integration of an ontology model for the management of diabetic patient profiles, incorporating medical ontologies. The results of the evaluation show the feasibility of using this integrated ontology for the management of diabetic patients presenting comorbidities. Likewise, a consistent ontological model is achieved, which complies with the extensibility and reusability quality characteristics.

JEL code: I18, D83, C88, C63, C55


Introduction
The first case of COVID-19 was reported in Wuhan, China, on December 19th, 2019. COVID-19 is a disease caused by the SARS-CoV-2 virus. During the first weeks of the pandemic, diabetes mellitus rapidly emerged as the main comorbidity of patients who developed serious complications or died. According to the World Health Organization (WHO), diabetes is a chronic disease that appears when the pancreas does not produce enough insulin or when the body does not effectively use the insulin it produces. The Standards of Medical Care in Diabetes (2019) state that the effect of uncontrolled diabetes is hyperglycemia. Type 2 diabetes has its origin in the body's inability to effectively use insulin, which is often the result of excess weight or physical inactivity.
Regarding diabetes in Mexico, in 2019 the International Diabetes Federation 1 estimated that 12.8 million Mexicans had diabetes (IDF, 2019). Regarding causes of death, data reported by INEGI in 2018 showed that diabetes mellitus was the second leading cause of mortality in Mexico, with 106,525 cases. Similarly, the National Institute of Public Health (INSP) notes that more than 74,000 people die every year in Mexico due to diabetes and its complications. Another important risk factor is obesity. INSP indicates that overweight and obesity are precursors of diabetes; in Mexico, 68% of the population over 20 years of age is overweight or obese. According to (Denova-Gutiérrez, 2020), 17.4% of Mexicans diagnosed with COVID-19 had obesity, 14.5% had diabetes, 18.9% had hypertension and 2.8% had cardiovascular disease.
This research project focuses on Type 2 Diabetes Mellitus (T2DM), which has several therapeutic options. The purpose of this project is to develop an ontological model for the management of profiles of patients with T2DM. This ontological model will serve as a broad knowledge base of the clinical information related to the diagnosis and medical treatment of diabetes mellitus and other diseases. The most relevant information considered for the diagnosis and personalized medical treatment is the patient data, physical exercise, body mass index, basal metabolic rate, symptoms, laboratory tests, medication and diseases. The risks faced in the treatment of diabetes that derive from the selection of medications are: hyperglycemia, hypoglycemia, kidney disease, retinopathy, gastroparesis, and sexual dysfunction.
With regard to treatment, not all patients with T2DM will respond equally to the same treatment; one reason could be that the doctor does not consider all the specific characteristics of the patient. The treatment that works best for most patients is selected first, even if this treatment is not effective for "some" patients. According to (Lasierra, 2013) there are no identical patients; each requires their own treatment according to their chronic conditions. In fact, it is very common that patients who have a chronic disease suffer from another chronic condition (comorbidity). Personalized medical treatment is a relevant and emerging approach that involves the design of strategies to prevent, detect, treat and monitor patients individually according to their complete medical profile (Inzucchi, 2012) (Hempo, 2015) (El-Sappagh, 2018).
1 https://idf.org/
The use of ontologies for modelling profiles of patients with T2DM will allow the detailed characterization of all the factors involved in diagnosis and management, achieving a high degree of semantic expressiveness by using description logics and incorporating inference rules based on medical knowledge. A review of existing ontologies and related sources of information was conducted to define the most relevant patient profile characteristics, attributes and semantic relations that need to be included in the ontological model. Additionally, data required for diagnosis and treatment will be included: patient height, weight, age, BMI, Basal Metabolic Rate, daily caloric intake, physical exercise, symptoms, laboratory tests, medications, and diseases. To evaluate the usefulness and applicability of the ontology model, study cases are presented and the quality design criteria are explained.
The main contribution of this research is the diabetic patient profile ontology, which incorporates knowledge bases from valid medical resources and quality design criteria. The semantic approach derived from ontologies can be effectively exploited in personalized, patient-specific management services, since logic-based languages give the information an inference capacity. The rest of the paper is organized as follows: Section 2 presents a review of ontologies related to diabetes mellitus, Section 3 describes the ontology development methodology, Section 4 presents the global ontology integration, Section 5 describes the evaluation, and finally Section 6 presents the conclusions.

Review of Related Ontologies
The use of ontologies to support the diagnosis and treatment of diabetes mellitus is not new; many researchers have investigated the benefits of logic-based reasoning approaches to automate diagnosis and support Clinical Decision Support Systems (CDSS). In this section we review ontologies that address the representation of health care support for diabetes mellitus.
In (Paganelli, 2007) authors described an ontology-based context model, and a context management middleware to monitor and assist patients at home. The model consists of four ontologies: Patient Personal Domain Ontology, Home Domain Ontology, Alarm Domain Ontology, and Social Context Ontology. Additionally, authors presented a Web application for manual input of biomedical parameters: heart rate frequency, pulse oximetry, systolic and diastolic blood pressure, body temperature, and glycemia.
In (Buranarach, 2011) authors presented an ontology for Type II diabetes and a clinical support system. The clinical support system consists of an information system and a reminder system. The ontology concepts included are: patient card, person data, family history, signs, symptoms, and health status (diabetes and diabetes complications).
In (Chen, 2012) authors presented a recommendation system for diabetes medication, built using SWRL rules and the Jess rule engine. The recommendations concern specific medication treatments depending on various input data: HbA1c levels, safety, and tolerance, among others. They also included information about patient tests. However, this recommendation system lacks other information that is important for drug recommendation, for instance, disease history, family disease history, physical activity, and diet.
In (Lasierra, 2013) authors described an ontology-based solution to provide personalized care to chronic patients at home. They implemented a method in three stages: ontology design and implementation, ontology application study, and software prototype implementation. Of particular interest is the incorporation of physicians' knowledge and clinical guidelines to represent patient profiles, which is similar to the approach reported in this paper. The ontology incorporated biological measurements: weight, height, glucose level, peripheral capillary oxygen saturation (SpO2), amount of air forced from the lungs per second (FEV1), body water, body fat, blood pressure, and pulse rate.
In (Rahimi, 2014) authors presented a validation of the Diabetes Mellitus Ontology (DMO) using real-world Electronic Health Record (EHR) data. They evaluated the accuracy of the ontology in executing inferences regarding the detection of DMT2. The sources of information used for the experimentation were a literature review, the Australian National Guidelines for Type 2 Diabetes Mellitus, and 908 real-world Electronic Health Records (EHR).
In (Hempo, 2015) authors presented a personalized care recommendation approach for diabetic patients, using an ontology model that consists of three ontologies: one that describes the particularities of patient profiles, a second ontology for the description of complications of diabetes, and a third ontology that describes self-care practices focusing on food and exercise recommendations.
In (El-Sappagh, 2016) authors described the Diabetes Mellitus Diagnosis Ontology (DDO), which was constructed following the principles of the Open Biomedical Ontologies (OBO) Foundry; the authors also specialized the Basic Formal Ontology (BFO) upper-level ontology and reused the Ontology for General Medical Science (OGMS). The main classes included in DDO are: diabetic complication (disease), drug, laboratory test, physical examination, and diabetes symptom. This ontology model does not consider treatment plans, medicaments, or family history.
Sherimon and Krishnan (2016) describe OntoDiabetic, an ontology-based decision support system to assess the risk factors and generate treatment suggestions for diabetic patients. The authors define a set of inference rules to derive information about the health status of the patient. This ontology model lacks up-to-date information on drugs, diseases, and important relationships between them.
In (El-Sappagh, 2018) authors described the development of the Diabetes Mellitus Treatment Ontology (DMTO) following the design principles of the OBO Foundry, reusing the OGMS ontology, and incorporating the previously defined DDO concepts. The novel inclusions in DMTO are: interactions between drugs, interactions between drugs and complications, interactions between drugs and food, and interactions between drugs and exercise. The main goal of DMTO is to support the automation of the diabetes treatment process and provide an intelligent and distributed clinical decision support system to be integrated in an Electronic Health Record (EHR).
In (Chen, 2019) authors describe an Ontology-based Model for Diabetic Patients (OMDP). In OMDP the authors specialized the concepts from the BFO and reused the OGMS ontology concepts. OMDP consists of the following main definitions: Gene, Laboratory test, Diagnosis, Treatment, Diet, Exercise, Drug, Complications, Physical examination, Symptoms, and Patient information.
The review of related ontologies is done with the main objective of reusing one or more models, either complete or partial. Therefore, the breadth of concepts they cover regarding the diagnosis and treatment of diabetes mellitus must be observed first. From this perspective, we consider that the ontologies DMTO and OMDP are the most complete because they include treatment plans. However, the reuse of a complete model must be carefully decided. DMTO and OMDP are complex models that extend and specialize from BFO and OGMS ontologies. Therefore, these ontologies import several models and handle many naming conventions that require the ontology designer to become familiar with these rules and regulatory principles of the Open Biomedical Ontology (OBO) Foundry.
The main difficulty found in reusing the DMTO and OMDP ontologies is that there are no reports of current applications that use them. This is an indicator of the following design problems: lack of intelligibility, very large and complex models, and performance problems related to memory resources for reasoning and inference execution when dealing with hundreds of patient profiles. The literature reporting these ontologies mentions nothing about performance measures regarding large volumes of patient data. Therefore, in this work we have decided to carry out a hybrid ontology design, which consists of two phases: designing and developing the patient profile model from scratch; and selecting, modularizing and integrating existing medical ontologies that are usable, clear and valid. The integration procedure for existing resources consists of extracting modules in the form of vocabularies and integrating them as lighter modules, thus avoiding performance issues.

Ontology Design Method
Taking as a reference the methodology reported in (Bravo, 2019) we designed the ontology model that is reported in this section. The process of designing an ontology begins by establishing the domain and competence of the ontology.

Ontology Requirements Specification
This ontology model aims at representing the knowledge base for DMT2 diagnosis and treatment. DMT2 diagnosis and treatment depend mainly on the particular characteristics of the patient; therefore, this ontology will consider the acquisition of the patient data. In order to specify the initial requirements of the ontology model, it is necessary to define a list of competency questions that the ontology model should be capable of answering.
Regarding the patient profile, the following competency questions should be answered:

a) What is the basic data that characterizes the diabetic patient profile?

There are other important questions that an ontology model of DMT2 should answer; however, in this article we concentrate on the representation of the diabetic patient profile, since this represents a crucial entry point for the entire process of diagnosis and subsequent treatment.

Ontology Design
The process of designing an ontology consists of making decisions about the concepts to be included in the model and the relationships between them. The relationships to be implemented are hierarchical relations (is-a) and semantic relations (object and data properties). Based on the competency questions specified above, and the review of related ontologies, the following terms were identified as the core concepts involved in the diagnosis and treatment of DMT2: Patient Profile, Demographic Data, Physical Activity, BMI Classification, Disease, Drug (Medication), Lab Test, and Clinical Diagnosis, among others. Figure 1 shows the main concepts included in the ontology model.
Only the data necessary for establishing the state of health are obtained, with the express authorization of the participants; the identity of the patients is never known. However, it is necessary to have a way to identify the patient and his or her medical and biomedical data; therefore a username and password are requested as identifiers to access the system.
During the process of deciding which characteristics were most important to represent a patient, we observed the need to include the data that allow the metabolism values to be calculated. In particular, it is relevant for this project to calculate the basal metabolic rate and the body mass index. According to the Centers for Disease Control and Prevention (CDC) 2, the Body Mass Index (BMI) is the weight of a person in kilograms divided by the square of the height in meters. Likewise, the CDC indicates that a high BMI can be an indicator of high body fatness. Therefore, BMI calculation can be used to screen for categories of body weight that may lead to health problems. In (Pelley, 2012) the Basal Metabolic Rate (BMR) is defined as the rate of energy expenditure of a person at rest; it eliminates the variable effect of physical activity.
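The two calculated values can be sketched in a few lines of Python. The BMI function follows the definition quoted above. The article does not state which BMR equation is used, so the Mifflin-St Jeor equation below is an assumption chosen only for illustration, and the function and parameter names are hypothetical.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """BMI as defined in the text: weight in kg divided by height in m squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def basal_metabolic_rate(weight_kg: float, height_cm: float,
                         age_years: float, sex: str) -> float:
    """Estimated BMR in kcal/day.

    ASSUMPTION: uses the Mifflin-St Jeor equation; the article does not
    say which BMR formula its system applies.
    """
    base = 10.0 * weight_kg + 6.25 * height_cm - 5.0 * age_years
    if sex == "male":
        return base + 5.0
    if sex == "female":
        return base - 161.0
    raise ValueError("sex must be 'male' or 'female' for this equation")


# Example: a hypothetical 40-year-old male patient, 70 kg and 1.75 m tall
bmi = body_mass_index(70.0, 1.75)
bmr = basal_metabolic_rate(70.0, 175.0, 40, "male")
```

In the ontology these results would correspond to the bodyMassIndex and basalMetabolism values stored for each Patient individual.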
The BMIClassification concept was defined based on the World Health Organization (WHO) 3 definition, with the objective of identifying the body weight class that corresponds to the patient's BMI. BMIClassification is defined with three data type attributes: hasMinValue, hasMaxValue and hasNameBMI. Table 2 shows the particular ranges of each individual. Figure 3 shows the semantic relationship that associates each Patient individual with its weight classification. During Patient registration, the basalMetabolism and bodyMassIndex values are calculated, and a semantic relation is defined using the hasBMIClassification object property to associate every patient with his or her weight class.
According to a position statement of the American Diabetes Association (Colberg et al., 2016), the adoption and maintenance of physical activity is relevant for the prevention and treatment of diabetes mellitus. Therefore, it is important to obtain this information from the patient during registration. For the definition of the concept PhysicalActivity the following data were considered: the frequency and type of exercise performed. For the type of exercise we defined a specific list of physical activities that the patient can perform routinely, together with the approximate number of calories per hour associated with each, so that the number of calories burned can be estimated. Figure 4 shows the semantic relationship used to define the type of physical activity that the patient performs on a regular basis.
Each physical activity also defines a numerical value that represents the number of calories burned per hour.
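As a rough illustration of how the BMIClassification ranges and the calories-per-hour values could be used, the following Python sketch looks up the weight class for a BMI value and estimates weekly calories burned. The field names mirror the hasMinValue, hasMaxValue and hasNameBMI attributes; the ranges are the commonly cited WHO adult cut-offs and are assumed here because Table 2 is not reproduced in this text. The activity names and calorie values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BMIClassification:
    has_name_bmi: str
    has_min_value: float  # inclusive lower bound
    has_max_value: float  # exclusive upper bound


# ASSUMPTION: commonly cited WHO adult cut-offs, not copied from Table 2.
WHO_CLASSES = [
    BMIClassification("Underweight", 0.0, 18.5),
    BMIClassification("Normal weight", 18.5, 25.0),
    BMIClassification("Overweight", 25.0, 30.0),
    BMIClassification("Obesity", 30.0, float("inf")),
]

# Hypothetical calories-per-hour values for two illustrative activities.
CALORIES_PER_HOUR = {"walking": 250.0, "cycling": 500.0}


def classify_bmi(bmi: float) -> BMIClassification:
    """Return the class whose [min, max) range contains the BMI value."""
    for bmi_class in WHO_CLASSES:
        if bmi_class.has_min_value <= bmi < bmi_class.has_max_value:
            return bmi_class
    raise ValueError(f"BMI out of range: {bmi}")


def weekly_calories_burned(activity: str, hours_per_session: float,
                           sessions_per_week: int) -> float:
    """Approximate calories burned per week for one routine activity."""
    return CALORIES_PER_HOUR[activity] * hours_per_session * sessions_per_week
```

The class returned by `classify_bmi` plays the role of the target of the hasBMIClassification relation for a given patient.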

Figure 4. Semantic relationships between Patient and PhysicalActivity. Source: Author's own

The purpose of this research project is to develop a comprehensive infrastructure to facilitate the integration and development of applications for remote monitoring of diabetic patients. In this sense, it is important to study other factors that affect DMT2. In (Ansari, 2019) authors presented a conceptual model of the aspects of diabetes self-management, which includes: socio-demographic characteristics, such as age, gender, and education; behavioral and psychological characteristics: monitoring of blood sugar, adherence to diet, and physical activity; barriers to self-management: lack of knowledge, self-confidence, and financial and family support; and cultural characteristics: cultural beliefs and dietary preferences. This article describes a prototype that does not exhaustively cover all the aforementioned aspects; however, the model can be easily extended. For now, the information regarding the state, province or municipality where the patient was born and lives is included. Figure 5 shows the semantic relationship between the Patient and the Municipality. The personBornInMunicipality relationship describes the specific municipality where the patient was born, and personLivesIn indicates the current address of the patient. This information will be used to correlate the patient profile with his or her social context.
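The two relationships between Patient and Municipality can be pictured as subject-property-object triples. The sketch below stores such triples in a plain Python list and answers a simple question ("which patients live in a given municipality?") by pattern matching. It is only an illustration of the idea: the individual names are hypothetical, and a real deployment would query the OWL model rather than a Python list.

```python
# Hypothetical individuals; the property names come from the text.
TRIPLES = [
    ("patient_001", "personBornInMunicipality", "municipality_A"),
    ("patient_001", "personLivesIn", "municipality_B"),
    ("patient_002", "personBornInMunicipality", "municipality_B"),
    ("patient_002", "personLivesIn", "municipality_B"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def residents(triples, municipality):
    """Patients whose personLivesIn value is the given municipality."""
    return sorted(t[0] for t in match(triples, p="personLivesIn", o=municipality))
```

For example, `residents(TRIPLES, "municipality_B")` lists both hypothetical patients, which is the kind of grouping needed to correlate profiles with social context.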

Integrate Medical Ontologies
Based on the list of required concepts related to the diagnosis and treatment of diabetes, a search was made for existing ontological resources to be reused in this model. Specifically, the following concepts were searched: disease, drug or medication, symptom, and laboratory tests. A good repository of biomedical ontologies is BioPortal 4, which provides search mechanisms to find and explore a large collection of biomedical ontologies. We searched for the aforementioned concepts; Table 3 shows the list of ontologies found for each concept.

Source: Author´s own
The search returned a large list of ontologies that matched the concepts; therefore, the selection of the ontologies was based on the intelligibility and usability criteria. That is, the ontologies that were clear, legible and usable were selected. In this section we describe the specific ontologies selected and integrated into the Patient Profile ontology model.

a) The National Drug File - Reference Terminology (NDF-RT) is a controlled medical terminology developed by the Department of Veterans Affairs, Veterans Health Administration (VHA). The NDF-RT provides a formal model to describe medications, and includes the FDA Established Pharmacologic Class (EPC), a set of Mechanisms of Action (MoA), Physiological Effects, Therapeutic Categories, and cross references to other important vocabularies such as RxNorm 5, provided by the National Library of Medicine (see Figure 6). The NDF-RT ontology was selected because it provides a revised and valid vocabulary of medicaments with relevant information such as the physiological effect, the chemical ingredients, the mechanism of action, and the diseases that each medicament may prevent, may diagnose or may treat. This latter information is crucial to correlate pharmacological treatments with diseases.

Figure 6. The National Drug File - Reference Terminology (NDF-RT) ontology reference. Source: Author's own

b) The Human Disease Ontology (DOID) 6 is a standardized reference that provides human disease terms, phenotype characteristics and medical vocabulary. DOID integrates mappings with important medical vocabularies: MeSH, ICD, NCI's thesaurus, SNOMED and OMIM. The DOID ontology was selected because it represents disease terms correlated with several important vocabularies.
d) The Logical Observation Identifiers Names and Codes (LOINC) 8 system provides standardized vocabularies for laboratory terms. LOINC is currently maintained by the Regenstrief Institute, which in 1994 initiated the LOINC project to address the lack of standardization among multiple laboratories that used different codes for different test observations. LOINC provides a standard way of identifying observations using approximately 41,000 observation terms; 31,000 of these terms are used for laboratory testing.
To determine the ontologies that could be reused and imported into the model, we considered that the importance of each ontology or vocabulary lies mainly in the validity of the knowledge represented. Therefore, we have selected the vocabularies from institutions and groups of researchers that meet the requirements of the OBO Foundry to publish their ontologies in the BioPortal. Additionally, the intelligibility and ease of use of each vocabulary were considered for reutilization and integration.

Global Ontology Integration
Once the design and implementation of the core concepts had been completed, and the ontologies had been modularized for reuse, the integration process consisted of the following steps:
a) integrate the core concepts of the Patient Profile in a single ontology file named PatientElectronicRecord.owl;
b) select and import modularized ontologies and concepts to fulfill the requirements for the Patient Profile modeling;
c) define the global semantic relationships (object properties) between the core concepts and the imported concepts that are necessary to meet the requirements of the ontology; and
d) evaluate the resulting integrated ontology by representing case studies and checking its logical consistency.

Definition of Global Semantic Relations
The list of global semantic relations is shown in Table 4, specifying the domain and range for each.

Table 4 (only two rows are preserved in this text)
Relation: (name not preserved). Description: This relationship is used to describe how a medicament produces an effect in the body. For example, a drug's mechanism of action could be how it affects a specific target in a cell, such as an enzyme, or a cell function, such as cell growth.
Relation: hasPhysiologicalEffect. Domain: Medicament. Range: PhysiologicalEffect. Description: It is used to establish the biochemical and physiological effects of drugs and their mechanisms of action, and the relationship between the concentration of the drug and its effect on an organism.
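To make the role of domain and range concrete, the following Python sketch records a few object properties with their domain and range and checks whether an assertion respects them. The property and class names come from the text (the Patient domain and Municipality range for personLivesIn are inferred from the description of Figure 5); the individuals are hypothetical. Note that this sketch treats domain and range as validation constraints, which is a simplification: under OWL semantics a reasoner would instead infer the types of the subject and object from these axioms.

```python
# property name -> (domain class, range class)
OBJECT_PROPERTIES = {
    "hasPhysiologicalEffect": ("Medicament", "PhysiologicalEffect"),
    "hasBMIClassification": ("Patient", "BMIClassification"),
    "personLivesIn": ("Patient", "Municipality"),
}

# Hypothetical individuals and their asserted classes.
INDIVIDUAL_TYPES = {
    "patient_001": "Patient",
    "metformin": "Medicament",
    "decreased_glucose_production": "PhysiologicalEffect",
    "overweight": "BMIClassification",
}


def check_assertion(subject: str, prop: str, obj: str) -> list:
    """Return a list of domain/range problems; an empty list means none found."""
    if prop not in OBJECT_PROPERTIES:
        return [f"unknown property: {prop}"]
    domain, range_ = OBJECT_PROPERTIES[prop]
    problems = []
    if INDIVIDUAL_TYPES.get(subject) != domain:
        problems.append(f"{subject} is not a {domain}")
    if INDIVIDUAL_TYPES.get(obj) != range_:
        problems.append(f"{obj} is not a {range_}")
    return problems
```

An assertion such as `("metformin", "hasPhysiologicalEffect", "decreased_glucose_production")` passes the check, while using a Patient as the subject of hasPhysiologicalEffect is reported as a problem.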

Clinical Diagnosis
According to the National Cancer Institute (NCI), clinical diagnosis is defined as the process of identifying a disease, condition, or injury based on the signs and symptoms of a patient, and the patient's health history and physical exam. In order to represent this concept, we included a class Clinical Diagnosis and its semantic relationships with the patient's profile (shown in Figure 7).

Evaluation
In this section we present the general metrics of the resulting ontology model, three study cases to evaluate the management of patient profiles, and the coherence of the entire ontology model through consistency checking.

Ontology Metrics
The resulting integrated ontology consists of 281097 axioms, with a total of 92 classes (or concepts), 20 object properties, 45 data properties, and 65501 individuals. Figure 8 shows the general metrics of the integrated ontology model, and the set of external imported ontology modules.

Study Cases
Three study cases were implemented using the ontology model (Figure 9 shows the details).

Ontology Consistency Checking
Ontology development and evaluation require the verification of Consistency of the model. Reasoning with inconsistent ontologies may lead to erroneous conclusions. The verification of consistency of an ontology is the task carried out by the reasoner program that contrasts what is formally defined in the T-Box with respect to what is asserted in the A-Box. If inconsistencies are found, the reasoner will generate explanations of the inconsistencies. In the case of the ontological model presented in this article, the verification of logical consistency is of particular interest, since modules from external ontologies and vocabularies are imported, which can unintentionally cause the occurrence of logical inconsistencies.
Ontology consistency is an important evaluation criterion, because a consistent integrated ontology makes it possible to execute inference rules to find new concepts and to execute queries over the concepts. Figure 10 shows the result of executing the reasoner over the integrated model. The reasoner used, Pellet, is built around a tableaux reasoner that has only one functionality: checking the consistency of an ontology.
According to (Sirin, 2007) an ontology is consistent if there is an interpretation that satisfies all the facts and axioms in the ontology. The tableaux algorithm constructs a graph-based representation of the A-Box, where each node is associated with its corresponding types and all property-value assertions are represented as directed edges between nodes. The reasoner repeatedly applies expansion rules until a contradiction is detected or no more rules are applicable. All other reasoning tasks are defined in terms of consistency checking. Therefore, inference about the class hierarchy, the object property hierarchy, and the data property hierarchy derives all non-explicit superclass-subclass relationships that exist in these hierarchies. Regarding the inference about assertions of classes, object properties, and individuals, the reasoner determines whether there are inconsistencies between what is established as class axioms and what is instantiated in the A-Box. From a system management perspective, the results of the reasoner shown in Figure 10 mean that user applications or software agents will be able to ask questions and obtain logically correct answers, as well as produce inferences about the facts established in the ontology, facilitating the decision-making process.

Figure 10. Ontology Consistency Checking. Source: Author's own

Gruber (1983) introduced the Coherence design principle as follows: "an ontology should sanction inferences that are consistent with the definitions". Based on the results obtained by the reasoner after the integration of the ontological models and some case scenarios, it is possible to state that the resulting ontology is Coherent.
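The idea of contrasting the T-Box with the A-Box can be illustrated with a deliberately small Python sketch. It is not Pellet's tableaux algorithm: it only expands each individual's asserted classes with their superclasses and reports a clash when an individual ends up in two classes declared disjoint. The Patient and Medicament class names come from the text; DiabeticPatient, Person, the disjointness axiom and the individuals are assumptions made for the example.

```python
# T-Box (assumed for illustration): class -> direct superclasses
SUBCLASS_OF = {
    "DiabeticPatient": {"Patient"},
    "Patient": {"Person"},
}
# T-Box: pairs of classes declared disjoint (assumed axiom)
DISJOINT = [frozenset({"Patient", "Medicament"})]

# A-Box: individual -> asserted classes (hypothetical individuals)
ABOX = {
    "patient_001": {"DiabeticPatient"},
    "metformin": {"Medicament"},
}


def superclasses(cls):
    """Return cls together with all of its direct and indirect superclasses."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(SUBCLASS_OF.get(current, ()))
    return seen


def find_clashes(abox):
    """List (individual, disjoint pair) clashes; an empty list means no clash found."""
    clashes = []
    for individual in sorted(abox):
        inferred = set()
        for cls in abox[individual]:
            inferred |= superclasses(cls)
        for pair in DISJOINT:
            if pair <= inferred:
                clashes.append((individual, tuple(sorted(pair))))
    return clashes
```

With the A-Box above no clash is found. Asserting that the hypothetical individual metformin is also a DiabeticPatient produces a clash, which is the kind of unintended inconsistency that importing external modules can introduce.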

Conclusions
In this work we have presented an ontological model that integrates two fundamental concepts: the patient's profile and the clinical diagnosis. The model was designed and built considering relevant information requirements for medical decision making. With this ontological model, it was possible to integrate a broad knowledge base that includes medical information from existing valid resources such as DOID, NDF-RT, SYMP, and LOINC ontology references.
The ontology model has been designed following important quality criteria, such as extensibility and reusability.
Extensibility was achieved by designing several separate ontologies. For example, the disease ontology is integrated in a single separate OWL file; within this ontology the keys and references of the diseases obtained from the original DOID ontology are incorporated. The same criteria were followed for the ontologies of drugs, symptoms and laboratory tests. The result is a modular design, which allows each ontology to be updated separately without affecting the main design of the ontology that integrates the patient's profile and the clinical diagnosis.
The reusability of the patient profile is achieved as a consequence of the modular and lightweight design. The patient profile can be reused in different scenarios and applications. Likewise, patient data can be used to perform data mining and find patterns that allow risk groups to be recognized by category, such as gender, age, or social context.
The study cases showed the feasibility of using the model as a broad knowledge base to support the management of patient profiles. The resulting ontology will be augmented by including more aspects, such as the management of diabetic patients based on physical activity and treatment plans based on an analysis of the mental and psychological state of the patient. Likewise, the patients' data will be extended with their family history. The long-range purpose is to develop remote patient follow-up and recommendation applications.