Chronic Disease Prevention Program

  • Alona Kryshchenko California State University of Channel Islands
  • Cynthia Flores
  • Terrance Barroso
  • Antonio Hernandez
  • Nathalie Huerta
  • Angel Mora-Larscheid


As the U.S. population continues to grow, health professionals are increasingly concerned
about the impact of diabetes mellitus on current and future generations. Diabetes is a chronic
disease that occurs when one’s blood sugar level is too high for a prolonged period of time. (1)
When left untreated, its short-term and long-term effects are detrimental. Acute complications include
“diabetic ketoacidosis, hyperosmolar hyperglycemic state, or death.” (3) Moreover, the long-term
effects include “cardiovascular disease, stroke, chronic kidney disease, foot ulcers, and damage to
the eyes.”
Diabetes is a growing epidemic, prompting health professionals to research prevention methods as
well as ways to diagnose patients based on certain characteristics. As a result, the Chronic Disease
Prevention Program (CDPP) provides blood sugar testing in non-traditional settings (e.g., grocery
stores and libraries). Using the CDPP data set and the tools of machine learning,
we predict whether someone is diabetic or requires additional testing. Machine learning is a
way to develop algorithms that allow computers to learn from data. The attributes analyzed
in the data set are BMI group, age, gender, blood sugar, self diabetes, and whether the testing
was done during fasting or randomly. These attributes were analyzed using Linear Regression
to learn more about the relationship between the response variable (i.e., blood sugar) and an
explanatory variable. In addition to Linear Regression, we used Multiple Linear Regression as
well as K-Nearest Neighbors and Decision Tree.
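
To make two of the methods named above concrete, the following is a minimal sketch of simple linear regression and K-Nearest Neighbors in plain Python. It is not the authors' implementation. The records, the feature choice (age, blood sugar, fasting flag), the labels "normal" and "needs_testing", and the function names are all hypothetical and were invented for illustration; none of it comes from the CDPP data set.

```python
from collections import Counter
from math import dist

# Hypothetical toy records, NOT CDPP data.
# Each row: ((age, blood sugar in mg/dL, fasting flag), invented label)
TRAIN = [
    ((25, 85, 1), "normal"),
    ((34, 92, 1), "normal"),
    ((41, 110, 0), "normal"),
    ((52, 135, 1), "needs_testing"),
    ((60, 150, 1), "needs_testing"),
    ((47, 210, 0), "needs_testing"),
]


def fit_simple_linear_regression(xs, ys):
    """Ordinary least squares with one explanatory variable: y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def knn_predict(train, query, k=3):
    """Majority vote among the k training rows closest to `query`.

    Distances are Euclidean on raw features; a real analysis would
    normally rescale the features first so no single one dominates.
    """
    neighbors = sorted(train, key=lambda row: dist(row[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Regress blood sugar (response) on age (explanatory variable).
ages = [features[0] for features, _ in TRAIN]
sugars = [features[1] for features, _ in TRAIN]
intercept, slope = fit_simple_linear_regression(ages, sugars)

# Classify a new, made-up person: age 55, blood sugar 140, fasting.
prediction = knn_predict(TRAIN, (55, 140, 1))
```

Multiple Linear Regression extends the first function to several explanatory variables at once, and a Decision Tree replaces the distance-based vote with a sequence of threshold splits on the attributes.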

STEM Fields