Date of Last Revision

2023-05-03 05:05:11



Degree Name

Bachelor of Science

Date of Expected Graduation

Spring 2018


Big data is a term that has come into the spotlight for companies in recent years. Data analysis and business intelligence have become prominent functions within companies and agencies. But what is big data? How has it impacted large companies and agencies? Why must it be embraced?

The best way to approach a big data set is to establish a question to answer. For this data set, the question is “What variables cause a loss to occur?” To answer it, we must first understand what is meant by a “loss” and look at the kind of data we are working with. The data for this project is live, or active, insurance data from National Interstate Insurance, which offered the data set for this project as a way to get a head start on statistical analysis. So far the data set has been analyzed only for this project; data analysts will revisit it in the future for further assessment.

The program used for this project is SPSS. It is one of many tools that companies use to build decision tree models, which display data in an easy-to-navigate form. In SPSS, the decision trees are built with a feature that offers a few algorithmic options, known as CHAID and CART. Both algorithms produce a decision tree showing how variables impact the outcome.
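To illustrate how a tree algorithm decides which variable to split on, here is a minimal Python sketch of a single CART-style split, assuming Gini impurity as the splitting criterion. It is not the SPSS procedure used in the project, and it does not cover CHAID, which relies on chi-square tests instead. The records, the variable names (`driver_age`, `vehicle_age`), and the helper functions are hypothetical and do not come from the National Interstate Insurance data.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the group is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels, feature):
    """Find the threshold on one numeric feature that minimizes weighted Gini.

    Returns (threshold, weighted_gini). The threshold is None when no
    candidate split improves on the impurity of the unsplit data.
    """
    best = (None, gini(labels))
    values = sorted({row[feature] for row in rows})
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
        right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best


# Synthetic, made-up records for illustration only (not real insurance data).
rows = [
    {"driver_age": 22, "vehicle_age": 3},
    {"driver_age": 25, "vehicle_age": 10},
    {"driver_age": 31, "vehicle_age": 2},
    {"driver_age": 45, "vehicle_age": 8},
    {"driver_age": 52, "vehicle_age": 1},
    {"driver_age": 60, "vehicle_age": 12},
]
labels = ["loss", "loss", "loss", "no loss", "no loss", "no loss"]

# Compare how well each variable separates "loss" from "no loss".
for feature in ("driver_age", "vehicle_age"):
    threshold, score = best_split(rows, labels, feature)
    print(feature, threshold, round(score, 3))
```

A full tree repeats this search within each resulting group until a stopping rule is met. The variable with the lowest weighted impurity at a given step is the one the tree would display as the split at that node.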

Research Sponsor

Mark Fridline

First Reader

Richard Einsporn

Second Reader

Nao Mimoto


