College

College of Engineering and Polymer Science

Date of Last Revision

2021-04-25 18:46:26

Major

Computer Science

Honors Course

Senior Honors Project in Computer Science

Number of Credits

3

Degree Name

Bachelor of Science in Computer Science

Date of Expected Graduation

Spring 2021

Abstract

Source code comment classification is an important problem for future machine learning solutions, in particular supervised machine learning solutions whose data labels are largely subjective and difficult to obtain. Machine learning problems are problems largely because of a lack of data. In machine learning solutions, it is better to have a large amount of mediocre data than a small amount of good data. While the mediocre data might not produce the best accuracy, it produces the best results because there is much more data to learn from.

In this project, data was collected from comments in student source code from computer science classes. This data was then sorted with various tools in order to automate source code comment classification. Various data categorization and sorting methods were explored, ultimately resulting in a process in which the assigned letter grade was used as the sorting label. Two Python tools, CommentLabeler and SortAndUnique, were developed to automate the manual source code labeling process. State retention and error checking were also added to streamline the process further.

The most important takeaway from this experience was that the amount of data is much more important than its quality. In fact, mediocre data will provide better results with regard to machine learning because it leaves room for improvement and demonstrates that machine learning is a viable solution.

Research Sponsor

Dr. Michael L. Collard

First Reader

Yingcai Xiao

Second Reader

Zhong-Hui Duan

Honors Faculty Advisor

Zhong-Hui Duan
