Publication Date

Spring 2015

Document Type

Project Summary

Degree Name

Master of Science


Computer Science

First Advisor

(Clare) Xueqing Tang, Ph.D.

Second Advisor

Soon-Ok Park, Ph.D.

Third Advisor

Kong-Cheng Wong, Ph.D.


The objective of this project is to collect high-volume, high-velocity clickstream data from U.S. Government websites and to store it cost-effectively for analysis that supports better insight and decision making. I expect to learn how to process this data the way an engineer would. Many tools are available, including MapReduce, Pig, streaming, and others, but for a given business case it is important to know which of them should be used to achieve the objective. In brief, this is what I expect to learn.

The Hadoop ecosystem, the state of the art in the Big Data age, is well suited to clickstream data analysis. Achieving the stated objective requires scalable, low-cost systems that can operate at high speed and surface useful insights. Hadoop meets these requirements.


Co-authored capstone with authors listed in alphabetical order by OPUS staff.