Academic Catalog

Data Science (DATA)

DATA 20000   - Introduction to Data Science  (3)  
This course provides an introduction to the field of data science. It examines problems that can be solved by data scientists. The course presents the data science methodology of solving problems, including data gathering, preprocessing, modeling, evaluation, and visualization. Students will use standard data science tools and be introduced to programming concepts.
DATA 23500   - Programming for Data Analysis  (3)  
Disciplines and industries are collecting increasing amounts of data to help guide their work. This course presents programming techniques for working with large data sets. It teaches computer programming from the perspective of developing tools to analyze data.
Prerequisite: (DATA 20000 (may be taken concurrently) or CPSC 20000 (may be taken concurrently) or ECEN 10000 (may be taken concurrently)) and (MATH 21500 (may be taken concurrently) or MATH 22000 (may be taken concurrently) or MATH 31500 (may be taken concurrently) or PSYC 30300 (may be taken concurrently) or BSAN 34900 (may be taken concurrently))  
DATA 30000   - Visualizing and Communicating Data Knowledge  (3)  
In this course, students will study effective communication of knowledge derived from data. The course also covers visualization of data for purposes of analysis and communication. Students will use standard software tools and programming libraries for visualization. The course will require writing technical reports that present the data science process and results. It also includes a discussion of ethical issues involved in data science.
Prerequisite: DATA 23500 (may be taken concurrently) or CPSC 21000 (may be taken concurrently)  
DATA 40000   - Big Data Systems  (3)  
This course covers the study of systems for storing and processing large datasets. Covered concepts include standard architectures for Big Data, use of common software frameworks, and applications to batch and real-time systems. Students will work on projects using Big Data technologies such as Hadoop, MapReduce, Hive, Spark, or NoSQL databases.
Prerequisite: CPSC 21000 (may be taken concurrently) and CPSC 33000 (may be taken concurrently)  
DATA 47100   - Machine Learning  (3)  
This course studies programs that use experience to improve their performance at solving a variety of tasks such as classification, regression, or clustering. Topics include supervised and unsupervised learning, reinforcement learning, parametric and non-parametric methods, ensemble learning, and an introduction to computational learning theory. Students will learn how to evaluate the performance of machine learning methods and how to utilize the techniques in various applications.
Prerequisite: CPSC 21000 (may be taken concurrently) and (MATH 31000 (may be taken concurrently) or MATH 21000 (may be taken concurrently))  
DATA 47200   - Introduction to Data Mining  (3)  
An introduction to the concepts, techniques, and systems of data warehousing and data mining, including (1) design and implementation of data warehouses and on-line analytical processing (OLAP) systems, and (2) data mining concepts, methods, systems, implementations, and applications.
Prerequisite: MATH 21000 (may be taken concurrently) and CPSC 21000 (may be taken concurrently)  
DATA 49000   - Data Science Undergraduate Capstone Project  (3)  
In this course, students will work in teams to develop a data-driven solution for a real-world problem using data science methods, will document their work in a scholarly report, and will present their methodology and results to faculty and peers. Students will identify appropriate project topics with the help of the faculty, research appropriate current methods and technologies, then apply them to find a solution. The results will be presented in the form of a technical report and an oral presentation. Additionally, this course will cover topics in professional ethics, intellectual property, privacy, and professional communication.
Prerequisite: DATA 30000 (may be taken concurrently) and DATA 40000 (may be taken concurrently) and (DATA 47100 (may be taken concurrently) or DATA 47200 (may be taken concurrently))  
Program Restrictions: Must be enrolled in the following Program: Data Science.  
Class Restrictions: Must be in the following Class: Senior.  
Attributes: Advanced Writing, Experiential Learning Gen Ed