Introduction to Machine Learning

A Deep Learning Approach

CSI 5325 | Spring 2021

Canvas Shell

Course Description

This course is an introduction to the major problems, techniques, and issues of learning from data. Emphasis is placed upon the topics of machine learning and problem solving. Machine learning is a rapidly growing field in which we create models that let computers learn to make inferences or decisions based on the data all around us, and even in its absence. The Python language will be used to illustrate various machine learning techniques.

Prerequisites:

This course requires proficiency in the basic areas of computer science, but it also makes use of other important subjects in the area of mathematics: probability, statistics, calculus, linear algebra, optimization, etc. Please make sure you feel confident in all these subjects prior to starting this course.

Note:

You will be submitting data to Kaggle. You will need to create an account upon registration for the course.

Credit Hours: 3


Not sure if this course is right for you?

To find out, browse Kaggle Competitions and see whether any of the featured competitions pose problems that challenge your intellect and spark your desire to learn how computers can solve such problems. If you are intrigued, this course might be right for you.

About Kaggle

Kaggle is the world's largest community of data scientists. They compete with each other to solve complex data science problems, and the top competitors are invited to work on the most interesting and sensitive business problems from some of the world's biggest companies through competitions.

Kaggle provides cutting-edge data science results to companies of all sizes. They have a proven track record of solving real-world problems across a diverse array of industries, including life sciences, financial services, energy, information technology, and retail.

Professor's dream

Dr. Rivas dreams that at least one team of students from Baylor will join a Kaggle competition and perform extremely well. He wants his students to represent Baylor proudly among other scientists and students. Dr. Rivas wants Baylor to be known as one of the best schools of computer science and machine learning in the U.S. Are you in?

Textbooks

[LFD] Y. S. Abu-Mostafa et al., Learning From Data.
The printed book contains most of the material required for this course; the rest has been provided online by the authors here. You are responsible for having all the material available to you.

[DL4B] P. Rivas, Deep Learning for Beginners.
This book contains introductory deep learning models and code in Python. You may find it useful for starting your project earlier.

The professor also recommends the following books for further reference:

Other resources include:

Objectives

At the completion of this course, students will be able to:

Numbers in square brackets indicate the specific goals of the School of Engineering and Computer Science that are being fulfilled.

Schedule

The weekly coverage might change as it depends on the progress of the class. However, you must keep up with the reading assignments.

Week  Date         Content
1     1/17 - 1/23  Introduction to Artificial Intelligence and Machine Learning.
                   Reading assignment: [LFD] Chapter 1.1. [DL4B] Chapter 1.
                   Homework 0 and Semester Project assigned.
                   Watch the welcome and first lecture here.
2     1/24 - 1/30  The Science of Learning Algorithms.
                   Reading assignment: [LFD] Chapters 1.2, 1.3, and 1.4. [DL4B] Chapters 2 and 3.
                   Homework 1 assigned.
3     1/31 - 2/6   Generalization Bounds and The VC Dimension.
                   Reading assignment: [LFD] Chapter 2.1.
4     2/7 - 2/13   Achieving Generalization.
                   Reading assignment: [LFD] Chapters 2.2 and 2.3.
                   Homework 2 assigned.
                   Project proposal due.
                   Watch the ice lecture here.
5     2/14 - 2/20  Linear Models for Classification and Regression.
                   Reading assignment: [LFD] Chapters 3.1 and 3.2. [DL4B] Chapters 4 and 5.
                   Homework 3 assigned.
                   Watch the refugee lecture here.
6     2/21 - 2/27  Logistic Regression and The Z Space.
                   Reading assignment: [LFD] Chapters 3.3 and 3.4.
7     2/28 - 3/6   The Problem of Overfitting.
                   Reading assignment: [LFD] Chapter 4.1.
                   Midterm Review.
                   Homework 4 assigned.
      3/5          Midterm Exam During Class.
8     3/7 - 3/13   Regularization and Model Validation.
                   Reading assignment: [LFD] Chapters 4.2 and 4.3.
9     3/14 - 3/20  Nearest Neighbor, Radial Basis, and k-means.
                   Reading assignment: [LFD] Chapters 6.1, 6.2, and 6.3.
10    3/21 - 3/27  Traditional Neural Networks.
                   Reading assignment: [LFD] Chapters 7.1, 7.2, 7.3, 7.4, and 7.5. [DL4B] Chapter 6.
                   Homework 5 assigned.
11    3/28 - 4/3   State-of-the-Art Neural Networks.
                   Reading assignment: [LFD] Chapter 7.6. [DL4B] Chapters 7, 8, and 11.
                   Project milestone due.
      4/2 - 4/4    No class. Easter.
12    4/4 - 4/10   Support Vector Machines.
                   Reading assignment: [LFD] Chapter 8.
13    4/11 - 4/17  Variational Autoencoders.
                   Reading assignment: [DL4B] Chapter 9.
14    4/18 - 4/24  Recurrent Neural Networks.
                   Reading assignment: [DL4B] Chapter 13.
                   Final project writeup due.
                   Student Project Presentations.
15    4/25 - 5/1   Generative Adversarial Networks.
                   Reading assignment: [DL4B] Chapter 14.
                   Student Project Presentations.
      5/3          Final Exam at 4:30pm.

Policies

Grade Distribution

Grades will be assigned based on the following breakdown:
Homework assignments: 40%
Final project: 20%
Midterm exam: 20%
Final exam: 20%

Letter Grade Distribution

Final letter grades will be assigned at the discretion of the instructor, but here is a minimum guideline for letter grades:
A: 100-95, A-: <95-90
B+: <90-87, B: <87-83, B-: <83-80
C+: <80-77, C: <77-73, C-: <73-70
D+: <70-65, D: <65-60
F: <60
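
To see how the breakdown and the guideline combine, here is a minimal Python sketch. It is an illustration only, not the official grading procedure: final letter grades remain at the discretion of the instructor. It assumes every component is scored on a 0-100 scale and that the lower bound of each letter range is inclusive. The names WEIGHTS, CUTOFFS, weighted_score, and letter_grade, as well as the sample scores, are hypothetical.

```python
# Weights taken from the grade breakdown (as fractions of the final grade).
WEIGHTS = {"homework": 0.40, "project": 0.20, "midterm": 0.20, "final": 0.20}

# Lower bounds from the minimum guideline, highest first.
# Assumption: each lower bound is inclusive (e.g., exactly 90 earns an A-).
CUTOFFS = [
    (95, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (65, "D+"), (60, "D"),
]


def weighted_score(scores):
    """Combine component scores (each on a 0-100 scale) into one final score."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def letter_grade(score):
    """Return the first letter whose lower bound the score meets, else F."""
    for lower, letter in CUTOFFS:
        if score >= lower:
            return letter
    return "F"


# Hypothetical student scores, for illustration only.
example = {"homework": 92, "project": 88, "midterm": 85, "final": 90}
final_score = weighted_score(example)
print(round(final_score, 1), letter_grade(final_score))
```

Because homework carries twice the weight of any other component, a strong homework average moves the final score more than any single exam does.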

Course Policies

General

Grades

Attendance and Absences


Academic Honesty

Introduction

In addition to skills and knowledge, Baylor University aims to teach students appropriate ethical and professional standards of conduct. The Academic Honesty Policy exists to inform students and faculty of their obligations in upholding the highest standards of professional and ethical integrity. All student work is subject to the Academic Honesty Policy. Professional and academic practice provides guidance about how to properly cite, reference, and attribute the intellectual property of others. Any attempt to deceive a faculty member, or to help another student to do so, will be considered a violation of this standard.

Instructor's Intended Purpose

The student's work must match the instructor's intended purpose for an assignment. While the instructor will establish the intent of an assignment, each student must clarify outstanding questions of that intent for a given assignment.

Unauthorized/Excessive Assistance

The student may not give or get any unauthorized or excessive assistance in the preparation of any work.

Authorship

The student must clearly establish authorship of a work. Referenced work must be clearly documented, cited, and attributed, regardless of media or distribution. Even in the case of work licensed as public domain or Copyleft (see http://creativecommons.org/), the student must provide attribution of that work in order to uphold the standards of intent and authorship.

Declaration

Submitting online, or placing one's name on, an exam, assignment, or any other course document is a statement of academic honor that the student has not received or given inappropriate assistance in completing it and that the student has complied with the Academic Honesty Policy in that work.

Consequences

An instructor may impose a sanction on the student that varies depending upon the instructor's evaluation of the nature and gravity of the offense. Possible sanctions include, but are not limited to, the following: (1) Require the student to redo the assignment; (2) Require the student to complete another assignment; (3) Assign a grade of zero to the assignment; (4) Assign a final grade of "F" for the course; and (5) Notify the Dean of the School of Computer Science and Mathematics about the issue. A student may appeal these decisions according to the Academic Grievance Procedure. (See the relevant section in the Student Handbook.) Multiple violations of this policy will result in a referral to the Conduct Review Board for possible additional sanctions.

To Conclude

Dr. Rivas takes academic honesty very seriously; after all, he also teaches ethics. Many studies, including one by Sheilah Maramark and Mindi Barth Maline, have suggested that "some students cheat because of ignorance, uncertainty, or confusion regarding what behaviors constitute dishonesty" (Maramark and Maline, Issues in Education: Academic Dishonesty Among College Students, U.S. Department of Education, Office of Research, August 1993, page 5). In an effort to reduce misunderstandings, here is a minimal list of activities that will be considered cheating in this class:

Data for Research Disclosure

Any and all results of in-class and out-of-class assignments and examinations are data sources for research and may be used in published research. All such use will always be anonymous.

About The Professor

Pablo Rivas is a senior member of the IEEE and ACM. His degrees are in computer science (B.S. '03), electrical engineering (M.S. '07), and electrical and computer engineering (Ph.D. '11 from the University of Texas at El Paso). Currently, he is an Assistant Professor of Computer Science at Baylor. Before that, he was at Marist College in New York.

At Marist College he had the opportunity to work on different aspects of machine learning, data science, big data, and large-scale pattern recognition. Perhaps you have heard on NPR about Dr. Greg Hamerly's project on the detection of leukocoria (see leuko.net for more info), on which Dr. Rivas collaborated; the project used deep learning and image-processing techniques, which he loves. Another recent research project originated after an internship at NASA Goddard Space Flight Center, where he worked on the detection of a particular kind of atmospheric particle using different machine learning methods. He currently works to make that remote sensing project available online in real time.

In the past he worked in industry as a software engineer for about eight years; thus, he is quite familiar with programming languages, particularly C++, Java, and Python, and he has also used MATLAB.

He has been recognized for his creativity and academic excellence; for instance, he received the IEEE Student Enterprise Award in 2007 and the Research Excellence Award from the Paul L. Foster Health Sciences School of Medicine of Texas Tech University in 2009. In 2011, he had the honor of being inducted into the International Honor Society Eta Kappa Nu (HKN).

When he is not having fun doing research or teaching, he also likes to play basketball, code, eat pizza with friends, or go to the movie theater with his beautiful wife Nancy. The pandemic has affected some of these activities.


How to Contact The Professor

Dr. Rivas' office number is 304 @ Cashion, and office hours are:

Office hours are online and you can join here.


He is glad to talk to students during and outside of office hours. If you can't attend office hours, please make an appointment for another time, or just stop by if you see the door open (unlikely during the pandemic). If you are going to stop by, it is a good idea to check his schedule and call first to make sure he is not busy; the number is (254) 710-3385.


If extra help is needed, there are private tutors available at the student's expense at Baylor's Office of Academic Support Programs.


Note: Any student who needs learning accommodations should inform Dr. Rivas immediately at the beginning of the semester. The student is responsible for obtaining appropriate documentation and information regarding needed accommodations from the Office of Access and Learning Accommodation (OALA), available online here, and providing it to the professor early in the semester.


Some content taken from a syllabus by Greg Hamerly and used with permission.
Pablo Rivas © 2020. All rights reserved.