Assignments for CS 506

Assignment 0: Add 2 numbers (Dummy Assignment)

Description: Write a Python script that adds two numbers and prints their sum to the command line.
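A minimal sketch of such a script, assuming the two numbers are hard-coded rather than read from user input (the function name `add` is illustrative, not taken from the assignment):

```python
def add(a, b):
    """Return the sum of two numbers."""
    return a + b


if __name__ == "__main__":
    # Hypothetical inputs; the real script may read them from arguments.
    print(add(2, 3))  # prints 5
```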

GitHub Link: GitHub Link

Assignment 1: Data Collection and Analysis

Description: In this assignment, we as a class will collectively record a dataset of elevator arrival times on the ground floor of CDS. Using this dataset, we will determine the best location to wait in order to minimize the expected walking distance to the next arriving elevator.
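One way to reason about the waiting location, sketched under simplifying assumptions: the elevator doors lie along a single line, each elevator's chance of arriving next is proportional to its recorded arrival count, and walking distance is the absolute difference in position. Under those assumptions the expected distance is minimized at the weighted median of the door positions. The positions and counts below are made up for illustration and are not the class dataset.

```python
def expected_distance(wait_pos, positions, counts):
    """Expected walking distance from wait_pos, weighting each door by its count."""
    total = sum(counts)
    return sum(c * abs(wait_pos - p) for p, c in zip(positions, counts)) / total


def best_wait_position(positions, counts):
    """Weighted median of door positions, which minimizes expected |distance|."""
    total = sum(counts)
    running = 0
    for p, c in sorted(zip(positions, counts)):
        running += c
        if running * 2 >= total:
            return p


# Hypothetical door positions (meters along the lobby) and arrival counts.
positions = [0.0, 2.5, 5.0, 7.5]
counts = [12, 30, 9, 20]

best = best_wait_position(positions, counts)
print(best, round(expected_distance(best, positions, counts), 3))
```

With these made-up numbers the weighted median is the door at 2.5 m, since the first two doors together account for more than half of all arrivals.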

GitHub Link: GitHub Link

Assignment 2: KMeans Clustering Visualization Webpage

Description: In this assignment, you will develop an interactive web application that demonstrates the KMeans clustering algorithm using various initialization methods. This project allows you to explore the impact of different initialization strategies on the clustering outcome.
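To illustrate how the initialization strategy feeds into the result, here is a small pure-Python sketch that compares uniformly random initialization with k-means++ seeding. The source does not list which initialization methods the assignment covers, so these two are assumptions, as are the function names and the synthetic data.

```python
import random


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_random(points, k, rng):
    """Pick k distinct data points uniformly at random."""
    return rng.sample(points, k)


def init_kmeans_pp(points, k, rng):
    """k-means++ seeding: favor points far from the centers chosen so far."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        weights = [min(sq_dist(p, c) for c in centers) for p in points]
        centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers


def kmeans(points, centers, iters=20):
    """Lloyd's algorithm: alternate assignment and mean-update steps."""
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:
            idx = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            clusters[idx].append(p)
        # An empty cluster keeps its previous center.
        centers = [
            tuple(sum(col) / len(c) for col in zip(*c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers


def inertia(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


rng = random.Random(0)
blobs = [(0, 0), (5, 5), (0, 5)]  # hypothetical cluster means
points = [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)) for cx, cy in blobs for _ in range(30)]

for name, init in [("random", init_random), ("k-means++", init_kmeans_pp)]:
    final = kmeans(points, init(points, 3, rng))
    print(name, round(inertia(points, final), 2))
```

Running the loop with different seeds shows how often each strategy ends in a poor local optimum, which is the kind of comparison the web application makes visible.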

GitHub Link: GitHub Link

Assignment 3: Singular Value Decomposition

Description: In this assignment, you will implement the Singular Value Decomposition (SVD) algorithm from scratch. You will then use the SVD to perform dimensionality reduction on a dataset, explore how the number of dimensions affects the performance of a classifier, and visualize your results for comparison and analysis.
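As a rough illustration of what "from scratch" can look like, the sketch below recovers only the largest singular value and its singular vectors by power iteration on A^T A. This is one possible approach, not necessarily the method the assignment expects, and a full SVD would repeat the step on the deflated matrix. All names and the example matrix are illustrative.

```python
import math
import random


def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def norm(x):
    return math.sqrt(sum(v * v for v in x))


def top_singular_triplet(A, iters=200, seed=0):
    """Return (u, sigma, v) for the largest singular value of A."""
    rng = random.Random(seed)
    At = [list(col) for col in zip(*A)]
    # Random start so we are not accidentally orthogonal to the top direction.
    v = [rng.uniform(-1, 1) for _ in A[0]]
    for _ in range(iters):
        w = matvec(At, matvec(A, v))  # one power-iteration step on A^T A
        n = norm(w)
        v = [x / n for x in w]
    Av = matvec(A, v)
    sigma = norm(Av)
    u = [x / sigma for x in Av]
    return u, sigma, v


A = [[4.0, 0.0], [3.0, -5.0]]  # singular values are sqrt(40) and sqrt(10)
u, sigma, v = top_singular_triplet(A)
print(round(sigma, 4))  # 6.3246
```

Projecting each data row onto the leading right singular vectors is what gives the reduced-dimension representation used downstream by the classifier.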

GitHub Link: GitHub Link

Demo Link: GitHub Link

Assignment 4: Latent Semantic Analysis (LSA) Search Engine Webpage

Description: In this assignment, you will develop an interactive web application that implements a basic search engine using Latent Semantic Analysis (LSA). The search engine will take a user’s query, perform LSA on a pool of documents, and return the top documents ranked by cosine similarity.
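The ranking step can be sketched as follows, assuming the query and the documents have already been projected into the same low-dimensional LSA space. The document names and vectors here are invented for illustration; the projection itself (term-document matrix plus truncated SVD) is not shown.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0 or nb == 0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (na * nb)


def top_documents(query_vec, doc_vecs, k=5):
    """Return up to k (name, score) pairs, highest cosine similarity first."""
    scored = [(cosine_similarity(query_vec, vec), name) for name, vec in doc_vecs.items()]
    scored.sort(reverse=True)
    return [(name, score) for score, name in scored[:k]]


# Hypothetical 2-D LSA coordinates for three documents and one query.
doc_vecs = {"doc_a": [1.0, 0.0], "doc_b": [0.0, 1.0], "doc_c": [1.0, 1.0]}
query_vec = [1.0, 0.1]

for name, score in top_documents(query_vec, doc_vecs, k=3):
    print(name, round(score, 3))
```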

GitHub Link: GitHub Link

Demo Link: GitHub Link

Assignment 5: K-Nearest Neighbors Kaggle Competition

Description: In this assignment, you will implement a K-Nearest Neighbors (KNN) model from scratch to predict customer churn for a bank. Your goal is to identify customers who are likely to leave the bank based on historical data and submit your predictions in a mini Kaggle competition. You are provided with a dataset and starter code to help you get started. Your task is to preprocess the data, implement KNN from scratch, train and evaluate the model, and tune its hyperparameters. Once your model is optimized, you will submit your predictions for ranking on Kaggle.
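The core of a from-scratch KNN classifier is short. The sketch below assumes Euclidean distance, majority voting, and features that have already been scaled to comparable ranges; the tiny dataset and the label convention (1 = churned) are hypothetical and do not come from the competition data.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Predict the label of x by majority vote among its k nearest neighbors."""
    dists = sorted((math.dist(row, x), label) for row, label in zip(train_X, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Hypothetical scaled features, e.g. (age, balance); label 1 means churned.
train_X = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9), (0.85, 0.75)]
train_y = [0, 0, 1, 1, 1]

print(knn_predict(train_X, train_y, (0.15, 0.15)))  # 0
print(knn_predict(train_X, train_y, (0.8, 0.8)))    # 1
```

The value of `k` and the distance metric are the obvious hyperparameters to tune on a held-out validation split.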

GitHub Link: GitHub Link

CS506 Midterm Fall 2024

Description: The goal of this competition is to predict the star rating associated with user reviews from Amazon Movie Reviews using the available features.

GitHub Link: GitHub Link

Assignment 6: Linear Regression

Description: In this assignment, you'll explore the impact of changing parameters on linear regression. The goal is to create an interactive webpage to demonstrate how modifying these parameters affects regression results, especially when there is no actual relationship between X and Y. By tweaking these settings, you’ll observe how randomness can influence the slope and intercept in a regression model.
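The effect described above can be reproduced in a few lines. This sketch assumes X is drawn uniformly from [0, 1] and Y is drawn independently of X from a normal distribution; the parameter names `n`, `mu`, and `sigma2` are guesses at the kind of settings the webpage exposes, not names confirmed by the assignment.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def simulate(n, mu, sigma2, rng):
    """Generate X and Y with no true relationship between them."""
    xs = [rng.random() for _ in range(n)]
    ys = [mu + rng.gauss(0, sigma2 ** 0.5) for _ in range(n)]
    return xs, ys


rng = random.Random(0)
for trial in range(3):
    xs, ys = simulate(n=50, mu=0.0, sigma2=1.0, rng=rng)
    slope, intercept = fit_line(xs, ys)
    print(trial, round(slope, 3), round(intercept, 3))
```

Even though the true slope is zero by construction, each trial yields a different nonzero estimate, and the spread grows as `n` shrinks or `sigma2` grows.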

GitHub Link: GitHub Link

Demo Link: GitHub Link

Assignment 7: Hypothesis Testing and Confidence Intervals in Linear Regression

Description: In this assignment, you’ll extend your work from Assignment 6 to include hypothesis testing and confidence intervals through simulations. You’ll enhance your interactive webpage to allow users to perform hypothesis tests on the slope or intercept of the regression line and generate confidence intervals based on simulations.
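A simulation-based test on the slope might look like the sketch below. It assumes the null hypothesis is "Y is independent of X", builds the null distribution of the slope by repeated simulation, and reports a two-sided p-value for a hypothetical observed slope. The percentile interval over simulated slopes is one simple stand-in for a simulation-based interval; the assignment may define its intervals differently.

```python
import random


def slope_of(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx


def simulate_null_slopes(n, sims, rng):
    """Slopes fitted to datasets where Y is generated independently of X."""
    slopes = []
    for _ in range(sims):
        xs = [rng.random() for _ in range(n)]
        ys = [rng.gauss(0, 1) for _ in range(n)]
        slopes.append(slope_of(xs, ys))
    return slopes


def two_sided_p_value(observed, null_values):
    """Fraction of simulated values at least as extreme as the observed one."""
    extreme = sum(abs(v) >= abs(observed) for v in null_values)
    return extreme / len(null_values)


def percentile_interval(values, level=0.95):
    """Central interval covering roughly `level` of the simulated values."""
    ordered = sorted(values)
    lo_idx = int((1 - level) / 2 * (len(ordered) - 1))
    hi_idx = int((1 + level) / 2 * (len(ordered) - 1))
    return ordered[lo_idx], ordered[hi_idx]


rng = random.Random(0)
null_slopes = simulate_null_slopes(n=50, sims=1000, rng=rng)
observed = 0.8  # hypothetical slope from the user's dataset
print(two_sided_p_value(observed, null_slopes))
print(percentile_interval(null_slopes))
```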

GitHub Link: GitHub Link

Demo Link: GitHub Link

Assignment 8: Logistic Regression

Description: In this assignment, you'll explore the effect of shifting clusters in a dataset on the parameters of a logistic regression model. You will implement parts of the code to generate datasets with shifted clusters, fit a logistic regression model and extract its parameters, visualize the data, decision boundary, and logistic regression results, and analyze how these parameters change with increasing shift distances.
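The generate-fit-analyze loop can be sketched in pure Python as below. The assumptions are two 2-D Gaussian clusters with unit variance, the second one shifted by `shift` along both axes, and a model fitted by plain batch gradient descent; the actual assignment may use a library solver and a different data layout.

```python
import math
import random


def make_clusters(shift, n, rng):
    """n points per class; class 1 is class 0 shifted by `shift` on both axes."""
    X, y = [], []
    for _ in range(n):
        X.append((rng.gauss(0, 1), rng.gauss(0, 1)))
        y.append(0)
        X.append((rng.gauss(shift, 1), rng.gauss(shift, 1)))
        y.append(1)
    return X, y


def sigmoid(z):
    # Numerically stable form that avoids overflow for large |z|.
    if z >= 0:
        return 1 / (1 + math.exp(-z))
    e = math.exp(z)
    return e / (1 + e)


def fit_logistic(X, y, lr=0.3, epochs=1000):
    """Batch gradient descent on the mean log loss; returns (b0, b1, b2)."""
    b0 = b1 = b2 = 0.0
    n = len(X)
    for _ in range(epochs):
        g0 = g1 = g2 = 0.0
        for (x1, x2), t in zip(X, y):
            err = sigmoid(b0 + b1 * x1 + b2 * x2) - t
            g0 += err
            g1 += err * x1
            g2 += err * x2
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
        b2 -= lr * g2 / n
    return b0, b1, b2


def accuracy(params, X, y):
    b0, b1, b2 = params
    hits = sum((b0 + b1 * x1 + b2 * x2 > 0) == (t == 1) for (x1, x2), t in zip(X, y))
    return hits / len(X)


rng = random.Random(0)
for shift in (1.0, 2.0, 3.0):
    X, y = make_clusters(shift, 100, rng)
    params = fit_logistic(X, y)
    print(shift, [round(p, 2) for p in params], round(accuracy(params, X, y), 3))
```

The decision boundary is the line where `b0 + b1 * x1 + b2 * x2 = 0`, so tracking the three parameters across shift values shows how the boundary moves and sharpens.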

GitHub Link: GitHub Link

Demo Link: GitHub Link

Assignment 9: Neural Networks

Description: In this assignment, you will implement and analyze a simple neural network by visualizing its learned features, decision boundary, and gradients. The goal is to develop a deeper understanding of how a Feedforward Neural Network with one hidden layer operates and represents the input space during learning.
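For intuition about the gradients being visualized, here is a small sketch of a network with one hidden layer: the forward pass, the backpropagated gradients for a single example, and the loss they are derived from. The tanh hidden activation, sigmoid output, binary cross-entropy loss, and layer sizes are all assumptions, since the description does not specify them.

```python
import math
import random


def forward(x, W1, b1, W2, b2):
    """Return hidden activations h and output probability p."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(W1, b1)]
    z = sum(w * hi for w, hi in zip(W2, h)) + b2
    return h, 1 / (1 + math.exp(-z))


def loss(x, t, W1, b1, W2, b2):
    """Binary cross-entropy for one example with target t in {0, 1}."""
    _, p = forward(x, W1, b1, W2, b2)
    return -(t * math.log(p) + (1 - t) * math.log(1 - p))


def gradients(x, t, W1, b1, W2, b2):
    """Backpropagation for one example; returns (dW1, db1, dW2, db2)."""
    h, p = forward(x, W1, b1, W2, b2)
    dz = p - t                                   # gradient at the output pre-activation
    dW2 = [dz * hi for hi in h]
    dh = [dz * w * (1 - hi * hi) for w, hi in zip(W2, h)]  # through tanh
    dW1 = [[d * xi for xi in x] for d in dh]
    return dW1, dh, dW2, dz


rng = random.Random(0)
W1 = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(3)]  # 2 inputs, 3 hidden units
b1 = [0.0, 0.0, 0.0]
W2 = [rng.uniform(-1, 1) for _ in range(3)]
b2 = 0.0

x, t = [0.5, -1.0], 1  # hypothetical training example
dW1, db1, dW2, db2 = gradients(x, t, W1, b1, W2, b2)
print(round(loss(x, t, W1, b1, W2, b2), 4), [round(g, 4) for g in dW2])
```

The hidden activations `h` are the learned features, and the magnitudes in `dW1` and `dW2` are the per-weight gradients that a visualization could map to edge thickness or color.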

GitHub Link: GitHub Link

Assignment 10: Image Search

Description: Implement a simplified version of Google Image Search.

GitHub Link: GitHub Link

Demo Link: GitHub Link

CS506 Extra Credit (Kaggle Competition)

Description: In this competition, your task is to develop a predictive model to identify potentially fraudulent transactions using various machine learning algorithms. Fraud detection is a critical challenge in many industries, especially finance, where identifying fraudulent activities can save organizations from significant financial losses. By accurately predicting whether a transaction is fraudulent, institutions can take preventive measures to secure their operations.

GitHub Link: GitHub Link