Date of Award

2025

Document Type

Thesis

Degree Name

Master of Science (MS)

Department

Computer Science

Committee Chair

Tathagata Mukherjee

Committee Member

Jacob Hauenstein

Committee Member

Farbod Fahimi

Research Advisor

Tathagata Mukherjee

Subject(s)

Pattern recognition systems, Machine learning, Image processing--Digital techniques

Abstract

This thesis presents a framework for predicting future frames in videos of naturally occurring processes. The framework learns the physical laws underlying the process by exploiting spatiotemporal features extracted from the video frames. It defines windows of frames sampled at varying time intervals, which we call dilations, and uses them to predict future frames with a 3D convolutional neural network (CNN). The network accepts six time-dilated windows of input video frames and computes spatiotemporal features for each window, keeping the features of each window separate across three phases of the network and aggregating them only at the end to compute the final predictions. We tested the proposed framework on two datasets: a video of ocean waves and a video of clouds in the sky.
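
As a rough illustration of the time-dilated windows described above, the following sketch computes which frame indices each window would contain. The abstract does not state the dilation values, the number of frames per window, or how windows are aligned, so the dilations 1 through 6, the window length of 4, the alignment on the most recent frame, and the helper names `dilated_window_indices` and `build_windows` are all assumptions made for illustration, not the thesis's actual configuration.

```python
from typing import Dict, List, Sequence


def dilated_window_indices(t: int, window_len: int, dilation: int) -> List[int]:
    """Return indices of `window_len` frames ending at frame `t`,
    spaced `dilation` frames apart (hypothetical helper)."""
    start = t - (window_len - 1) * dilation
    if start < 0:
        raise ValueError("not enough past frames for this window")
    return list(range(start, t + 1, dilation))


def build_windows(t: int, window_len: int,
                  dilations: Sequence[int]) -> Dict[int, List[int]]:
    """Build one window per dilation, all ending at the same frame `t`."""
    return {d: dilated_window_indices(t, window_len, d) for d in dilations}


if __name__ == "__main__":
    # Assumed setup: six windows with dilations 1..6, four frames each.
    windows = build_windows(t=20, window_len=4, dilations=(1, 2, 3, 4, 5, 6))
    for d, idx in windows.items():
        print(f"dilation {d}: frames {idx}")
```

Under these assumptions, the window with dilation 1 covers only the most recent frames, while the window with dilation 6 reaches much further back in time with the same number of frames. Each list of indices would select the frames stacked into one input of the 3D CNN.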
