I am originally from the Chicagoland area, where I attended the Illinois Math and Science Academy. I earned my Bachelor of Science in Computer Science from the University of Texas at Austin, where I was part of the Turing Scholars Honors Program. While there, I conducted machine learning research in computer vision and audio processing under the supervision of Dr. David Harwath.
View and download my resume here. Information is also available on my LinkedIn page.
When watching live television, the audio and video can become desynchronized, and it is not immediately apparent how to correct this. Current models focus on simple audio-visual events such as people talking, lions roaring, or dogs barking, largely because there is little data tailored to real-world use cases for this problem. Since there is no standard dataset for evaluating these models, comparing them can be nearly impossible. We created a new dataset that incorporates live television to enable better comparison between models. This dataset not only allows us to benchmark models on data that represents real-world examples, but also allows us to optimize models for these real-world use cases. Along with this dataset, we extended an existing model to better classify the offset between audio and video. Our results for this model are currently inconclusive, as there was no significant increase in accuracy; however, there is much more to explore in terms of variations on this model.
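To give a feel for how this kind of task is often set up, the sketch below shows one common way to frame offset detection as classification: take an aligned clip, shift its audio by a known amount, and train a model to predict which offset bin was applied. Everything here (the offset bins, sample rate, and the `make_offset_example` helper) is a hypothetical illustration of the general technique, not the actual pipeline from this project.

```python
# A minimal, hypothetical sketch of audio-video offset detection framed as
# classification: shift the audio of an aligned clip by a known amount and
# use the offset bin as the training label.
import numpy as np

OFFSET_BINS_MS = [-200, -100, 0, 100, 200]  # assumed candidate offsets
AUDIO_SR = 16_000                           # assumed audio sample rate (Hz)

def make_offset_example(audio: np.ndarray, rng: np.random.Generator):
    """Shift `audio` by a randomly chosen offset; return (audio, label).

    A positive offset delays the audio relative to the video. The label is
    the index of the chosen bin, which a classifier would learn to predict.
    """
    label = int(rng.integers(len(OFFSET_BINS_MS)))
    shift = int(OFFSET_BINS_MS[label] * AUDIO_SR / 1000)
    shifted = np.roll(audio, shift)
    # Zero the wrapped-around region so the shift is a true delay/advance.
    if shift > 0:
        shifted[:shift] = 0.0
    elif shift < 0:
        shifted[shift:] = 0.0
    return shifted, label

rng = np.random.default_rng(0)
audio = rng.standard_normal(AUDIO_SR * 5)  # stand-in for 5 s of audio
shifted, label = make_offset_example(audio, rng)
print(f"offset label: {label} ({OFFSET_BINS_MS[label]} ms)")
```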
Click on the dropdowns to learn more about my projects.