Research Question: How can we develop a vision-language model (VLM) for classification and segmentation of remote sensing images?
I will be working with the Cornell Graphics and Vision Group to create an effective zero-shot segmentation model with an open-world vocabulary. I will start with a literature review on the topic, then move on to experimenting with data visualization. Finally, I will define a model that is effective at this task.
Student: Vipin Gunda
Advisors: Kavita Bala, Bharath Hariharan, Top Piriyakulkij