
Research

The Vision and Learning group works on a variety of cutting-edge problems in computer vision and machine learning. The exact problems vary over time. The figure below summarizes the broad research directions in the group. 

Tracking, Activity Recognition and Scene Understanding

A grid of eight security camera frames showing indoor hallways and outdoor walkways. Pedestrians walking through the spaces are highlighted and enclosed in various colored bounding boxes, demonstrating a multi-camera pedestrian tracking system.

Human Motion and Pose

A split-screen research graphic demonstrating 3D human motion capture from soccer footage. On the left, a close-up photo shows two soccer players, one in a yellow Brazil jersey labeled with the number 2, competing for the ball. On the right, the scene is decomposed into 3D wireframe models of the players, surrounded by multiple isolated camera perspectives capturing the action from different angles.

Vision Language Models

Vision and Security/Privacy

A diagram illustrating a face swap or deepfake generation process. Two source images at the top, a news anchor wearing a traditional white ghutrah and igal and a man in a dark blue sweater, are joined by a blue plus sign. An equals sign points down to a final synthesized image showing the news anchor's body with an altered facial expression matching the second man.

Learning for Robot Autonomy

A 3D simulation environment featuring a black-and-white checkered floor. A four-legged, tan-colored robotic creature sits inside a track bounded by transparent gray walls, with a red-and-gray sphere positioned further down the lane.

Image Analysis for Scientific Applications

A 3x3 grid of images displaying cellular structures segmented into multi-colored patches on a black background. Black arrows run along the top border from left to right and down the left border from top to bottom, indicating an analytical progression or variation in the image sequence.