Month 3 - Object Matching Implementation
Month three of working on LA3DG was all about object matching. This involved extracting monocular depth estimates from the downloaded imagery and its associated metadata, which I did using Depth Anything V3. Depth Anything produces a depth map for a resized version of the original image. Using this depth map, I resize and overlay the object masks to find each object's bottom-closest pixel to the camera. This lets me estimate the object's distance from the camera, as well as its angle from the center of the image. By combining that angle with the camera's heading, I can then estimate the coordinates of the bottom-closest pixel. If two objects from separate images are found to be within two meters of one another, they are a potential match.
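The geometry above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the field of view, image width, and sample coordinates are hypothetical, and it assumes a simple pinhole camera and a flat-earth approximation (reasonable at street-imagery ranges).

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius

def pixel_to_bearing(u, image_width, hfov_deg, heading_deg):
    """Absolute compass bearing of pixel column `u`.

    Pinhole model: focal length in pixels is derived from the horizontal
    field of view, so the angular offset uses atan, not a linear scale.
    """
    fx = (image_width / 2) / math.tan(math.radians(hfov_deg) / 2)
    offset_deg = math.degrees(math.atan((u - image_width / 2) / fx))
    return (heading_deg + offset_deg) % 360

def project(lat_deg, lon_deg, bearing_deg, distance_m):
    """Point `distance_m` metres from the camera along `bearing_deg`
    (flat-earth approximation)."""
    dlat = distance_m * math.cos(math.radians(bearing_deg)) / EARTH_RADIUS_M
    dlon = (distance_m * math.sin(math.radians(bearing_deg))
            / (EARTH_RADIUS_M * math.cos(math.radians(lat_deg))))
    return lat_deg + math.degrees(dlat), lon_deg + math.degrees(dlon)

def metres_apart(p, q):
    """Approximate ground distance in metres between two (lat, lon) points."""
    mean_lat = math.radians((p[0] + q[0]) / 2)
    dy = math.radians(q[0] - p[0]) * EARTH_RADIUS_M
    dx = math.radians(q[1] - p[1]) * EARTH_RADIUS_M * math.cos(mean_lat)
    return math.hypot(dx, dy)

# Hypothetical detections of (possibly) the same object from two images:
# bottom-closest pixel column, plus the depth estimate for that pixel.
a = project(40.4406, -79.9959, pixel_to_bearing(1500, 2048, 90, 45), 12.0)
b = project(40.4407, -79.9958, pixel_to_bearing(600, 2048, 90, 135), 11.5)
is_candidate_match = metres_apart(a, b) < 2.0  # two-meter threshold
```

In practice the depth at the bottom-closest pixel stands in for `distance_m`, and pairs that pass the two-meter check move on to further validation.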
Over spring break I plan to train and implement a feature-matching AI module to further validate the possible matches. I look forward to this aspect of the project because my main area of focus is AI. I feel that I can set up individual AI models and chain them together, but I have little experience actually preparing data and training AI to perform specialized tasks. This is something I am excited to learn, and I look forward to it next month and over spring break.