Month 2 - Image Gathering, Filtering, and Cleaning

By: Johnathan Coots

For month 2, I focused on gathering images via the Mapillary API, filtering out indoor images with Places365, and inpainting dynamic objects using RF-DETR image segmentation and LaMa inpainting.

Using the Mapillary API was a little difficult for me due to my inexperience with web sessions, but the documentation made the request process understandable. Essentially, you ask for the information on every image inside a bounding box defined by the minimum and maximum latitude and longitude of the desired area. The response is a list of metadata that includes the associated image IDs. You then send a follow-up request per image ID to download the images, which come back as JSON objects. The only caveats to this process are the limit on bounding-box area (0.01 square degrees, which is huge) and the maximum of 2000 images per request. I did run into some confusion with the documentation's wording about pagination of responses in relation to the 2000-image limit. Initially, I thought the requests I would be sending would require a follow-up “Next Page” request, but after carefully rereading the documentation I believe this is only necessary if a sub-user is accessing the API via my access token.

Mapillary API documentation.
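
The two-step request flow described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the helper names (`build_search_url`, `fetch_image_urls`) are hypothetical, the endpoint and the `bbox`, `limit`, and `thumb_2048_url` parameters are written from my reading of the Mapillary API v4 documentation and should be checked against it, and the bounding-box check simply mirrors the 0.01 square-degree limit mentioned above.

```python
import json
import urllib.parse
import urllib.request

GRAPH_URL = "https://graph.mapillary.com"  # assumed Mapillary API v4 root
MAX_BBOX_AREA = 0.01  # square degrees, the area limit described above
MAX_IMAGES = 2000     # maximum number of images returned per request


def bbox_area(min_lon, min_lat, max_lon, max_lat):
    """Area of the bounding box in square degrees."""
    return (max_lon - min_lon) * (max_lat - min_lat)


def build_search_url(token, min_lon, min_lat, max_lon, max_lat):
    """Build the image search URL for a bounding box, rejecting bad boxes."""
    if min_lon >= max_lon or min_lat >= max_lat:
        raise ValueError("bounding box must satisfy min < max on both axes")
    if bbox_area(min_lon, min_lat, max_lon, max_lat) > MAX_BBOX_AREA:
        raise ValueError("bounding box exceeds 0.01 square degrees; split it")
    params = {
        "access_token": token,
        "fields": "id",
        # Order assumed from the docs: minLon, minLat, maxLon, maxLat.
        "bbox": f"{min_lon},{min_lat},{max_lon},{max_lat}",
        "limit": MAX_IMAGES,
    }
    return f"{GRAPH_URL}/images?{urllib.parse.urlencode(params)}"


def get_json(url):
    """GET a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)


def fetch_image_urls(token, bbox):
    """Return {image_id: thumbnail_url} for every image found in the box."""
    search = get_json(build_search_url(token, *bbox))
    urls = {}
    for item in search["data"]:
        query = urllib.parse.urlencode(
            {"access_token": token, "fields": "thumb_2048_url"}
        )
        # Follow-up request per image ID; the JSON reply carries the image URL.
        urls[item["id"]] = get_json(f"{GRAPH_URL}/{item['id']}?{query}")[
            "thumb_2048_url"
        ]
    return urls
```

Calling `fetch_image_urls(token, (min_lon, min_lat, max_lon, max_lat))` needs a valid access token and network access; the URL builder can be checked offline.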

The Places365 implementation went smoothly due to previous experience with the repository, and I am successfully filtering indoor images from the data set.

I am currently running into some difficulties with the LaMa inpainting. While setting up the LaMa repository, I ran into package conflicts. To avoid them, I tried an alternative repository, but the results have been subpar. One potential cause is that I am now using an RF-DETR segmentation model where I previously used an Ultralytics one, but I do not believe this is the problem: I have inspected the image masks the RF-DETR model created and they seem accurate. Instead, I will return to the original LaMa model I planned on using and try to resolve the package conflicts.
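
As an illustration of the mask side of this step, the sketch below merges per-object segmentation masks into the single binary mask an inpainting model typically expects, and reports how much of the image it covers as a quick sanity check. Everything here is hypothetical: the detection format (a dict with `label` and `mask`), the list of dynamic classes, and the function names are assumptions, not the RF-DETR or LaMa APIs.

```python
# Assumed set of class names treated as dynamic objects.
DYNAMIC_CLASSES = {"person", "car", "bus", "truck", "bicycle", "motorcycle"}


def build_inpaint_mask(detections, height, width, dynamic_classes=DYNAMIC_CLASSES):
    """Union the masks of all dynamic objects into one 0/1 mask.

    detections: list of dicts with a "label" string and a "mask" given as
    a height x width nested list of 0/1 values (hypothetical format).
    """
    mask = [[0] * width for _ in range(height)]
    for detection in detections:
        if detection["label"] not in dynamic_classes:
            continue  # static objects stay in the image
        for y, row in enumerate(detection["mask"]):
            for x, value in enumerate(row):
                if value:
                    mask[y][x] = 1
    return mask


def mask_coverage(mask):
    """Fraction of pixels marked for inpainting."""
    total = sum(len(row) for row in mask)
    return sum(map(sum, mask)) / total if total else 0.0
```

A coverage value near 0 or near 1 is a cheap signal that a mask is empty or has swallowed the whole frame, either of which would produce poor inpainting regardless of the model.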

Current inpainting implemented in LA3DG.

Inpainting achieved in a previous project.

Once the inpainting has been improved, I plan on implementing the MiDaS depth estimation. This will lead into the object matching and triangulation portions of the pipeline. My major goals for month 3 are to implement the object matching, triangulation, and NeRF portions of the pipeline so that I can spend month 4 focusing on paperwork and refinement of the application.

MiDaS visualization.

