Data Collection

This work was done as part of my research assistant position at the Computer Vision and Machine Perception Lab. The task was to collect multiview images of 120 specular objects with a handheld mobile camera under good and bad lighting conditions, and then process them to obtain camera poses, 2D object masks, and the corresponding 3D bounding boxes.

Setup

The video shows the setup used for dataset capture. Larger objects were paired with a 60 mm AprilTag, and smaller objects with a 30 mm AprilTag. In addition, a few stickers with random markings were affixed to improve feature-matching consistency across frames.

Pose Estimation

We sampled 120 frames and used COLMAP to estimate the camera poses, providing it with the calibrated camera intrinsics. Because the input images are consecutive frames taken from video, we used the sequential matcher for the feature-matching stage. The initial pose estimates were unsatisfactory, however, because the features on the AprilTags closely resemble one another. Adding colored stickers subsequently improved the accuracy of the pose estimation.
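
The pipeline described above can be sketched roughly as follows. This is a minimal illustration, not the exact script used for the dataset. The evenly spaced frame sampling, the PINHOLE camera model, the workspace layout, and the helper names (sample_frame_indices, build_colmap_commands, run_colmap) are all assumptions made for the example. The COLMAP options shown are from its standard command-line interface, though option names can differ between versions.

```python
import subprocess
from pathlib import Path


def sample_frame_indices(total_frames, num_samples=120):
    """Pick evenly spaced frame indices (uniform spacing is an assumption)."""
    if total_frames <= 0:
        return []
    if total_frames <= num_samples:
        return list(range(total_frames))
    # Integer arithmetic keeps the indices strictly increasing and in range.
    return [i * total_frames // num_samples for i in range(num_samples)]


def build_colmap_commands(image_dir, workspace, fx, fy, cx, cy):
    """Return the three COLMAP CLI invocations as argument lists.

    The PINHOLE model and the database/sparse paths are illustrative choices.
    """
    workspace = Path(workspace)
    database = str(workspace / "database.db")
    sparse = str(workspace / "sparse")
    intrinsics = f"{fx},{fy},{cx},{cy}"
    return [
        # 1. Feature extraction with the calibrated intrinsics shared by all frames.
        ["colmap", "feature_extractor",
         "--database_path", database,
         "--image_path", str(image_dir),
         "--ImageReader.single_camera", "1",
         "--ImageReader.camera_model", "PINHOLE",
         "--ImageReader.camera_params", intrinsics],
        # 2. Sequential matching, suited to consecutive video frames.
        ["colmap", "sequential_matcher",
         "--database_path", database],
        # 3. Sparse reconstruction with the intrinsics held fixed.
        ["colmap", "mapper",
         "--database_path", database,
         "--image_path", str(image_dir),
         "--output_path", sparse,
         "--Mapper.ba_refine_focal_length", "0",
         "--Mapper.ba_refine_principal_point", "0",
         "--Mapper.ba_refine_extra_params", "0"],
    ]


def run_colmap(image_dir, workspace, fx, fy, cx, cy):
    """Run the three stages in order; requires the colmap binary on PATH."""
    (Path(workspace) / "sparse").mkdir(parents=True, exist_ok=True)
    for cmd in build_colmap_commands(image_dir, workspace, fx, fy, cx, cy):
        subprocess.run(cmd, check=True)
```

Building the commands separately from running them makes it easy to inspect or log the exact invocation before launching COLMAP. Disabling intrinsics refinement in the mapper is one way to make the reconstruction rely on the calibration rather than re-estimating it.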