Immersive applications such as augmented reality (AR) and virtual reality (VR) are attracting growing attention thanks to rapid advances in the field. A mobile game with AR elements or a movie watched through a VR headset offers an enhanced user experience.
Preparing 3D content for immersive multimedia experiences is a demanding task. Multi-view camera setups can capture 3D objects and produce high-quality assets, but it is not feasible to scan every object in the world this way. More practical approaches are therefore needed to extract 3D information from standard cameras.
These days, dozens or even hundreds of images are available on the internet for almost anything you can think of. There are several ways to collect a set of images of a given object, from image search results to product review photographs. What if we could use this wealth of data to build 3D objects for AR and/or VR applications?
This is the question that SAMURAI attempts to answer. Inverse rendering of an object captured under completely unknown conditions is a key open problem, and the authors propose an answer to it.
However, rendering 3D objects from unknown images is no easy task. Internet images vary widely in background, lighting, and inherent camera characteristics, so estimating 3D shape and materials from them presents several challenges. A joint optimization of shape, illumination, and pose is therefore necessary.
In 3D reconstruction, the goal is usually to estimate an object's 3D shape and its bidirectional reflectance distribution function (BRDF) properties. When working from images whose capture conditions are unknown, however, per-image illumination, camera pose, and intrinsics must be estimated as well.
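To make the scope of this joint optimization concrete, here is an illustrative sketch (not SAMURAI's actual code; all names are hypothetical) of which unknowns are shared across the collection and which must be estimated separately for every image:

```python
from dataclasses import dataclass, field
from typing import List

# Illustrative sketch only: the unknowns in a joint shape/BRDF/camera
# optimization when capture conditions are unknown. Names are hypothetical.

@dataclass
class PerImageParams:
    pose: List[float]          # 6-DoF camera rotation + translation
    focal_length: float        # per-image intrinsics
    illumination: List[float]  # e.g., coefficients of a lighting embedding

@dataclass
class SharedSceneParams:
    shape_weights: List[float]  # parameters of a neural shape/density field
    brdf_weights: List[float]   # parameters of a neural BRDF field

@dataclass
class OptimizationState:
    scene: SharedSceneParams
    per_image: List[PerImageParams] = field(default_factory=list)

state = OptimizationState(
    scene=SharedSceneParams(shape_weights=[0.0], brdf_weights=[0.0]),
    per_image=[PerImageParams(pose=[0.0] * 6, focal_length=1.0,
                              illumination=[0.0] * 16)],
)
print(len(state.per_image))  # one set of camera/lighting unknowns per image
```

The point of the sketch is that the shape and BRDF fields are shared, while pose, intrinsics, and illumination multiply with the number of input images, which is what makes the problem so under-constrained.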
Several recent studies on shape and material estimation assume fixed camera intrinsics, nearly flawless segmentation masks, and nearly accurate camera poses. These assumptions are impractical in real-world scenarios.
SAMURAI is proposed to jointly estimate shape, BRDF, and per-image camera pose and illumination. With only image collections and coarse camera pose quadrants as input, this is a severely under-constrained and complicated optimization problem. SAMURAI addresses this difficult challenge with careful choices in camera setup and optimization.
SAMURAI supports flexible camera setups at different distances by learning clipping planes for each image and defining the neural volume in global coordinates.
Additionally, since optimizing a single camera pose per image can get stuck in local minima, SAMURAI uses a camera multiplex in which multiple candidate poses are optimized per image.
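One simple way to realize such a multiplex (an illustrative sketch with hypothetical numbers, not the paper's exact scheme) is to render with each candidate pose and combine the per-candidate losses with a softmin, so gradients favor the best-fitting pose without hard-committing to it early:

```python
import numpy as np

# Sketch of the camera-multiplex idea: keep K candidate poses per image and
# weight their reconstruction losses with a softmin. Numbers are hypothetical.

def multiplex_weights(losses, temperature=0.1):
    """Softmin over per-candidate losses: lower loss -> larger weight."""
    logits = -np.asarray(losses, dtype=float) / temperature
    logits -= logits.max()                 # numerical stability
    w = np.exp(logits)
    return w / w.sum()

losses = [0.9, 0.2, 0.8, 0.5]             # hypothetical per-candidate losses
w = multiplex_weights(losses)
print(int(np.argmax(w)))  # candidate 1 (lowest loss) gets the largest weight
```

As optimization progresses, the weighting can be sharpened (lower temperature) until effectively a single pose per image remains.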
Not all input images are equally useful for optimization, since they carry different amounts of noise. SAMURAI therefore applies posterior scaling to the input images, which weights the impact each image has on the optimization.
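A minimal sketch of per-image down-weighting (an assumed formulation; the exponential weighting here is illustrative, not SAMURAI's exact rule): scale each image's contribution to the total loss by how well it currently fits the model, so noisy or mis-posed images influence the optimization less:

```python
import numpy as np

# Sketch (assumed): down-weight images with high reconstruction error so
# outlier images contribute less to the joint optimization.

def posterior_scales(per_image_losses, temperature=1.0):
    """Map per-image losses to scales; higher loss -> smaller scale."""
    scales = np.exp(-np.asarray(per_image_losses, dtype=float) / temperature)
    return scales / scales.mean()          # keep the average scale at 1

losses = np.array([0.1, 0.2, 2.5])         # hypothetical; image 2 is an outlier
scales = posterior_scales(losses)
total_loss = np.sum(scales * losses)       # scaled contribution per image
print(int(scales.argmin()))  # the outlier image gets the smallest scale
```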
SAMURAI produces 3D models for various visual applications, including AR/VR, games, and material editing. It can extract explicit meshes with a BRDF texture, making the resulting 3D models directly usable in current graphics engines. On existing datasets, SAMURAI achieves superior view synthesis and relighting results.
This was a brief summary of SAMURAI. Below are some useful links if you want to learn more about it.
This article is written as a research summary by Marktechpost Staff based on the research paper 'SAMURAI: Shape And Material from Unconstrained Real-world Arbitrary Image collections'. All credit for this research goes to the researchers on this project. Check out the paper and code.
Ekrem Çetinkaya obtained his B.Sc. in 2018 and M.Sc. in 2019 from Ozyegin University, Istanbul, Türkiye. He wrote his M.Sc. thesis on image denoising using deep convolutional networks. He is currently pursuing a Ph.D. at the University of Klagenfurt, Austria, and working as a researcher on the ATHENA project. His research interests include deep learning, computer vision, and multimedia networks.