This year was huge for me in machine learning, and in computer vision in particular. A bit more than a year ago I would never have believed that I would spend a week abroad not…
Announcing the SUMO challenge - a contest to encourage the development of algorithms for complete understanding of 3D indoor scenes from 360° RGB-D panoramas with the goal of enabling social AR and VR research and experiences.
Gibson’s underlying database of spaces includes 572 full buildings composed of 1447 floors, covering a total area of 211k m². The database is collected from real indoor spaces using 3D scanning and reconstruction. For each space, we provide the 3D reconstruction, RGB images, depth, and surface normals, and, for a fraction of the spaces, semantic object annotations. On this page you can see various visualizations for each space, including 3D dissections, exploration using a randomly controlled Husky agent, and standard point-to-point navigation episodes.
Written by Dheepan Ramanan (@dheepan_ramanan), Data Scientist, and Ivan Kopas (@ivan_kopas), Machine Learning Engineer
Last Friday, ARK Invest released a new price target for Tesla as well as an updated, open-source model. The scale of autonomous ride-hailing networks and ARK’s estimate of Tesla’s dominance emerged as the most contentious elements in the model. These components contribute nearly 50% of ARK’s $3k 2025 price target. On Twitter there has been considerable debate about the size of the Robotaxi market and Tesla’s lead in autonomous driving, with many questioning whether Tesla’s Full Self Driving (FSD) approach can be reverse-engineered and replicated by competitors.
VGG Image Annotator (VIA) is an image annotation tool that can be used to define regions in an image and create textual descriptions of those regions. VIA is an open source project developed at the Visual Geometry Group and released under the BSD 2-Clause license.
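To make the idea of "regions plus textual descriptions" concrete, here is a minimal Python sketch of how such annotations might be represented in code. The `Region` and `ImageAnnotation` classes, their field names, and the file name `street.jpg` are all hypothetical and chosen for illustration; this is not VIA's actual export format, which should be checked against the tool's own documentation.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Region:
    """A rectangular image region with free-text attributes (hypothetical schema)."""
    x: int
    y: int
    width: int
    height: int
    attributes: dict = field(default_factory=dict)

    def area(self) -> int:
        # Area of the rectangle in pixels.
        return self.width * self.height


@dataclass
class ImageAnnotation:
    """All regions annotated on a single image (hypothetical schema)."""
    filename: str
    regions: list = field(default_factory=list)

    def add_region(self, region: Region) -> None:
        self.regions.append(region)

    def to_json(self) -> str:
        # asdict() recurses into the nested Region dataclasses.
        return json.dumps(asdict(self), indent=2)


# Example: one rectangular region with a textual description.
annotation = ImageAnnotation("street.jpg")
annotation.add_region(Region(10, 20, 100, 50, {"description": "parked car"}))
print(annotation.to_json())
```

Keeping the description in a free-form `attributes` dictionary is one possible design choice: it lets each region carry arbitrary labels without changing the class definition.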