The Objectron dataset is a collection of short, object-centric video clips in which the camera moves steadily around an object and captures it from different angles. Each video comes with manually annotated 3D bounding boxes that describe the object's position, orientation, and dimensions. The dataset also includes metadata from the AR sessions: camera poses, sparse point clouds, and a characterization of the planar surfaces in the surrounding environment. The data was collected in 10 countries across five continents to provide geographic diversity. It contains 14,819 annotated video clips covering 17,095 object instances, complemented by 4M annotated images, in the following categories: bikes, books, bottles, cameras, cereal boxes, chairs, cups, laptops, and shoes.
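
To make the 3D bounding box annotation concrete, here is a minimal sketch of how a position, an orientation, and a set of dimensions together determine the eight corners of a box. It assumes the box is given as a center point, a 3x3 rotation matrix, and full extents along the box's own axes. The function name `box_corners`, the argument layout, and the corner ordering are illustrative choices, not the dataset's own API or vertex convention.

```python
from itertools import product


def box_corners(center, rotation, size):
    """Return the 8 corners of an oriented 3D box (illustrative helper).

    center:   (x, y, z) position of the box center
    rotation: 3x3 rotation matrix as nested lists (box frame -> world frame)
    size:     full extents of the box along its own x, y, z axes
    """
    corners = []
    # Enumerate every sign combination of the half-extents in the box frame.
    for signs in product((-0.5, 0.5), repeat=3):
        local = [s * d for s, d in zip(signs, size)]
        # Rotate the corner into the world frame, then translate by the center.
        world = tuple(
            sum(rotation[r][c] * local[c] for c in range(3)) + center[r]
            for r in range(3)
        )
        corners.append(world)
    return corners


# Hypothetical example: a 2 x 1 x 1 box centered at (1, 0, 0),
# rotated 90 degrees about the z axis.
rot_z_90 = [
    [0.0, -1.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0],
]
corners = box_corners((1.0, 0.0, 0.0), rot_z_90, (2.0, 1.0, 1.0))
```

In this example the box's long axis, originally along x, ends up along the world y axis after the rotation, which shows why orientation has to be stored alongside position and dimensions.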