• Depth Perception

  • The video presents Robust Depth, a deep learning-based model to generate depth images from monocular images.
    October 2023
  • Human-aware Navigation

  • The video shows the output of our Graph Neural Network estimating the level of social compliance while the robot is moving in a simulated scenario.
    September 2019
  • Grasping

  • Learning to Grasp (L2G) is an efficient end-to-end learning strategy to generate 6-DOF parallel-jaw grasps starting from a partial point cloud of an object. Our approach does not rely on any geometric assumptions; instead, it is guided by a principled multi-task optimization objective that generates a diverse set of grasps by combining contact point sampling, grasp regression, and grasp evaluation. This video shows real-world experiments with grasps predicted by our L2G model and the baseline GPNet.
    March 2022
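The three stages mentioned above (contact point sampling, grasp regression, grasp evaluation) can be pictured as a pipeline. The sketch below is purely illustrative: the function names, shapes, and the dummy regressor/evaluator are assumptions for exposition, not the actual L2G implementation.

```python
import numpy as np

# Hypothetical L2G-style pipeline sketch: sample contacts from a partial
# point cloud, regress a 6-DOF grasp per contact, then score the grasps.

def sample_contacts(cloud, k):
    """Pick k candidate contact points from a partial point cloud of shape (N, 3)."""
    idx = np.random.default_rng(0).choice(len(cloud), size=k, replace=False)
    return cloud[idx]

def regress_grasps(contacts):
    """Map each contact to a 6-DOF grasp: 3-D position + 3-D orientation (placeholder)."""
    positions = contacts                     # use the contact as the grasp centre
    orientations = np.zeros_like(contacts)   # stand-in for a learned regressor
    return np.concatenate([positions, orientations], axis=1)   # (k, 6)

def evaluate_grasps(grasps):
    """Score each grasp in (0, 1]; a dummy evaluator replaces the learned one."""
    return 1.0 / (1.0 + np.linalg.norm(grasps[:, :3], axis=1))

cloud = np.random.default_rng(1).standard_normal((512, 3))
contacts = sample_contacts(cloud, k=16)
grasps = regress_grasps(contacts)
scores = evaluate_grasps(grasps)
best = grasps[np.argmax(scores)]
print(grasps.shape, scores.shape)  # (16, 6) (16,)
```

In the real model the three heads share features and are trained jointly under one multi-task objective, which is what encourages grasp diversity.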
  • Image Captioning

  • This video demonstrates EgoFormer, a two-stream transformer-based deep neural network utilizing visual-contextual attention. EgoFormer accomplishes accurate, human-like scene understanding with the aid of context encoding. The context encoder is a pre-trained ViT encoder, subsequently fine-tuned on the EgoCap context classification tasks, namely where, when, and whom.
    November 2022
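The idea of visual-contextual attention can be sketched as one stream of visual tokens attending over a handful of context tokens (e.g. embeddings for where, when, and whom). This is a minimal single-head attention in NumPy under assumed token counts and dimensions, not the actual EgoFormer code.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(visual, context):
    """visual: (Tv, d) queries; context: (Tc, d) keys/values."""
    d = visual.shape[-1]
    attn = softmax(visual @ context.T / np.sqrt(d))  # (Tv, Tc) attention weights
    return attn @ context                            # context-conditioned features

rng = np.random.default_rng(0)
visual = rng.standard_normal((4, 8))    # 4 visual tokens (e.g. image patches)
context = rng.standard_normal((3, 8))   # 3 context tokens: where, when, whom
fused = cross_attention(visual, context)
print(fused.shape)  # (4, 8)
```

Each fused visual token is then a context-aware feature, which is the intuition behind letting context encoding guide scene understanding.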
  • Assistive Robotics

  • Like the video on the right-hand side, this video shows our robot Shelly in a fetch-and-deliver task. The two arms previously mounted on Shelly have been replaced by a single arm with zero backlash and higher precision. Note that in this case the video is played in real time: without backlash in the joints, visual servoing is unnecessary.
    October 2016

  • This video shows how our robot Shelly fetches an object from a table and delivers it to a human. It was recorded for the oral defense of the Ph.D. thesis of our former student and colleague Luis V. Calderita. One of the biggest difficulties (in addition to the fact that, thanks to AGM, the robot maintains a model of the environment and of the objects and humans in it) was achieving this task without a fine kinematic calibration, using visual feedback instead. Although calibrating the robot would make it perform the task faster, avoiding the need for a proper calibration allowed us to demonstrate how robust a robot can be when using visual servoing.
    February 2016
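The claim above, that visual servoing tolerates a coarse kinematic calibration, can be illustrated with a toy closed-loop example. The controller moves the end-effector along the error observed by the camera, so a constant scaling error in the kinematic model only slows convergence; it does not change the final accuracy. All names and gains below are illustrative assumptions.

```python
import numpy as np

def servo(target, start, model_gain_error=0.7, gain=0.5, steps=50):
    """Proportional visual servoing with a miscalibrated kinematic model."""
    pos = start.copy()
    for _ in range(steps):
        error = target - pos           # error as measured by the camera
        cmd = gain * error             # proportional command on the visual error
        pos += model_gain_error * cmd  # miscalibrated kinematics execute it short
    return pos

target = np.array([0.4, 0.2, 0.9])
final = servo(target, start=np.zeros(3))
print(np.linalg.norm(target - final) < 1e-3)  # True: converges despite miscalibration
```

With an open-loop plan from the same miscalibrated model, the 30% scaling error would land the gripper far from the object; closing the loop on vision absorbs it.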
  • Active perception

  • This video shows our robot Gualzru finding a coffee mug for us. As an intermediate step it also models the room in which it is located and finds a table. This video was recorded as part of my Ph.D. thesis.
    The robot uses AGM and RoboComp.
    June 2013