Esteve Valls Mascaro



© Esteve Valls Mascaro

Welcome! My name is Esteve Valls, and I am a PhD researcher at TU Wien. My work focuses on human intention recognition and prediction for human-robot collaborative behaviours.

I am passionate about exploring how human cognition works and how that understanding can be leveraged to enhance the social awareness of robots. My research uses computer vision and deep learning to improve robot perception, enabling robots to operate seamlessly in populated environments. As I delve deeper into this field, I am constantly amazed by the possibilities that lie ahead and by the potential of these innovations to change the world we live in.

If you are interested in learning more about my work and seeing examples of it in action, please visit my personal webpage. There, you will find more information about our journey, providing deeper insight into my research.

If you have any questions or are interested in collaborating, please do not hesitate to contact me.



  • Since June 2022 - PhD in PERSEO Program, TU Wien, Austria.
  • September 2021 to June 2022 - PhD in PERSEO Program, Technical University of Munich (TUM), Germany.
  • September 2020 to July 2021 - M.Sc. Advanced Telecommunications Technologies, Universitat Politècnica de Catalunya (UPC), Spain.
  • September 2016 to July 2020 - B.Sc. Engineer of Technologies and Services of Telecommunications, Universitat Politècnica de Catalunya (UPC), Spain.


  • PERSEO - PErsonalized Robotics as SErvice Oriented applications, Marie Skłodowska-Curie Actions - Innovative Training Networks (H2020-MSCA-ITN-2020 No. 955778), 1.1.2021-31.12.2024


Journal papers

  1. Ni, Z., Mascaró, E. V., Ahn, H., & Lee, D. (2023). Human-object interaction prediction in videos through gaze following. Computer Vision and Image Understanding, 103741. Paper | ArXiv Paper | Project Page

Conference papers

  1. Mascaró, E. V., Ahn, H., & Lee, D. (2023). Intention-Conditioned Long-Term Human Egocentric Action Anticipation. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (pp. 6048-6057). Project Page | Paper
  2. Mascaró, E. V., Ma, S., Ahn, H., & Lee, D. (2022). Robust Human Motion Forecasting using Transformer-based Model. In 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Kyoto, Japan (pp. 10674-10680). doi: 10.1109/IROS47612.2022.9981877. Project Page | Paper | ArXiv Paper
  3. Ahn, H., Mascaró, E. V., & Lee, D. (2023). Can We Use Diffusion Probabilistic Models for 3D Motion Prediction? In 2023 IEEE International Conference on Robotics and Automation (ICRA). Project Page | ArXiv Paper


  1. Ranked 1st in the EGO4D Long-Term Action Anticipation CVPR@2022 META AI Challenge. Oral presentation at the EGO4D workshop, 2022 IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR), New Orleans, United States. Workshop Page
  2. Ranked 1st in the EGO4D Long-Term Action Anticipation ECCV@2022 META AI Challenge. EGO4D workshop, 2022 European Conference on Computer Vision (ECCV), Tel Aviv, Israel. Workshop Page