I'm a PhD student at Aalborg University, working at the Visual Analysis and Perception (VAP) Lab. My research focuses on the geometric foundations of multimodal AI, in particular how non-Euclidean spaces can better capture the hierarchical and compositional nature of visual and temporal data.

I completed my academic training at the University of Barcelona, where I earned a Bachelor's degree in Mathematics and two Master's degrees, one in Advanced Mathematics and one in Data Science. During my master's studies, I worked as a research assistant at the HuPBA group in Barcelona. I then joined the VAP Lab at Aalborg University, where I am currently pursuing my PhD.

Research focus

My work addresses a fundamental limitation in current vision-language models: they treat images and text as flat entities, ignoring their inherent compositional and hierarchical structure. Visual scenes are structured compositions where objects exist within contexts, forming spatial hierarchies. When extended to video, temporal dynamics introduce additional complexity that conventional Euclidean representations struggle to capture.

I'm investigating whether hyperbolic geometry, with its natural capacity for representing hierarchies, can adequately capture the full complexity of spatio-temporal-semantic relationships in video. This includes exploring hybrid geometric approaches that might combine hyperbolic spaces for hierarchical relationships with alternative frameworks for periodic or cyclical temporal patterns.

Publications

Beyond research

In my free time, I help with my family's business, Insígnies Pujol, a badge and insignia manufacturing company based in Barcelona.

Libertas perfundet omnia luce
