I am a PhD student at MBZUAI, where I focus on multimodal NLP topics, particularly vision-language interaction. I also work on vision-language alignment, unified (any) multimodal models, and large-scale evaluations. I am advised by Alham Fikri Aji and Yova Kementchedjhieva.
Previously, I was a Research Engineer at SMU working at the intersection of multilingual and multimodal interpretation under Chong-Wah Ngo. Before that, I earned my bachelor's degree in CS from Institut Teknologi Bandung, where I worked under Ayu Purwarianti on explainable multimodal synthetic data generation.
In addition to research, I also do AI engineering for various use cases. You can see my other experiences here.
Going forward, I plan to specialize in methods that better align different modalities, with the goal of mitigating modality imbalance, especially in unified multimodal models.
During my studies and prior experience, I have often worked on topics including, but not limited to, the following:
EMNLP
MRL @ EMNLP
NAACL
NAACL
COLING
EMNLP
APSIPA ASC