Hi there!
My name is Shariar (/ʃɑːriːɑːr/). I am currently working on AI safety and reliability via interpretability. I am particularly interested in how LLMs' behavior evolves over longer contexts such as multi-turn interactions, how their internal mechanisms can be made interpretable, and how fairness can be ensured through targeted interventions.
I am actively seeking PhD opportunities in AI safety, interpretability, and reliability. If our research interests align, or you'd like to collaborate, please feel free to reach out!
In Spring 2026, I joined SPAR to work on real-time automated mechanistic interpretability methods for AI safety, under the mentorship of Sriram Balasubramanian.
Previously, I was a research intern at the NLP Lab at UC Riverside under Prof. Yue Dong, where I was also fortunate to work with Prof. Kevin Esterling. There I worked on behavioral evaluation of LLMs and explored how psychometric and Bayesian modeling techniques can quantify and explain complex social behaviors in LLMs.
Prior to that, I worked on inclusive AI systems for low-resource languages, including Bengali medical ASR and document understanding tools. Some of my earlier work applied ML in other domains, including cloud systems and bioinformatics.
I also led the AI Research and Engineering team at Celloscope Ltd. I hold a BSc and an MSc in Computer Science and Engineering from Bangladesh University of Engineering and Technology (BUET). My detailed CV can be found here.
News
- [3/26/2026] My paper PReSS: An Automated Black-Box Framework for Evaluating Political Stance Stability in LLMs has been accepted at PoliticalNLP @ LREC-COLING 2026.
- [2/7/2026] I've been accepted to SPAR Spring 2026, where I'll be working on real-time automated interpretability methods for AI safety, mentored by Sriram Balasubramanian.
- [12/23/2025] My paper AgnoSVD: Dynamic Resource Allocation for Serverless Workloads using Collaborative Filtering has been published in the journal Array.
- [4/25/2025] My paper Do Words Reflect Beliefs? Evaluating Belief Depth in Large Language Models is available on arXiv.
- [9/22/2024] Our solution AmarDoctor was selected at the 2024 Global Health Equity Challenge.
- [6/16/2024] The preprint of my work on Automatic Speech Recognition for Biomedical Data in Bengali Language is available on arXiv.
- [10/11/2023] Our paper SynthNID: Synthetic Data to Improve End-to-end Bangla Document Key Information Extraction has been accepted at the BLP workshop at EMNLP 2023.
Research
I am working on using mechanistic interpretability as a practical tool for AI safety: building methods that scale beyond toy settings and validating them on real model behaviors. I'm especially interested in using LLM agents to automate interpretability (autointerp), for example by turning circuit analysis from manual, single-prompt inspection into a scalable process.
- Agents for autointerp: building agentic pipelines that discover, label, and validate circuits and features at scale, so interpretability keeps pace with model capability instead of lagging on isolated examples.
- Interpretability for safety: locating and editing the causal mechanisms behind undesirable behavior, such as systematic bias in sensitive domains and unstable reasoning, with the goal of moving toward targeted, mechanism-level interventions.
- Reliability under real use: LLMs shift stance and tone under minor prompt changes; I am interested in designing benchmarks that measure this stability and in building interpretable methods to improve it in multi-turn settings.
See my publications for details.

