Shariar Kabir

AI Research Engineer · Dhaka, Bangladesh · shariar1405076@gmail.com

Hi, my name is Shariar (/ʃɑːriˈɑːr/). My research focuses on interpretability and behavioral evaluation of AI models, along with related questions of safety, robustness, and fairness. I am particularly interested in how LLMs' behavior evolves over longer contexts such as multi-turn interactions, how their internal mechanisms can be made interpretable, and how fairness can be ensured through principled interventions.

In Spring 2026, I joined SPAR to work on real-time automated mechanistic interpretability methods for AI safety, under the mentorship of Sriram Balasubramanian.

Previously, I was a research intern at the NLP Lab at UC Riverside, advised by Prof. Yue Dong, where I was also fortunate to work with Prof. Kevin Esterling. I worked on behavioral evaluation of LLMs and mechanistic interpretability, and explored how psychometric and Bayesian modeling techniques can quantify and explain complex social behaviors in LLMs.

Prior to that, I worked on inclusive AI systems for low-resource languages, including Bengali medical ASR and document understanding tools. My long-term goal is to build methods that make AI systems not only capable but also transparent, stable, and socially aligned. Some of my earlier work applied ML in other domains, including cloud systems and bioinformatics.

I currently lead the AI Research and Engineering team at Celloscope Ltd. I hold a BSc and MSc in Computer Science and Engineering from Bangladesh University of Engineering and Technology (BUET). My detailed CV can be found here.

News

[2/7/2026]: 🥳🎉 I've been accepted to SPAR Spring 2026, where I'll be working on real-time automated interpretability methods for AI safety, mentored by Sriram Balasubramanian.
[12/23/2025]: 🎉 Our paper AgnoSVD: Dynamic Resource Allocation for Serverless Workloads using Collaborative Filtering has been published in the journal Array.
[4/25/2025]: 📢 Our paper Do Words Reflect Beliefs? Evaluating Belief Depth in Large Language Models is available on arXiv.
[9/22/2024]: 🏆 Our solution AmarDoctor was selected at the 2024 Global Health Equity Challenge.
[6/16/2024]: 📢 The preprint of our work on Automatic Speech Recognition for Biomedical Data in Bengali Language is available on arXiv.
[10/11/2023]: 🎉 Our paper SynthNID: Synthetic Data to Improve End-to-end Bangla Document Key Information Extraction has been accepted at the BLP Workshop at EMNLP 2023.

Research

Robustness

LLMs often exhibit unstable reasoning across dialogue turns, arbitrarily shifting stances or tone in response to minor prompt variations. My work stress-tests model opinions and investigates interpretable methods to quantify and improve behavioral stability in sensitive downstream tasks such as long-form multifaceted summarization.

PReSS: A Black-Box Framework for Evaluating Political Stance Stability in LLMs via Argumentative Pressure [PDF]
Shariar Kabir, Kevin Esterling, Yue Dong
Under Review

Description

A black-box framework that stress-tests LLMs' stated opinions using argumentative challenges, classifying responses as stable or unstable across various political topics. Applied to 12 widely-used LLMs across 19 political topics, it reveals substantial variation in stance stability—a model that is left-leaning overall can exhibit stable-right behavior on certain topics. Stability has practical implications for controlled generation and model alignment: when models are prompted or fine-tuned to adopt the opposite ideology, unstable topic stances are more susceptible to change whereas stable ones resist modification.

Interpretability

Modern interpretability methods remain underexplored for analyzing socially complex behaviors such as moral and political reasoning. My work seeks to identify model parameters and computational circuits responsible for specific behavioral traits, and to build scalable tools for automated circuit analysis that can move beyond single-prompt inspection.

ONGOING · SPAR
Automated Circuit Analysis for Real-time Internal Monitoring
Mentored by Sriram Balasubramanian (University of Maryland, College Park)
Research Areas: Mechanistic Interpretability · AI Control

Description

A critical bottleneck in mechanistic interpretability is that circuits—computational graphs revealing LLM internals—are highly information-dense and inherently prompt-specific, making large-scale behavioral analysis prohibitively laborious. While open-source tools like circuit tracer can now construct circuits for arbitrary prompts, two challenges remain: (1) raw circuits require significant human effort to cluster into interpretable supernodes, and (2) understanding an LLM's behavior on a task demands dataset-level analysis that manual inspection cannot scale to.

This project develops an LLM-based agent that takes raw circuits as input and produces actionable outputs—such as probes to detect future undesirable behaviors or surgical interventions to modify targeted behavioral tendencies. The agent-driven framework enables dataset-level circuit analysis for the first time, moving beyond single-prompt inspection toward scalable, automated interpretability. Target analyses include refusal, scheming, and internal planning.

Target venues: ICLR / NeurIPS / ICML
Deliverables: Open-source repository + technical blog post

Beyond the Surface: Probing the Ideological Depth of Large Language Models [PDF]
Shariar Kabir, Kevin Esterling, Yue Dong
arXiv preprint arXiv:2508.21448 (2025)

Description

Isolates model components responsible for specific behavioral differences through targeted ablation studies. Defines ideological depth as (i) a model's steerability and (ii) the feature richness of its internal political representations measured with sparse autoencoders (SAEs). Finds large systematic differences between Llama-3.1-8B and Gemma-2-9B, showing that refusals on benign political prompts can arise from capability deficits rather than safety guardrails—making ideological depth a measurable and tunable property of LLMs.

Fairness

LLMs remain systematically biased in socially sensitive domains. My early work centered on improving inclusivity for underrepresented groups through curated resources and domain-specific systems. My current direction moves beyond empirical fine-tuning toward targeted interventions that identify and modify causal mechanisms underlying biased behavior.

Automatic Speech Recognition for Biomedical Data in Bengali Language [PDF]
Shariar Kabir, Nazmun Nahar, Shyamasree Saha, Mamunur Rashid
arXiv preprint arXiv:2406.12931 (2024)

Description

Curated a Bengali biomedical speech corpus for medical ASR covering two major Bengali dialects (Bengali and Sylheti). Trained and evaluated two popular ASR frameworks on a comprehensive 46-hour domain-specific corpus, achieving a WER of 8% on a symptom-focused vocabulary. This work addresses the lack of domain-specific data that limits practical healthcare ASR for underserved linguistic communities.

SynthNID: Synthetic Data to Improve End-to-end Bangla Document Key Information Extraction [PDF]
Syed Monsur*, Shariar Kabir*, Sakib Chowdhury*
BLP Workshop at EMNLP 2023, pages 117–123, Singapore.

Description

A system for generating domain-specific document images to fine-tune transformer models for underrepresented languages. SynthNID generates synthetic Bangla NID data that, when mixed with real data, significantly improves Key Information Extraction performance—particularly on Bengali-script fields. The system is easily extendable to generate other types of scanned documents for a wide range of document understanding tasks. (* co-first author)

Applied ML

AgnoSVD: Dynamic Resource Allocation for Serverless Workloads using Collaborative Filtering [PDF]
Shariar Kabir, Muhammad Abdullah Adnan
Array, Volume 29, 2026.

Description

A collaborative filtering-based framework for dynamic resource allocation in serverless workloads. AgnoSVD uses Singular Value Decomposition (SVD) to predict optimal resource configurations while remaining agnostic to specific function details, evaluated on AWS Lambda and Apache OpenWhisk across 99 functional workloads spanning individual functions and chains.

Publications

Please refer to my Google Scholar profile for a complete list of my publications.

PReSS: A Black-Box Framework for Evaluating Political Stance Stability in LLMs via Argumentative Pressure. [PDF]
Shariar Kabir, Kevin Esterling, Yue Dong
Under Review

Abstract

Existing evaluations of political bias in large language models (LLMs) typically classify outputs as left- or right-leaning. We extend this perspective by examining how ideological tendencies vary across topics and how consistently models maintain their positions, a property we refer to as stability. To capture this dimension, we propose PReSS (Political Response Stability under Stress), a black-box framework that evaluates LLMs by jointly considering model and topic context, categorizing responses into four stance types: stable-left, unstable-left, stable-right, and unstable-right. Applying PReSS to 12 widely used LLMs across 19 political topics reveals substantial variation in stance stability; for instance, a model that is left-leaning overall can exhibit stable-right behavior on certain topics. This highlights the importance of topic-aware and fine-grained evaluation of political ideologies of LLMs. Moreover, stability has practical implications for controlled generation and model alignment: interventions such as debiasing or ideology reversal should explicitly account for stance stability. Our empirical analyses reveal that when models are prompted or fine-tuned to adopt the opposite ideology, unstable topic stances are more likely to change, whereas stable ones resist modification. Thus, treating stability as a moderating factor provides a principled foundation for understanding, evaluating, and guiding interventions in politically sensitive model behavior.

AgnoSVD: Dynamic resource allocation for serverless workloads using collaborative filtering. [PDF]
Shariar Kabir, Muhammad Abdullah Adnan
Array, Volume 29, 2026.

Abstract

In serverless computing, determining the optimal resource configurations for workloads poses significant challenges, particularly due to the cloud provider's limited visibility into workload specifics. This complexity is amplified when dealing with diverse workloads that vary in their characteristics. In this paper, we present AgnoSVD, an approach for predicting the optimum resource configuration for an incoming workload using Singular Value Decomposition (SVD). The proposed model uses collaborative filtering to extract the latent factors of the workloads and resource profiles. Therefore, the model remains agnostic to the specific details of the functions and the resource configurations. We tested our approach on well-known serverless systems like AWS lambda and Apache OpenWhisk and evaluated the system using 99 functional workloads. These workloads encompass both individual functions and chains of …

Beyond the Surface: Probing the Ideological Depth of Large Language Models. [PDF]
Shariar Kabir, Kevin Esterling, Yue Dong
arXiv preprint arXiv:2508.21448 (2025)

Abstract

Large language models (LLMs) display recognizable political leanings, yet they vary significantly in their ability to represent a political orientation consistently. In this paper, we define ideological depth as (i) a model's ability to follow political instructions without failure (steerability), and (ii) the feature richness of its internal political representations measured with sparse autoencoders (SAEs), an unsupervised sparse dictionary learning (SDL) approach. Using Llama-3.1-8B-Instruct and Gemma-2-9B-IT as candidates, we compare prompt-based and activation-steering interventions and probe political features with publicly available SAEs. We find large, systematic differences: Gemma is more steerable in both directions and activates approximately 7.3x more distinct political features than Llama. Furthermore, causal ablations of a small targeted set of Gemma's political features to create a similar feature-poor setting induce consistent shifts in its behavior, with increased rates of refusals across topics. Together, these results indicate that refusals on benign political instructions or prompts can arise from capability deficits rather than safety guardrails. Ideological depth thus emerges as a measurable property of LLMs, and steerability serves as a window into their latent political architecture.

AmarDoctor: An AI-Driven, Multilingual, Voice-Interactive Digital Health Application. [PDF]
Nazmun Nahar, Ritesh Harshad Ruparel, Shariar Kabir, Sumaiya Tasnia Khan, Shyamasree Saha, Mamunur Rashid
arXiv preprint arXiv:2510.24724 (2025)

Abstract

This study presents AmarDoctor, a multilingual voice-interactive digital health app designed to provide comprehensive patient triage and AI-driven clinical decision support for Bengali speakers, a population largely underserved in access to digital healthcare. AmarDoctor adopts a data-driven approach to strengthen primary care delivery and enable personalized health management. While platforms such as AdaHealth, WebMD, Symptomate, and K-Health have become popular in recent years, they mainly serve European demographics and languages. AmarDoctor addresses this gap with a dual-interface system for both patients and healthcare providers, supporting three major Bengali dialects. At its core, the patient module uses an adaptive questioning algorithm to assess symptoms and guide users toward the appropriate specialist. To overcome digital literacy barriers, it integrates a voice-interactive AI assistant that navigates users through the app services. Complementing this, the clinician-facing interface incorporates AI-powered decision support that enhances workflow efficiency by generating structured provisional diagnoses and treatment recommendations. These outputs inform key services such as e-prescriptions, video consultations, and medical record management. To validate clinical accuracy, the system was evaluated against a gold-standard set of 185 clinical vignettes developed by experienced physicians. Effectiveness was further assessed by comparing AmarDoctor performance with five independent physicians using the same vignette set. Results showed AmarDoctor achieved a top-1 diagnostic precision of 81.08 percent (versus physicians average of 50.27 percent) and a top specialty recommendation precision of 91.35 percent (versus physicians average of 62.6 percent).

Automatic Speech Recognition for Biomedical Data in Bengali Language. [PDF]
Shariar Kabir, Nazmun Nahar, Shyamasree Saha, Mamunur Rashid
arXiv preprint arXiv:2406.12931 (2024)

Abstract

This paper presents the development of a prototype Automatic Speech Recognition (ASR) system specifically designed for Bengali biomedical data. Recent advancements in Bengali ASR are encouraging, but a lack of domain-specific data limits the creation of practical healthcare ASR models. This project bridges this gap by developing an ASR system tailored for Bengali medical terms like symptoms, severity levels, and diseases, encompassing two major dialects: Bengali and Sylheti. We train and evaluate two popular ASR frameworks on a comprehensive 46-hour Bengali medical corpus. Our core objective is to create deployable health-domain ASR systems for digital health applications, ultimately increasing accessibility for non-technical users in the healthcare sector.

SynthNID: Synthetic Data to Improve End-to-end Bangla Document Key Information Extraction. [PDF]
Syed Monsur, Shariar Kabir, Sakib Chowdhury
BLP Workshop at EMNLP 2023, pages 117–123, Singapore.

Abstract

End-to-end Document Key Information Extraction models require a lot of compute and labeled data to perform well on real datasets. This is particularly challenging for low-resource languages like Bangla where domain-specific multimodal document datasets are scarcely available. In this paper, we have introduced SynthNID, a system to generate domain-specific document image data for training OCR-less end-to-end Key Information Extraction systems. We show the generated data improves the performance of the extraction model on real datasets and the system is easily extendable to generate other types of scanned documents for a wide range of document understanding tasks.

Experience

Supervised Program for Alignment Research (SPAR)

Research Mentee · Mentored by Sriram Balasubramanian [project]

Building automated LLM agents to explore large and complex circuits.
Building tools for the agent for incremental circuit exploration and targeted intervention.
Finding spurious correlations in LLM cross-layer transcoder circuits using proxy tasks.

February 2026 – May 2026

Celloscope Limited

AI Research & Engineering

Lead AI Research Engineer · Jan 2024 – Present

AI Software Engineer · Jul 2021 – Dec 2023

R&D Engineer · Sep 2020 – Jun 2021

I led a team of six research engineers developing production-grade NLP and computer vision systems deployed across multiple industrial domains. Key projects I directed include:

Exercise Monitoring System for LG Nova, which used multimodal pose-estimation and language models to provide real-time feedback on workout form.
Resume Shortlister, a RAG-based ranking engine that matched the requirements from RFPs or job descriptions with candidate resumes using a hybrid approach combining rule-based filtering with semantic retrieval.
Drawing Checker, a vision system to automate design-error detection in engineering drawings through deep-learning-based object detection and geometric analysis.

September 2020 – Present

UCR NLP Lab

Research Intern · Advised by Prof. Yue Dong

Worked on methods that combine interpretability tools with fairness diagnostics from social science to design interventions targeting the emergent activation circuits in LLMs that are responsible for particular behavioral tendencies.

Understanding LLMs’ response instability over longer contexts.
Mechanistic interpretability of LLMs in socio-political reasoning.
LLMs’ social epistemology using Bayesian statistics.

January 2025 – December 2025

MedAI Pvt. Limited

NLP and Data Scientist · Part Time

Extracted data-driven insights from Bangladeshi medical data and developed a smart healthcare platform that uses AI to deliver personalised healthcare services in local languages. Major contributions:

Conversational AI chatbot providing mental health support for Bengali speakers.
Synthetic patient generator reflecting local demographics.
Classifier for categorizing patients' diseases using symptoms and other demographic information.
Training and serving a voice-based patient symptom collector.
Design and development of an audio data collection portal.

August 2021 – November 2024

GRP, ICT Division

DevOps Engineer

Automated the deployment and monitoring of numerous microservices. Major contributions include:

Automation scripts for deploying web apps and microservices in Docker
Gateway configuration using NGINX reverse proxy
Document generation scripts from Google Sheets

May 2019 – August 2020

Education

Bangladesh University of Engineering and Technology (BUET)

Master of Science (part time)

Computer Science and Engineering

GPA (coursework): 3.54

Thesis: Dynamic Resource Allocation for Workloads in Serverless Architecture using Collaborative Filtering. Under the supervision of Professor Muhammad Abdullah Adnan.

Coursework: Bioinformatics Algorithms · Distributed Computing Systems · Data Mining · Data Management in the Cloud · Advanced Database Systems · Advanced Artificial Intelligence

April 2019 – October 2022

Bangladesh University of Engineering and Technology (BUET)

Bachelor of Science

Computer Science and Engineering

GPA: 3.53

Major: Artificial Intelligence

Thesis: Active Learning on Big Data, a study of how active learning can be applied to big data in a distributed cloud computing system. Under the supervision of Professor Muhammad Abdullah Adnan.

Coursework: Machine Learning · Pattern Recognition · Computer Graphics · Artificial Intelligence · Digital Image Processing · Data Structures · Database · Operating Systems · Software Development · Computer Architecture · Microprocessors and Microcontrollers · Computer Networks · Concrete Mathematics · Discrete Mathematics · Numerical Methods · Software Engineering and Information System Design · Compiler · Data Communication · Digital Logic Design · Structured Programming Language · Object Oriented Programming Language · Theory of Computation

February 2015 – April 2019

Projects

Medical Code Classification via Linear Probing of LLM Activations
Healthcare AI · Interpretability

Detail

This project investigates multi-label medical code classification by training linear probes on Large Language Model (LLM) activations. We extract layer-wise attention head activations from medical-domain LLMs and use Ridge regression classifiers to predict relevant medical disciplines from clinical descriptions. The approach enables interpretable analysis of which model components are most informative for medical domain classification tasks.

Finetuning LLMs for Mental Health Counsel
Healthcare AI · NLP

Detail

Most LLMs are biased towards European languages and ethnicities, and structured Bengali data is scarce. To address this, I adapted open-source models such as LLaMA using parameter-efficient fine-tuning (PEFT) techniques (e.g., adapter injection and LoRA). I fine-tuned LLaMA for Bengali mental health consultation using QLoRA, producing a model that can be served with limited GPU memory and supporting equitable access to healthcare technologies across diverse linguistic communities.

ASR System for Patient Symptoms [PPT]
Healthcare AI · Speech AI

Detail

ASR system for understanding medical symptoms spoken by patients in Bengali. Trained DeepSpeech from scratch on audio collected via a consented data collection portal, then finetuned for noisy environments using 13 domain augmentations. Switched to a Whisper (tiny) model finetuned on the BanglaASR corpus (Bangla Mozilla Common Voice), achieving a WER of only 8%—enabled by the limited vocabulary of symptom terms.
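For readers unfamiliar with the metric, WER (word error rate) is the word-level edit distance between a reference transcript and the ASR output, divided by the number of reference words. The sketch below is only an illustration of that definition, not the evaluation code used in this project; the function name and the English example sentences are made up.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One dropped word out of four reference words gives a WER of 25%.
print(word_error_rate("patient has high fever", "patient has fever"))
```

With a small, closed vocabulary of symptom terms there are fewer plausible substitutions, which is one reason a low WER is easier to reach in this setting.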

Exercise Monitoring System
Healthcare AI · Computer Vision

Detail

A system leveraging Vision-Language Models (VLMs) to assist users in performing exercises correctly by comparing their execution against reference videos of expert demonstrations. Uses frame-level visual and motion comparison integrated with language-based feedback to generate natural language guidance that helps users improve their form and reduce injury risk.

SynthCases: Synthetic Patient Creator and Disease Classifier [PPT]
Healthcare AI · Machine Learning

Detail

A disease recommendation system using ensemble classifiers trained on synthetic patient data reflecting real-world demographics. The data generator accounts for risk factors, family history, and medical history. The classifier uses a multi-layer pipeline: first predicting disease probabilities from symptoms, then filtering using an ethnicity-based prevalence look-up table, and finally making a final prediction using the patient's risk factors.

wQFMSpark: Performance Analysis of Species Tree Estimation Using wQFM in a Distributed System [Report]
Bioinformatics · Distributed Systems

Detail

Species tree estimation from gene trees is crucial in phylogenetics. Quartet-based techniques like ASTRAL, QMC, and wQFM are widely used, but struggle with scalability on large datasets. This project redesigns wQFM for distributed execution using Apache Spark, analyzing the scalability and performance gains on large-scale phylogenomic inputs.

3PC: Implementing the 3-Phase Commit Protocol [Report]
Distributed Systems

Detail

A distributed music playlist system implemented using the three-phase commit (3PC) protocol to guarantee consistency across multiple nodes. The playlist is an unordered set of song name and URL pairs, maintained with full ACID compliance across two or more devices.

Key Information Extraction (KIE) From NID using Donut [PPT]
Document AI · NLP

Detail

Fine-tuned the pretrained Donut document transformer model for Key Information Extraction (KIE) on Bangla National ID cards using data generated by SynthNID. Used a mix of real and synthetic data, with the addition of synthetic data yielding significant performance improvements—especially on Bengali-script fields.

Object Detection with YOLO
Computer Vision

Detail

A collection of applied object detection tasks built on YOLO, each fine-tuned for a specific domain:

Licence plate detection: fine-tuned YOLOv5 to detect plates across vehicle classes in Bangladeshi CCTV footage. [Colab]
Card region extraction: locating and cropping identity card regions from raw photographs for downstream processing.
Engineering drawing checker: detecting standard symbols (e.g. north sign) in scanned engineering drawings to automate compliance checks.

NER From Chatbot User Messages
NLP · Banking

Detail

Extraction of named entities (NE) from banking chatbot messages—beneficiary names, transfer amounts, account types, account numbers, etc. Instead of a single model, we used a recipe of approaches: BERT for beneficiary name extraction, RegEx for transfer amounts and account numbers, and lookup tables for account type extraction. This minimized the training burden for Bengali, a low-resource language.
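The rule-based half of this recipe can be sketched as follows. This is a simplified illustration and not the production code: the regular expressions, the account-type vocabulary, and the English example message are all hypothetical, and the BERT-based beneficiary name extractor is omitted.

```python
import re

# Hypothetical patterns and vocabulary, for illustration only.
ACCOUNT_NUMBER = re.compile(r"\b\d{10,16}\b")
AMOUNT = re.compile(r"\b(\d+(?:\.\d{1,2})?)\s*(?:taka|tk|bdt)\b", re.IGNORECASE)
ACCOUNT_TYPES = {
    "savings": "SAVINGS",
    "current": "CURRENT",
    "fixed deposit": "FIXED_DEPOSIT",
}


def extract_entities(message: str) -> dict:
    """Extract account number, amount, and account type without a trained model."""
    entities = {}

    # RegEx handles entities with a rigid surface form.
    account = ACCOUNT_NUMBER.search(message)
    if account:
        entities["account_number"] = account.group(0)
    amount = AMOUNT.search(message)
    if amount:
        entities["amount"] = float(amount.group(1))

    # A lookup table handles the small closed set of account types.
    lowered = message.lower()
    for phrase, label in ACCOUNT_TYPES.items():
        if phrase in lowered:
            entities["account_type"] = label
            break
    return entities


print(extract_entities("Send 500 taka to savings account 1234567890123"))
```

Only the open-ended entity (the beneficiary name) then needs a learned model, which keeps the amount of labeled training data small.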

Agrani Voice Banking Chatbot
NLP · Speech AI · Banking

Detail

Bangladesh's pioneering voice-based AI chatbot for seamless banking activities, serving hundreds of thousands of real users at Agrani Bank—one of Bangladesh's largest state-owned banks. Powered by Bengali ASR and a finetuned NLU engine for natural language-driven fund transfers and inquiries, making banking services accessible to customers with limited digital literacy.

Realtime Liveness Check
Computer Vision · Security

Detail

Analyzes real-time facial movements and blinking, and requires the user to perform specific facial actions during the eKYC authentication process to ensure the presence of a live person. Developed for deployment on mobile devices such as smartphones.

Audio Data Collection Portal
Healthcare AI · Infrastructure

Detail

Audio data collection portal for large user bases. Built with a React frontend and Python-Flask backend; metadata stored in PostgreSQL, object storage in S3, and full authentication via AWS Cognito. Supports priority-based or demographic-filtered collection, useful for gathering medical recordings segmented by symptom, age, or gender.

AI Service Gateway
Infrastructure

Detail

A portal for showcasing and demo-testing AI services. Authentication and authorization built using Keycloak and Google identity provider. New clients can sign up with their email and receive a limited credit allocation for trying the services.

Don't Drop The Bomb
Hardware · Game Dev

Detail

A multiplayer microcontroller game featuring two player-controlled bars on either side of two connected dot matrices. Powered by a single ATmega32 microcontroller, with controllers built using MPU-6050 accelerometer and gyroscope sensors.

Ray Tracing
Computer Graphics

Detail

A ray tracer that renders spheres, planes, and triangles with textures and shadows by tracing light paths through an image plane. Implements the Phong Lighting Model and recursive reflection for photorealistic lighting effects.
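As a reminder of what the Phong model computes at each hit point, the sketch below evaluates the ambient, diffuse, and specular terms for a single light. It is not taken from the project: the function names and the coefficient defaults (`ka`, `kd`, `ks`, `shininess`) are assumptions, and light intensities are taken to be 1.

```python
import math


def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def reflect(to_light, normal):
    """Mirror the surface-to-light direction about the normal: R = 2(N.L)N - L."""
    scale = 2.0 * dot(normal, to_light)
    return tuple(scale * n - l for n, l in zip(normal, to_light))


def phong_intensity(normal, to_light, to_viewer, ka=0.1, kd=0.7, ks=0.2, shininess=32):
    """Scalar Phong intensity for one light with unit ambient and light intensity."""
    n = normalize(normal)
    l = normalize(to_light)
    v = normalize(to_viewer)

    diffuse = max(dot(n, l), 0.0)  # Lambertian term, zero when the light is behind
    specular = 0.0
    if diffuse > 0.0:
        r = reflect(l, n)
        specular = max(dot(r, v), 0.0) ** shininess
    return ka + kd * diffuse + ks * specular


# Light and viewer both directly above the surface: all three terms contribute fully.
print(phong_intensity((0, 0, 1), (0, 0, 1), (0, 0, 1)))
```

In a recursive ray tracer, this local term is typically combined with the colour returned by a reflected ray, weighted by a reflection coefficient.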

Awards & Achievements

Industry Coding Assessment

CodeSignal Industry Coding Assessment (ICA): 510/600 (≈ 722/850 equivalent GCA, top 15%)

2025

Global Health Equity Challenge Award

MIT Solve

AmarDoctor by MedAI has been selected as one of the six solvers out of 2200+ participants worldwide for its innovative approach to accessible healthcare.

2024

Interests

Outside of my professional pursuits, I consider myself curious by nature and enjoy learning in general. I am a book lover with a keen interest in classic thrillers and philosophical novels. I enjoy listening to Bengali folk music and classic rock, and occasionally try my hand at playing Bengali folk melodies on the ukulele.

I love animals and have a soft spot for cats due to their elegance, independent nature and curious spirits. My wife and I take care of two lovely cats, and we warmly invite you to meet them through some of their photos here.
