🌐
I like to train machine learning models and deep neural networks! :D

nileshsarkarRA/README.md

Hi there, I’m Nilesh Sarkar 👋

AI Researcher | Language Models & AI Systems
Studying how architectural and capacity constraints shape reasoning and representations in neural systems.


🔍 Research Focus

I am an AI researcher working on language models, transformer architectures, and training dynamics, with a focus on architectural analysis, model compression, and interpretability. My work follows an architecture-first approach: implementing and instrumenting models from first principles to study how representations evolve under structural and capacity constraints.

My current research centers on compression-based analysis of large language models, treating pruning, quantization, and distillation as experimental probes rather than purely engineering optimizations.


🧠 What I Work On

  • LLM Architecture & Compression Research
    Designing and training compact transformer language models to study information bottlenecks, robustness, and representation stability under compression.

  • Training Dynamics & Representation Analysis
    Analyzing attention patterns, residual stream behavior, and convergence dynamics across architectural variants and capacity constraints.

  • Applied AI Systems (Safety-Critical Contexts)
    Developing agentic retrieval-augmented generation (RAG) systems for complex aerospace engineering documentation, with emphasis on reliability, verification, and structured reasoning.


🛠 Featured Research Projects

🔹 Architectural Information Bottlenecks in Compressed Language Models

Status: Active research
Stack: PyTorch, CUDA

Department-led research project studying how architectural choices and compression affect internal representations in transformer language models.

  • Training compact transformers from scratch for controlled experimentation
  • Applying pruning, quantization, and distillation as analytical interventions
  • Instrumenting models to study attention behavior, residual streams, and stability
  • Evaluating effects across English and selected Indic languages
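
As a rough illustration of compression used as an analytical intervention, the sketch below prunes a single weight matrix by magnitude and measures how far the layer's outputs drift from the unpruned baseline. It is a toy in plain Python, not the project's PyTorch code: the function names (`magnitude_prune`, `probe`), the random weights and inputs, and the choice of cosine similarity as the drift metric are all assumptions made for illustration.

```python
import math
import random


def magnitude_prune(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude entries zeroed.

    `weights` is a list of rows; `sparsity` is the fraction of entries to remove.
    """
    rows, cols = len(weights), len(weights[0])
    # Rank every (row, col) position by the absolute value of its weight.
    order = sorted(
        ((r, c) for r in range(rows) for c in range(cols)),
        key=lambda rc: abs(weights[rc[0]][rc[1]]),
    )
    pruned = [row[:] for row in weights]
    for r, c in order[: int(rows * cols * sparsity)]:
        pruned[r][c] = 0.0
    return pruned


def linear(weights, x):
    """Apply a bias-free linear layer: one dot product per output row."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(p * q for p, q in zip(a, b))
    na = math.sqrt(sum(p * p for p in a))
    nb = math.sqrt(sum(q * q for q in b))
    return dot / (na * nb) if na and nb else 0.0


def probe(weights, inputs, sparsities):
    """Mean output similarity to the unpruned layer at each sparsity level."""
    results = {}
    for s in sparsities:
        pruned = magnitude_prune(weights, s)
        sims = [cosine(linear(weights, x), linear(pruned, x)) for x in inputs]
        results[s] = sum(sims) / len(sims)
    return results


# Hypothetical stand-ins for a trained layer and a batch of activations.
rng = random.Random(0)
W = [[rng.gauss(0, 1) for _ in range(16)] for _ in range(8)]
X = [[rng.gauss(0, 1) for _ in range(16)] for _ in range(32)]
drift = probe(W, X, [0.0, 0.5, 0.9])
```

Sweeping the sparsity level and plotting `drift` gives a simple stability curve; in a real study the same pattern would presumably be applied per layer to attention or residual-stream activations rather than to a random matrix.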

🔹 Agentic RAG Systems for Technical Documentation

Stack: LangChain, LangGraph, Vector Databases

Agentic retrieval systems designed for multi-step reasoning over large, structured technical corpora.

  • Query-aware routing between retrieval mechanisms
  • Citation-based retrieval to improve reliability
  • Focus on structured reasoning in safety-critical domains
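
The routing and citation ideas above can be sketched without any framework. The example below is a minimal, hypothetical illustration in plain Python and does not use the LangChain or LangGraph APIs: the corpus, the `Passage` type, the identifier heuristic in `route`, and both retrievers are assumptions invented for this sketch. Queries that mention something that looks like a part identifier go to exact matching; everything else goes to token-overlap ranking, and each hit is returned as a citation.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    doc_id: str
    section: str
    text: str


# Hypothetical corpus; real documentation would be chunked and indexed.
CORPUS = [
    Passage("MANUAL-A", "2.1", "Torque limit for fastener FX-100 is listed in table 4."),
    Passage("MANUAL-A", "3.4", "Inspection intervals depend on operating hours and load cycles."),
    Passage("MANUAL-B", "1.2", "Hydraulic pump removal requires depressurizing the system first."),
]


def tokens(text):
    """Lowercased word tokens; hyphens are kept so 'FX-100' stays one token."""
    return set(re.findall(r"[a-z0-9-]+", text.lower()))


def looks_like_identifier(token):
    """Heuristic (an assumption): identifiers mix letters and digits."""
    return any(ch.isdigit() for ch in token) and any(ch.isalpha() for ch in token)


def route(query):
    """Pick a retrieval mechanism based on the shape of the query."""
    return "exact" if any(looks_like_identifier(t) for t in tokens(query)) else "overlap"


def exact_retrieve(query, corpus):
    """Return passages that mention any identifier found in the query."""
    ids = {t for t in tokens(query) if looks_like_identifier(t)}
    return [p for p in corpus if ids & tokens(p.text)]


def overlap_retrieve(query, corpus, top_k=1):
    """Rank passages by shared tokens; drop passages with no overlap at all."""
    q = tokens(query)
    ranked = sorted(corpus, key=lambda p: len(q & tokens(p.text)), reverse=True)
    return [p for p in ranked[:top_k] if q & tokens(p.text)]


def answer_with_citations(query, corpus=CORPUS):
    """Route the query, retrieve, and report which sources support the result."""
    mechanism = route(query)
    hits = exact_retrieve(query, corpus) if mechanism == "exact" else overlap_retrieve(query, corpus)
    return {
        "mechanism": mechanism,
        "citations": [f"{p.doc_id} section {p.section}" for p in hits],
    }
```

Returning an empty citation list when nothing matches, instead of a best guess, is one simple way to make an unsupported answer visible to a downstream verification step.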

⚙️ Technical Stack

  • Modeling & Research: PyTorch, CUDA, Transformers
  • LLM Systems: LangChain, LangGraph, Hugging Face
  • Systems & Tools: Linux, Docker, Git, NVIDIA profiling tools

Pinned

  1. Image-Generation-with-Stable-Diffusion-v1.5 (Public)

    This project uses the Stable Diffusion v1.5 model from RunwayML to generate high-quality images from descriptive text prompts. Built with the diffusers library, it supports GPU acceleration, negati…

    Jupyter Notebook 1

  2. Deep-Learning-Convolutional-Neural-Network-Weather-Prediction (Public)

    A CNN-based project for classifying weather phenomena using a dataset of 6,862 images across 11 categories. Includes data preprocessing, model training, evaluation, and visualization. Ideal for wea…

    Jupyter Notebook 1

  3. AI_Tutors_Using_Instruction_Tuned_Models (Public)

    This project builds an AI-powered Physics Tutor using the Dolly-v2-3b instruction-tuned model. It explains physics concepts in simple terms via a Gradio-based user interface. The model leverages Hu…

    Jupyter Notebook 1

  4. SLAM-Algorithm-Development-for-Autonomous-Vehicle-Using-ROS (Public)

    SLAM algorithm development for autonomous vehicles (autonomous mobile robots) using ROS.

    Python 1