Rehan Malik rehan243

About Me — AI/ML Engineer | Generative AI | LLM Systems

i'm an AI/ML engineer based in the US, currently building production AI systems at Reallytics.ai and Verticiti. most of my work revolves around getting large language models to do useful things in production — not toy demos, actual systems handling real traffic.

before this, i spent years at Afiniti and Cloud Kinetics doing the grunt work of making ML models reliable at scale. fraud detection, voice analytics, enterprise search — the kind of stuff that breaks at 3am and you have to fix.

what keeps me going: that moment when an AI agent you built actually solves a problem you didn't explicitly program it for. still hits different every time.

right now i'm deep into:

multi-agent systems that coordinate without falling apart
RAG pipelines that actually find what you're looking for
writing daily about what i learn — AI Engineering Notes

Featured Projects — AI Agents, RAG, LLM Fine-Tuning

Agentic AI Workflows: Production AI Agents
8 specialized AI agents with LangChain + OpenAI function calling. multi-agent orchestration with planning loops and guardrails. the project i'm most excited about.

RAG Enterprise Search: Retrieval-Augmented Generation
production retrieval pipeline over 2TB+ data. LangChain, FAISS, ChromaDB, cross-encoder re-ranking. deployed on AWS SageMaker.

Voice AI Platform: Real-Time Speech AI
real-time voice infrastructure handling 500+ concurrent calls. WebSockets, Apache Kafka, gRPC with CUDA. speech-to-text, sentiment analysis.

LLM Fine-Tuning (LoRA/QLoRA): Parameter-Efficient Fine-Tuning
fine-tuning LLaMA-2 and Mistral with LoRA/QLoRA/PEFT. 40% cost reduction vs hosted APIs. vLLM serving on SageMaker.

RLHF LLM Optimization: Reinforcement Learning from Human Feedback
full RLHF pipeline: supervised fine-tuning, reward modeling, PPO with KL constraints. 68% win rate, 96% safety compliance.

Sentinel Fraud Detection: Explainable AI
ensemble XGBoost + Isolation Forest with 650+ engineered features. SHAP explainability, UMAP clustering, GenAI reports via Amazon Bedrock.
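
for anyone unfamiliar with the retrieve-then-re-rank pattern named in the RAG Enterprise Search project, here's a minimal sketch of the idea. this is not the project's code: the bag-of-words cosine in `retrieve` is a toy stand-in for a vector index like FAISS, the bigram overlap in `pair_score` is a toy stand-in for a cross-encoder, and every function name and document below is hypothetical. it only uses the standard library so it runs anywhere.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, docs, k=3):
    """Stage 1: cheap recall over all docs (stand-in for vector search).

    Query and documents are encoded independently, so word order is ignored.
    """
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(d))), i) for i, d in enumerate(docs)]
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [i for score, i in scored[:k] if score > 0]


def pair_score(query, doc):
    """Stage 2: score the (query, doc) pair jointly (stand-in for a cross-encoder).

    Counts adjacent word pairs shared by both texts, so word order matters.
    """
    def bigrams(tokens):
        return set(zip(tokens, tokens[1:]))

    return len(bigrams(tokenize(query)) & bigrams(tokenize(doc)))


def search(query, docs, k=3):
    """Retrieve k candidates cheaply, then re-rank only those candidates."""
    candidates = retrieve(query, docs, k)
    # sorted() is stable, so ties keep their stage-1 order
    reranked = sorted(candidates, key=lambda i: -pair_score(query, docs[i]))
    return [docs[i] for i in reranked]


DOCS = [
    "password reset",
    "how to reset your password",
    "reset the router to factory settings",
    "quarterly revenue report",
]

if __name__ == "__main__":
    for doc in search("reset your password", DOCS):
        print(doc)
```

the point of the split: stage 1 is cheap enough to run over everything but ranks the short "password reset" doc first, while stage 2 is only affordable on a handful of candidates but notices that "how to reset your password" matches the query's phrasing and moves it to the top.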

Tech Stack — Python, PyTorch, LangChain, AWS, Docker

i'm not going to pretend i use everything equally. here's what i actually reach for day-to-day:

the full picture:


daily drivers	Python, PyTorch, FastAPI, Docker, Git, VS Code
LLM & GenAI	LangChain, LlamaIndex, HuggingFace Transformers, vLLM, PEFT/LoRA/QLoRA
vector & data	FAISS, ChromaDB, Pinecone, PostgreSQL, MongoDB, Redis, Kafka, Elasticsearch
cloud & MLOps	AWS (SageMaker, Bedrock, Lambda, ECS), GCP Vertex AI, Azure OpenAI
ML frameworks	TensorFlow, scikit-learn, XGBoost, LightGBM, ONNX
infrastructure	Kubernetes, Terraform, GitHub Actions, MLflow, Weights & Biases

GitHub Stats

i commit a lot. sometimes it's good code, sometimes it's "fix: typo in typo fix".

Latest AI Research Articles

i publish research notes daily — not polished papers, just honest writeups of what i'm learning and building. think of it as a public lab notebook for generative AI, LLM fine-tuning, RAG, and agentic systems.

AutoML For Complex Workflows And Pipelines	2026-04-22
Retrieval Augmented Generation RAG Systems At Sc	2026-04-21
Explainability Techniques For Computer Vision Mode	2026-04-20
AI Safety And Alignment Engineering	2026-04-19