About

I am Gautam Rana. Softmaxxing Entropy of Autoregressive World Models

The 10X Philosophy

HUH! Not using that frontal lobe

The Arsenal

  • Flutter & Dart (Mobile Architecture)
  • Python & JS (Scalable Systems)
  • Rust (Text Encoder Inference)
  • Machine Learning (Edge AI, ONNX)

Work Experience

Junior AI and ML Engineer

Jan 2025 – Present

Propelius Technologies

  • Architected an end-to-end MLOps pipeline with a fail-safe architecture that ingests product catalog PDFs, processes them through Qwen2.5-VL hosted on NVIDIA A100, and outputs structured data, fully replacing third-party API dependencies.
  • Implemented model serving with vLLM and its PagedAttention, reducing per-page inference latency from 5 minutes to 20 seconds (a 15x speedup) while maintaining output quality at production scale.
  • Wrote production-grade Python across the full pipeline including PDF ingestion, model inference, post-processing, and structured output with error recovery and retry logic.
  • Designed a Dockerized, modular workflow for document extraction and image upscaling with Real-ESRGAN, ensuring consistent and reproducible deployments across staging and production environments.

Technical Content Writer

Jul 2024 – Jan 2025

GeeksforGeeks

  • Authored 20+ technical articles on Git, JavaScript, and System Design, collectively reaching 8,000+ page views.

Education

Bachelor of Science in Computer Science

Expected: 2026
Uka Tarsadia University, Surat, Gujarat

"Talk is cheap. Show me the code."