About
I am Gautam Rana. Softmaxxing Entropy of Autoregressive World Models
The 10X Philosophy
HUH! Not using that frontal lobe
The Arsenal
- Flutter & Dart (Mobile Architecture)
- Python & JS (Scalable Systems)
- Rust (Text Encoder Inference)
- Machine Learning (Edge AI, ONNX)
Work Experience
Junior AI and ML Engineer
Jan 2025 – Present
Propelius Technologies
- Architected a fail-safe, end-to-end MLOps pipeline that ingests product catalog PDFs, processes them with Qwen2.5-VL hosted on NVIDIA A100, and outputs structured data, fully replacing third-party API dependencies.
- Served the model with vLLM and its PagedAttention, reducing per-page inference latency from 5 minutes to 20 seconds, a 15x speedup, while maintaining output quality at production scale.
- Wrote production-grade Python across the full pipeline (PDF ingestion, model inference, post-processing, and structured output), with error recovery and retry logic.
- Designed a Dockerized, modular workflow for document extraction and image upscaling with Real-ESRGAN, ensuring consistent, reproducible deployments across staging and production environments.
Technical Content Writer
Jul 2024 – Jan 2025
GeeksforGeeks
- Authored 20+ technical articles on Git, JavaScript, and System Design, collectively reaching 8,000+ page views.
Education
Bachelor of Science in Computer Science
Expected: 2026
Uka Tarsadia University
Surat, Gujarat