We’re hiring a hands-on AI Learning Engineer who can build and fine-tune generative AI models (diffusion and LLMs), vision-language models (VLMs), and classical and deep learning models from scratch, and productionize them end-to-end.
This role blends modeling (you’ll train and fine-tune models) with production systems (MLOps, LLMOps, model optimization, serving, and API/backend work).
You will not only use pre-trained models; you will also design, train, optimize, and serve custom models for production use (GenAI, Stable Diffusion, OCR, theft detection, recommenders, etc.).
Responsibilities
- Develop production inference stacks: convert and optimize models (Torch → ONNX → TensorRT when appropriate), quantize/prune, profile FLOPs and latency, and deliver low-latency GPU inference with minimal accuracy loss.
- Create robust model-serving infrastructure: FastAPI/gRPC services for inference, streaming outputs (token-level streaming for LLMs, frame/segment streaming for CV), model versioning and routing, autoscaling, model rollback, and A/B testing.
- Build CV solutions from scratch: object detection, theft-detection pipelines, OCR (document parsing, structured extraction), and surveillance analytics; integrate and fine-tune Hugging Face pretrained models when beneficial.
- Fine-tune Stable Diffusion and other generative image models for brand- and style-consistent image generation and downstream tasks.
- Train and fine-tune VLMs (vision-language models) for multimodal tasks (captioning, visual QA, multimodal retrieval), using both from-scratch training and transfer learning from HF checkpoints.
- Design, train, and fine-tune GenAI models (LLMs) for use cases such as conversational agents, summarization, retrieval-augmented generation (RAG), and domain adaptation.
- MLOps/LLMOps/AIOps: CI/CD for training and deployment, dataset versioning, experiment tracking, model registry, monitoring (latency, throughput, model drift, data drift), alerting, and automated retraining pipelines.
- Data acquisition and pipeline work: build scrapers/collectors and scalable ingestion pipelines; implement proxy pools, rate-limit handling, and rotation for reliability (with compliance and respect for target sites’ terms).
- Third-party model integration: call and compose third-party inference APIs (Hugging Face, OpenAI, other vendors), and build fallback and hybrid inference strategies that combine local and cloud models.
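To give a flavor of the fallback/hybrid inference strategy mentioned above, here is a minimal sketch in plain Python. All names (`hybrid_infer`, `run_local`, `call_cloud`, the confidence threshold) are hypothetical stand-ins, not a prescribed design; a real version would wrap actual local and vendor inference calls.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class InferenceResult:
    text: str
    confidence: float
    backend: str

def hybrid_infer(
    prompt: str,
    run_local: Callable[[str], InferenceResult],
    call_cloud: Callable[[str], InferenceResult],
    min_confidence: float = 0.7,
) -> InferenceResult:
    """Try the local model first; fall back to a cloud API when the
    local call fails or its confidence is below the threshold."""
    try:
        result = run_local(prompt)
        if result.confidence >= min_confidence:
            return result
    except Exception:
        pass  # local backend unavailable -> fall through to cloud
    return call_cloud(prompt)

# Hypothetical stand-ins for real backends:
def fake_local(prompt: str) -> InferenceResult:
    return InferenceResult(text="local answer", confidence=0.4, backend="local")

def fake_cloud(prompt: str) -> InferenceResult:
    return InferenceResult(text="cloud answer", confidence=0.95, backend="cloud")

# Low local confidence (0.4 < 0.7), so the cloud backend answers:
print(hybrid_infer("hello", fake_local, fake_cloud).backend)  # -> "cloud"
```

The same shape extends naturally to retries, per-request routing, or cost-aware policies.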
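The latency-profiling responsibility above can be sketched with nothing but the standard library. `profile_latency` and `model_fn` are illustrative names (any inference callable works); real GPU profiling would additionally use tools like Nsight, but the warm-up-then-percentile pattern is the same.

```python
import time

def profile_latency(model_fn, inputs, warmup=3):
    """Time each call to model_fn and report p50/p95 latency in ms.
    Warm-up runs (e.g. kernel/cache initialization) are excluded."""
    for x in inputs[:warmup]:
        model_fn(x)
    samples = []
    for x in inputs:
        t0 = time.perf_counter()
        model_fn(x)
        samples.append((time.perf_counter() - t0) * 1000.0)
    samples.sort()
    p50 = samples[len(samples) // 2]
    p95 = samples[int(len(samples) * 0.95) - 1]
    return {"p50_ms": p50, "p95_ms": p95, "n": len(samples)}

# A toy CPU-bound "model" standing in for real inference:
stats = profile_latency(lambda x: sum(i * i for i in range(x)), [10_000] * 50)
print(f"p50={stats['p50_ms']:.2f}ms p95={stats['p95_ms']:.2f}ms over {stats['n']} runs")
```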
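For the token-level streaming mentioned above, a rough sketch of the data shape (assuming Server-Sent Events framing, which a FastAPI `StreamingResponse` can emit): `fake_llm_tokens` is a placeholder for a real model's incremental decode loop.

```python
from typing import Iterator

def fake_llm_tokens(prompt: str) -> Iterator[str]:
    # Placeholder for a real model's incremental decoding.
    for tok in ("Hello", ",", " world", "!"):
        yield tok

def sse_stream(prompt: str) -> Iterator[str]:
    """Wrap model tokens in SSE framing: each event is 'data: ...'
    followed by a blank line, with a sentinel event at the end."""
    for tok in fake_llm_tokens(prompt):
        yield f"data: {tok}\n\n"
    yield "data: [DONE]\n\n"

chunks = list(sse_stream("hi"))
print(chunks[0])           # "data: Hello\n\n"
print(chunks[-1].strip())  # "data: [DONE]"
```

A real endpoint would return this generator from the web framework instead of materializing it into a list.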
 
Required qualifications:
- Strong experience with computer vision: object detection, segmentation, OCR pipelines (training from scratch and transfer learning).
- Deep knowledge of model optimization: quantization, pruning, distillation, FLOPs analysis, CUDA profiling, mixed precision (AMP), and inference-time tradeoffs.
- Demonstrated ability to design and implement models from scratch (not only using pretrained checkpoints): architecture design, loss selection, training loops, evaluation metrics.
- Practical experience training and fine-tuning LLMs (transformers) and generative image models (Stable Diffusion or other diffusion frameworks).
- Experience exporting and running models with ONNX, TensorRT, and TorchScript, plus familiarity with Triton, TorchServe, or ONNX Runtime for production serving.
- Hands-on experience with GPU infrastructure and CUDA (profiling with nvprof/Nsight, memory management, multi-GPU training).
- Solid backend engineering skills: Python, FastAPI (or Flask), asynchronous programming, WebSockets/SSE, REST design.
- Containerization and orchestration: Docker, Kubernetes, Helm, and experience deploying GPU workloads to AWS/GCP/Azure or on-prem.
- Good understanding of classical ML (scikit-learn): regression, classification, clustering; able to design experiments and baselines.
- Strong software engineering practices: unit tests, CI/CD, code reviews, reproducibility.
- Excellent communication skills; able to explain ML tradeoffs to product and frontend teams.

Preferred / Nice-to-have:
- Knowledge of privacy-preserving ML (DP, federated learning) or regulatory constraints for data handling.
- Experience with logging and observability: Prometheus, Grafana, Sentry, OpenTelemetry.