Talent.com
Software Engineer - AI / ML

Devsinc • Lahore, Punjab, PK
14 days ago
Job description

We’re hiring a hands-on AI Learning Engineer who can build and fine-tune generative AI models (diffusion models and LLMs), vision-language models (VLMs), and classical and deep models from scratch, and productionize them end-to-end.

This role blends modeling (you’ll train and fine-tune models) with production systems (MLOps, LLMOps, model optimization, serving, and API / backends).

You will not only use pre-trained models; you will also design, train, optimize, and serve custom models for production use (GenAI, Stable Diffusion, OCR, theft detection, recommenders, etc.).

Responsibilities

  • Develop production inference stacks : convert & optimize models (Torch → ONNX → TensorRT when appropriate), quantize / prune, profile FLOPs and latency, and deliver low-latency GPU inference with minimal accuracy loss.
  • Create robust model serving infrastructure : FastAPI / gRPC services for inference, streaming outputs (token-level streaming for LLMs, frame / segment streaming for CV), model versioning and routing, autoscaling, model rollback and A / B testing.
  • Build CV solutions from scratch : object detection, theft-detection pipelines, OCR (document parsing, structured extraction), and surveillance analytics; integrate and fine-tune Hugging Face pretrained models when beneficial.
  • Fine-tune Stable Diffusion and other generative image models for brand / style-consistent image generation and downstream tasks.
  • Train and fine-tune VLMs (vision-language models) for multimodal tasks (captioning, visual QA, multimodal retrieval), using both from-scratch training and transfer learning from HF checkpoints.
  • Design, train & fine-tune GenAI models (LLMs) for use cases such as conversational agents, summarization, retrieval-augmented generation (RAG), and domain adaptation.
  • MLOps / LLMOps / AIOps : CI / CD for training & deployment, dataset versioning, experiment tracking, model registry, monitoring (latency, throughput, model drift, data drift), alerting, and automated retraining pipelines.
  • Data acquisition & pipeline work : build scrapers / collectors and scalable ingestion pipelines; implement proxy pools, rate-limit handling, and rotation for reliability (while complying with target sites' terms).
  • Third-party model integration : call and compose third-party inference APIs (Hugging Face, OpenAI, other vendors), build fallback & hybrid inference strategies that combine local and cloud models.

Required qualifications :

  • Strong experience with computer vision : object detection, segmentation, OCR pipelines (training from scratch and transfer learning).
  • Deep knowledge of model optimization : quantization, pruning, distillation, FLOPs analysis, CUDA profiling, mixed precision (AMP), and inference time tradeoffs.
  • Demonstrated ability to design & implement models from scratch (not only using pretrained checkpoints) : architecture design, loss selection, training loops, evaluation metrics.
  • Practical experience training and fine-tuning LLMs (transformers) and generative image models (Stable Diffusion or diffusion frameworks).
  • Experience exporting & running models with ONNX, TensorRT, TorchScript, and familiarity with Triton, TorchServe, or ONNX Runtime for production serving.
  • Hands-on experience with GPU infrastructure and CUDA (profiling with nvprof / Nsight, memory management, multi-GPU training).
  • Solid backend engineering skills : Python, FastAPI (or Flask), asynchronous programming, WebSockets / SSE, REST design.
  • Containerization and orchestration : Docker, Kubernetes, Helm, and experience deploying GPU workloads to AWS / GCP / Azure or on-prem.
  • Good understanding of classical ML (scikit-learn) : regression, classification, clustering; able to design experiments and baselines.
  • Strong software engineering practices : unit tests, CI / CD, code reviews, reproducibility.
  • Excellent communication skills and the ability to explain ML tradeoffs to product and frontend teams.

Preferred / Nice-to-have :

  • Knowledge of privacy-preserving ML (DP, federated learning) or regulatory constraints for data handling.
  • Experience with logging & observability : Prometheus, Grafana, Sentry, OpenTelemetry.