Data Scientist (Gen AI)

We’re seeking a highly skilled, hands-on Data Scientist with 4–10 years of experience in applied AI/ML to join our fast-paced team. This role requires deep expertise in transformer architectures and strong fundamentals in model training, fine-tuning, and optimization. You’ll work across modalities (text, audio, video), with the flexibility to specialize in one domain but the adaptability to experiment across others.

The ideal candidate thrives in a startup-style, high-velocity R&D environment, is execution-focused, and demonstrates ownership from architecture to deployment. You’ll run rapid experiments, iterate on state-of-the-art models, and push the boundaries of generative AI in lip-sync, character consistency, audio realism, and video quality — with a research-first, problem-solving mindset.

Remote

Full-time

Responsibilities —

Model Development & Fine-tuning: Run end-to-end training and fine-tuning experiments on transformer-based models (LLMs, Whisper, diffusion, multimodal models), using techniques such as LoRA, SFT, and RLHF.
Domain-Specific Applications:
- Audio: Lip-sync, emotional delivery (shouting, whispering, crying), regional language support.
- Video: Scene/character consistency, output quality comparable to Veo 3 and Sora.
- Text: Extend LLMs to handle regional languages and domain-specific adaptation.
Evaluation & Optimization: Design automated evaluation frameworks for objective quality scoring (images, video frames, audio clips). Balance trade-offs in speed, quality, and efficiency.
Cross-Modality Integration: Experiment with audio-video synchronization, background score integration, and text-to-video alignment.
Research & Experimentation: Stay ahead of rapidly evolving models and tools, testing architectural variations and scaling solutions for production use.
Ownership & Execution: Drive initiatives independently with strong problem-solving, accountability, and first-principles thinking.

Requirements —

Experience: 4–10 years in applied Data Science/ML with a strong focus on generative AI.
Core Fundamentals: Solid grasp of transformer architectures, LLMs, training dynamics, and optimization techniques.
Modality Depth: Expertise in at least one modality (text, audio, or video), with demonstrable end-to-end project experience.
Hands-On Skills: Strong coding and debugging ability in Python, with deep learning frameworks (PyTorch, TensorFlow).
Deployment Knowledge: Experience building and serving ML inference pipelines in production (FastAPI or similar).
Evaluation Metrics: Proven ability to design/implement automated evaluation methods for generative outputs.
Adaptability: Ability to experiment quickly with new tools, libraries, and models in a dynamic environment.

Benefits —

What you get —

Best-in-class salary: We hire only the best, and we pay accordingly.
Proximity Talks: Meet other designers, engineers, and product geeks — and learn from experts in the field.
Keep on learning with a world-class team: Work with the best in the field, challenge yourself constantly, and learn something new every day.

About us —

We are Proximity, a global team of coders, designers, product managers, geeks, and experts. We solve complex problems and build cutting-edge tech at scale. Our team of Proxonauts is growing quickly, which means your impact on the company’s success will be huge. You’ll have the chance to work with experienced leaders who have built and led multiple tech, product, and design teams. Here’s a quick guide to getting to know us better:

Watch our CEO, Hardik Jagda, tell you all about Proximity.
Read about Proximity’s values and meet some of our Proxonauts here.
Explore our website, blog, and the design wing — Studio Proximity.
Get behind-the-scenes with us on Instagram! Follow @ProxWrks and @H.Jagda.