PhD GenAI Research Scientist Intern

Databricks
Internship
On-site
San Francisco, California, United States

Company Description:

At Databricks, we are obsessed with enabling data teams to solve the world’s toughest problems, from security threat detection to cancer drug development. We do this by building and running the world’s best data and AI platform, so our customers can focus on the high-value challenges that are central to their own missions.

The Mosaic AI organization enables companies to develop AI models and systems using their own data, with technologies ranging from fine-tuning LLMs for enterprise domains, to a platform for building compound AI systems that use retrieval and agents. Mosaic AI is committed to the belief that a company’s AI models are just as valuable as any other core IP, and that high-quality AI models should be available to all.

Job description:

Most of the world's data+AI problems lie in enterprise domains, behind closed doors. Our research team's goal is to push the frontier of "domain adaptation" - how can we develop LLMs and AI systems that work well for custom domains? To do this, we are tackling open research problems on a range of topics, including scaling and automating evaluation, fine-tuning with synthetic data, retrieval augmentation, and fast, efficient inference.

You will work with our research team on projects focused on adapting LLMs and AI systems to enterprise domains. This may include:

  • Adapting, improving, and evaluating a method from the literature.
  • Designing an entirely new method for domain adaptation.
  • Combining multiple methods to create new recipes for efficient post-training.
  • Evaluating LLMs and AI systems.

Your qualifications and qualities:

  • Required:
    • Research experience in deep learning and proficiency with its fundamentals.
    • Pursuing a PhD in computer science or a related field (electrical engineering, neuroscience, physics, math, etc.).
    • Strong software engineering skills, including proficiency with PyTorch.