Machine Learning Researcher, Multimodal Foundation Models - Jobs at Apple (CH)
Publication date: 27 December 2024
Workload: 100%
Contract type: Permanent
Place of work: Zurich, Zurich
Zurich, Zurich, Switzerland
Hardware
Imagine what you could do here. At Apple, new ideas have a way of becoming extraordinary products, services, and customer experiences very quickly. Bring passion and dedication to your job and there's no telling what you could accomplish. Multifaceted, amazing people and inspiring, innovative technologies are the norm here. The people who work here have reinvented entire industries with Apple hardware products. The same passion for innovation that goes into our products also applies to our practices, strengthening our commitment to leave the world better than we found it.
Join us in this truly exciting era of Artificial Intelligence to help deliver the next groundbreaking Apple products and experiences! We are continuously advancing the state of the art in Computer Vision and Machine Learning, touching all aspects of language and multimodal foundation models, from data collection and curation to modeling, evaluation, and deployment. As a member of our dynamic group, you will have the unique and rewarding opportunity to craft upcoming research directions in the field of multimodal foundation models that will inspire future Apple products. You will be working alongside highly accomplished and deeply technical scientists and engineers to develop state-of-the-art solutions for challenging problems. This is a unique opportunity to help shape the future of Apple products that will touch the lives of many people.
We (Multimodal Intelligence Team) are looking for a machine learning researcher to work in the field of Generative AI and multimodal foundation models. Our team has an established track record of shipping features that leverage multiple sensors, such as Face ID, RoomPlan, and hand tracking in Vision Pro, as well as a strong research presence in the multimodal AI community. Our publications span multimodal pre-training, vision-language models, video-language models, and multimodal alignment. We are focused on building experiences that demonstrate the power of our sensing hardware as well as large foundation models.
Description
This position requires a highly motivated person who wants to help us advance the field of generative AI and multimodal foundation models. You will be responsible for designing, implementing, and evaluating foundation models based on the latest advancements in these fields, taking into account future hardware design and product needs. In addition, you will have the opportunity to engage and collaborate with several teams across Apple to deliver the best products.
Minimum Qualifications
- Strong experience in deep learning with demonstrated work in at least one area of multimodal systems (e.g., vision, language, or video).
- Proficiency in Python and in a modern deep learning framework such as PyTorch or JAX.
- Ability to work in a collaborative environment.
- Ability to communicate the results of analyses in a clear and effective manner.
- BS and a minimum of 3 years of relevant industry experience.
Preferred Qualifications
- PhD, or equivalent practical experience, in Computer Science, Computer Vision, Machine Learning, or related technical field.
- Track record of impactful research published at top ML conferences (CVPR, ICCV/ECCV, NeurIPS, ICML, ICLR, etc.).
- Deep expertise in multimodal foundation models.
- Strong research experience in at least one major area of model development (data curation, pre-training, fine-tuning, alignment, or evaluation), particularly as it applies to multimodal systems.
- Experience with large-scale training pipelines, including working with large datasets and scaling models across distributed systems.
- Ability to work independently and drive research projects from conception to completion.