Master Thesis, Semester project or Internship
Publication date: 09 October 2024
Workload: 100%
Place of work: Zurich
Composable Data Management Systems
Ref. 2024_016
Project description
The IBM Research Laboratory is located just 40 minutes away from ETH Zurich. This creates a fantastic opportunity for highly motivated ETH students to join our unique research-corporate environment for a Master Thesis, Semester Project or Internship.
Composable Data Management Systems are a relatively new research area that can be seen as an evolution of today’s data lakehouse architecture, in which storage and data have been decoupled thanks to open data formats. We are now moving towards standardizing the other components of a data management system to make workloads more portable and to help users find the best workload-engine match under cost-performance constraints. One key open-source project in this area is Substrait, which defines an open serialization format for query plans and is already supported by several engines. To achieve this portability of workloads, we see the following challenges that we would like to address:
- cross-engine query optimization considering the different Substrait capabilities supported by an engine
- cross-engine query optimization using learned performance models of engines to assign queries to engines
- framework for defining portable user-defined functions that can be securely executed within any engine
- extensions of the Substrait specification to cover non-relational operations that are common in defining machine learning/AI data pipelines
As part of our team, you will collaborate with experienced Research Scientists and AI Software Engineers who will guide and support you in successfully completing the proposed task. The technology created in our team powers IBM mainstream products, in particular watsonx.data.
Minimum qualifications
- Bachelor’s degree in computer science or machine learning, or equivalent practical experience
- Experience with databases / data management systems, SQL, relational algebra, query plans
- Team player, self-motivated with a passion for technology and innovation
Preferred qualifications
- Experience with database optimization, machine learning/AI
- 3+ years of proven programming experience in Python or Java
- Experience with Substrait, Apache Calcite, Apache Arrow, WASM
- Independent worker with the ability to effectively operate with flexibility in a fast-paced, constantly evolving team environment
Diversity
IBM is committed to diversity at the workplace. With us you will find an open, multicultural environment. Excellent flexible working arrangements enable all genders to strike the desired balance between their professional development and their personal lives.
How to apply
Please submit your application through the link below. This position is available starting immediately or at a later date.
Contact
IBM Research GmbH