Experienced Data Engineer (M/F)
Job description
Within the SOCOTEC Data & AI Hub, you will join a multidisciplinary team responsible for designing, deploying, and maintaining the group's Data architecture on an international scale.
You will contribute to the modernization of the SOCOTEC Lakehouse, the core of the global analytics platform, and take part in concrete projects that turn data into value, from design through to data visualization.
You will work on three main missions:
- Develop end-to-end data pipelines (ingestion, transformation, modeling, exposure) and contribute to building visualizations in Power BI or Databricks SQL (see the sketch after this list).
- Continuously improve the SOCOTEC Lakehouse, particularly in the areas of governance, quality, and data pseudonymization.
- Experiment with generative AI solutions applied to data, such as Databricks Genie, to turn natural-language queries into actionable insights.
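
As a purely illustrative sketch of what one stage of such a pipeline might look like in PySpark with Delta Lake (the bucket path, table name, and columns below are hypothetical, not SOCOTEC's actual schema):

```python
# Hypothetical ingestion -> transformation -> exposure step.
# Paths, table names, and columns are illustrative only.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()  # provided automatically on Databricks

# Ingestion: read raw CSV files landed in S3 (e.g. via Fivetran).
raw = (
    spark.read
    .option("header", "true")
    .csv("s3://example-bucket/landing/inspections/")  # hypothetical path
)

# Transformation: typing, a basic quality filter, and pseudonymization of a PII column.
clean = (
    raw
    .withColumn("inspection_date", F.to_date("inspection_date"))
    .filter(F.col("site_id").isNotNull())
    .withColumn("inspector_id", F.sha2(F.col("inspector_name"), 256))  # pseudonymize
    .drop("inspector_name")
)

# Exposure: persist as a Delta table that Power BI or Databricks SQL can query.
# (Delta is preconfigured on Databricks; elsewhere it needs the delta-spark package.)
(
    clean.write
    .format("delta")
    .mode("overwrite")
    .saveAsTable("analytics.inspections_clean")  # hypothetical table name
)
```

In practice, a job like this would be versioned in GitLab and deployed through CI/CD as part of a DataOps workflow.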
The position requires full professional fluency in English for day-to-day collaboration with our US teams.
The technical stack we use:
- Amazon Web Services (AWS)
- Databricks
- Fivetran
- Spark for ETL pipelines
- GitLab for version control
- S3
- Power BI for business intelligence, managed jointly with the BI teams
At SOCOTEC, careers are built with you, towards the path that suits you best: deepening your technical expertise, leading a team (data lead), and more.
You will interact with international teams (US, UK, Italy, Spain, Netherlands), and international mobility is possible.
Requirements
- Master's degree in Big Data, Computer Science, or Software Engineering, with a strong specialization in, or affinity for, data and distributed architectures.
- At least 3 years of experience in Data Engineering.
- Strong command of SQL and NoSQL databases (data modeling, query optimization, integrity, and performance).
- Good understanding of Big Data architectures and distributed processing tools (Spark, Hadoop, Airflow, Kafka, Delta Lake, etc.).
- Prior experience with Databricks is a plus.
- Experience with collaborative development environments: Git, GitLab, Jupyter Notebooks, VS Code.
- Knowledge of AWS cloud services is a plus.
- Familiarity with ETL/ELT principles, Data Lakehouse, and DataOps (CI/CD, monitoring, data quality).
- Interest in emerging technologies, particularly Generative AI and its integration into Data platforms.
- Team spirit, rigor, and sense of collaboration in an agile environment.
- Technical curiosity and ability to quickly learn new tools and paradigms.
- Autonomy, a service-oriented mindset, and enthusiasm for solving complex problems.
- Bilingual or native-level English required.