We are looking for a Senior Data/ETL Engineer to join a world-class team on our US-based client's project and tackle one of the industry's most critical infrastructure problems. The client is building a multi-tenant, AI-native platform where enterprise data becomes actionable through semantic enrichment, intelligent agents, and governed interoperability. At the heart of this architecture lies the Data Fabric: an intelligent, governed layer that turns fragmented and siloed data into a connected ontology ready for model training, vector search, and insight-to-action workflows.
The project is an AI-native manufacturing and supply-chain platform that unifies data across pre-production, production, and post-production systems to provide real-time insights and autonomous actions for complex manufacturing customers (automotive and transportation OEMs). The aim is to leverage agents, RAG, and knowledge graphs to address tariff exposure, supply-chain rebalancing, cost, and compliance challenges.
Experience / Skills required:
Must have:
- 5+ years building large-scale data infrastructure in production environments
- Strong programming skills in Golang
- Deep experience with ingestion frameworks (Kafka, Airbyte, Meltano, Fivetran) and data pipeline orchestration (Airflow, Dagster, Prefect)
- Comfortable processing unstructured data formats: PDFs, Excel, emails, logs, CSVs, web APIs
- Experience working with columnar stores, object storage, and lakehouse formats (Iceberg, Delta, Parquet)
- Strong background in knowledge graphs or semantic modeling (e.g. Neo4j, RDF, Gremlin, Puppygraph)
- Familiarity with GraphQL, RESTful APIs, and designing developer-friendly data access layers
- Experience implementing data governance: RBAC, ABAC, data contracts, lineage, data quality checks
- Upper-Intermediate English or better

Good to have:
- Prior work with vector DBs (e.g. Weaviate, Qdrant, Pinecone) and embedding pipelines
- Experience building or contributing to enterprise connector ecosystems
- Knowledge of ontology versioning, graph diffing, or semantic schema alignment
- Familiarity with data fabric patterns (e.g. Palantir Ontology, Linked Data, W3C standards)
- Familiarity with fine-tuning LLMs or enabling RAG pipelines using enterprise knowledge
- Experience enforcing data access policies with tools like OPA, Keycloak, or Snowflake row-level security

Responsibilities:
- Build highly reliable, scalable data ingestion and transformation pipelines across structured, semi-structured, and unstructured data sources
- Develop and maintain a connector framework for ingesting from enterprise systems (ERPs, PLMs, CRMs, legacy data stores, email, Excel, docs, etc.)
- Design and maintain the data fabric layer, including a knowledge graph (Neo4j or Puppygraph) enriched with ontologies, metadata, and relationships
- Normalize and vectorize data for downstream AI/LLM workflows, enabling retrieval-augmented generation (RAG), summarization, and alerting
- Create and manage data contracts, access layers, lineage, and governance mechanisms
- Build and expose secure APIs for downstream services, agents, and users to query enriched semantic data
- Collaborate with ML/LLM teams to feed high-quality enterprise data into model training and tuning pipelines

We offer:
- Competitive salary with regular reviews
- Vacation (up to 20 working days)
- Paid sick leave (10 working days)
- National holidays as paid time off
- Flexible working schedule, remote format
- Direct cooperation with the customer
- Dynamic environment with a low level of bureaucracy and great team spirit
- Challenging projects in diverse business domains and a variety of tech stacks
- Communication with Top/Senior-level specialists to strengthen your hard skills
- Online team-building events