We are looking for a Senior Data Engineer to own the methodology layer of an enterprise AI product.
About the project: The project focuses on building an AI-powered Intelligence Engine on top of an existing optimization platform that collects large-scale session and behavioral data across multiple brands.
The goal is to enhance how this data is used by introducing a system that can analyze patterns, generate insights, and provide actionable recommendations to improve customer experience. The solution combines automated workflows (agents) that process data and apply predefined logic with interactive dashboards that visualize both the insights and the underlying performance metrics.
Overall, the project aims to transform a largely manual, expert-driven optimization process into a more scalable, data-driven system that enables consistent and efficient decision-making.
Start Date: in 2 weeks
Employment type: 1 FTE for the first 2 months, then 0.5 FTE for the remaining 4 months
Project Duration: 6 months
Location: Europe or Ukraine (remote)
Language: English B2 (upper-intermediate)
Requirements:
Snowflake (senior depth):
Row-access policies and masking policies — authoring, testing, debugging in production
VARIANT / semi-structured data — FLATTEN, LATERAL joins, JSON path access
Streams and Tasks — change data capture and scheduled jobs
Clustering keys, Time Travel, zero-copy cloning, Resource Monitors
Snowpark Python — working knowledge
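For candidates gauging the VARIANT / FLATTEN expectation above: Snowflake's LATERAL FLATTEN turns each element of a nested array into its own row, joined to the parent record's fields. A minimal plain-Python sketch of that semantics (the payload shape and field names here are invented for illustration, not from the project):

```python
import json

# Hypothetical payload shaped like a Snowflake VARIANT column: a session
# event carrying a nested array of actions.
raw = json.dumps({
    "session_id": "s-001",
    "brand": "acme",
    "actions": [
        {"type": "click", "target": "cta"},
        {"type": "scroll", "target": "hero"},
    ],
})

def flatten_actions(variant_json):
    """Mimic LATERAL FLATTEN(input => payload:actions): emit one row per
    array element, each joined to the parent-level fields."""
    doc = json.loads(variant_json)
    return [
        {"session_id": doc["session_id"], "brand": doc["brand"], **action}
        for action in doc["actions"]
    ]

rows = flatten_actions(raw)
```

In Snowflake SQL the same traversal would use `payload:actions` path access plus `LATERAL FLATTEN`; the Python version is only meant to make the row-explosion behavior concrete.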
dbt Core (production depth):
Incremental models — merge/append/insert_overwrite strategies, late-arriving data handling
Snapshots
Tests — built-in, custom SQL, generic; authoring discipline
Sources, freshness SLAs, exposures
dbt docs — column-level documentation discipline
CI integration — dbt build on PR, state-based selection
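The "merge" incremental strategy with late-arriving data handling, named above, boils down to upserting by a unique key so that a row arriving days after its event date still lands (or corrects) its record. A toy sketch, with table and column names invented for illustration:

```python
from datetime import date

# Target state of an incremental model, keyed by the dbt unique_key.
target = {
    "e1": {"event_id": "e1", "event_date": date(2024, 1, 1), "value": 10},
}

# A new batch containing a correction and a late-arriving row.
batch = [
    {"event_id": "e1", "event_date": date(2024, 1, 1), "value": 12},   # correction
    {"event_id": "e2", "event_date": date(2023, 12, 30), "value": 7},  # late arrival
]

def merge_incremental(target, batch):
    """MERGE semantics: update when the key matches, insert otherwise."""
    for row in batch:
        target[row["event_id"]] = row  # matched -> update; not matched -> insert
    return target

merged = merge_incremental(target, batch)
```

In dbt Core this corresponds to `materialized='incremental'` with `incremental_strategy='merge'` and a `unique_key`, typically paired with a lookback window in the `is_incremental()` filter so late rows are re-scanned.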
Data modeling:
Dimensional modeling — star schema, fact/dimension design
Slowly Changing Dimensions
Event-sourced / append-only patterns
Multi-tenant modeling — row-level security, tenant-aware aggregation, k-anonymity / suppression patterns
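As a yardstick for the Slowly Changing Dimensions requirement: the Type 2 pattern closes the current row on an attribute change and opens a new versioned row. A minimal sketch (column names are illustrative, not the project's schema):

```python
from datetime import date

# A tiny SCD Type 2 dimension: one current row per customer, with
# validity dates and a current-row flag.
dim = [
    {"customer_id": "c1", "segment": "smb", "valid_from": date(2024, 1, 1),
     "valid_to": None, "is_current": True},
]

def apply_scd2(dim, customer_id, new_segment, as_of):
    """Close the current row and append a new version when the tracked
    attribute changes; do nothing when it is unchanged."""
    current = next(r for r in dim
                   if r["customer_id"] == customer_id and r["is_current"])
    if current["segment"] == new_segment:
        return dim
    current["valid_to"] = as_of
    current["is_current"] = False
    dim.append({"customer_id": customer_id, "segment": new_segment,
                "valid_from": as_of, "valid_to": None, "is_current": True})
    return dim

apply_scd2(dim, "c1", "enterprise", date(2024, 6, 1))
```

This is the same behavior dbt snapshots (listed in the dbt section) automate with `dbt_valid_from` / `dbt_valid_to` columns.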
SQL (advanced):
Window functions including frame specifications
Recursive CTEs for hierarchy traversal
Query plan reading and performance tuning
Complex JSON / VARIANT operations at scale
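The recursive-CTE hierarchy traversal above is relevant here because the platform has parent/child account hierarchies; the pattern starts from a root and repeatedly joins children onto the frontier, one level per recursive step. A Python sketch of that fixed-point loop (account IDs are made up):

```python
# child -> parent edges of a hypothetical account hierarchy.
edges = {
    "child-a": "parent-1",
    "child-b": "parent-1",
    "grandchild-a1": "child-a",
}

def descendants(root):
    """Collect all accounts under `root`, level by level, exactly as a
    WITH RECURSIVE anchor + recursive member would."""
    found, frontier = set(), {root}
    while frontier:  # each pass corresponds to one recursive CTE iteration
        frontier = {child for child, parent in edges.items()
                    if parent in frontier}
        found |= frontier
    return found
```

In Snowflake the same traversal would be a `WITH RECURSIVE` CTE (or `CONNECT BY`) over an accounts table with a `parent_account_id` column.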
Multi-tenant SaaS data architecture:
Production experience with shared-schema, row-level-isolated multi-tenant platforms
Aggregation safety — knowing when cross-tenant aggregates are safe to expose
Suppression thresholds / k-anonymity patterns
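The suppression-threshold / k-anonymity pattern mentioned twice above reduces to a simple rule: expose a cross-tenant aggregate only when enough distinct tenants contribute to it. A minimal sketch, with k=5 as an illustrative threshold rather than the project's actual policy:

```python
K = 5  # minimum distinct contributing tenants (illustrative value)

def peer_benchmark(values_by_tenant):
    """Return the peer average across tenants, or None (suppressed)
    when fewer than K tenants contribute, so no single tenant's value
    can be inferred from the published aggregate."""
    if len(values_by_tenant) < K:
        return None
    return sum(values_by_tenant.values()) / len(values_by_tenant)
```

In the warehouse itself this would typically be enforced in the mart layer with a `HAVING COUNT(DISTINCT tenant_id) >= k` guard, alongside row-access policies for direct isolation.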
Engineering basics:
Python — strong working knowledge
Responsibilities:
Design the Snowflake data model end-to-end (raw → normalized → marts layers)
Build transformations in dbt Core with incremental models, tests, and CI integration
Author row-access policies for multi-tenant data isolation (200+ enterprise customers with parent/child account hierarchies)
Stand up Snowflake Cortex Search for semantic retrieval over the corpus
Handle structured and semi-structured data (VARIANT columns for configuration blobs, rules, and action content)
Implement daily incremental refresh with proper handling of late-arriving data
Build the dbt test suite — schema tests, custom business-logic tests, freshness checks
Validate anonymization boundaries and implement k-anonymity / suppression for peer benchmarks
Project Technology Stack:
Python (pandas, numpy, scipy, statsmodels, scikit-learn)
Jupyter Notebooks + Markdown (core deliverable format)
SQL (Snowflake or similar cloud data warehouse, strong analytical querying)
Process Flow:
HR pre-screen + English check (0.5 h)
Professional interview (1 h)
Intro call with Project Manager (0.5 h)