We are looking for a Senior Data Scientist to own the methodology layer of an enterprise AI product.
About the project: The project focuses on building an AI-powered Intelligence Engine on top of an existing optimization platform that collects large-scale session and behavioral data across multiple brands.
The goal is to enhance how this data is used by introducing a system that can analyze patterns, generate insights, and provide actionable recommendations to improve customer experience. The solution combines automated workflows (agents) that process data and apply predefined logic with interactive dashboards that visualize both insights and underlying performance metrics.
Overall, the project aims to transform a largely manual, expert-driven optimization process into a more scalable, data-driven system that enables consistent and efficient decision-making.
Start Date: in 2 weeks
Employment type: full-time (1 FTE) for the first 2 months, then 0.5 FTE for the remaining 4 months
Project Duration: 6 months
Location: remote (Europe or Ukraine)
Language: English B2 (upper-intermediate)
Requirements:
Applied statistics (senior depth):
Multiple testing correction — Bonferroni, Benjamini-Hochberg, and other FDR-control procedures
Meta-analysis methods — fixed-effects, random-effects, heterogeneity handling
Experimental design — sequential testing, power analysis, variance reduction
Selection bias and publication bias — recognition and mitigation
Bayesian methods at working fluency — priors, posteriors, Bayesian A/B testing
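As a flavor of the multiple-testing depth expected here, a minimal sketch using statsmodels' `multipletests` (the p-values are purely illustrative):

```python
import numpy as np
from statsmodels.stats.multitest import multipletests

# Hypothetical p-values from many concurrent experiment variants
p_values = np.array([0.001, 0.008, 0.020, 0.035, 0.041, 0.300, 0.650, 0.900])

# Benjamini-Hochberg controls the false discovery rate (FDR) at alpha=0.05,
# a less conservative alternative to Bonferroni's family-wise error control
rejected, p_adjusted, _, _ = multipletests(p_values, alpha=0.05, method="fdr_bh")

# Bonferroni for comparison: each p-value is multiplied by the number of tests
_, p_bonferroni, _, _ = multipletests(p_values, alpha=0.05, method="bonferroni")
```

On this toy set, BH rejects more hypotheses than Bonferroni at the same alpha, which is exactly the trade-off a candidate should be able to reason about.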
Pattern discovery and causal inference:
Clustering and peer grouping — k-means or hierarchical clustering for tenant segmentation
Similarity-based retrieval — nearest-neighbors for cross-customer pattern matching
Uplift modeling basics — treatment effect estimation (ATE, CATE) for recommendation scoring
Cold-start handling — Bayesian priors, hierarchical models for low-data tenants
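For the cold-start bullet above, a minimal sketch of Beta-Binomial shrinkage toward a platform-wide prior (all tenant names, rates, and counts are hypothetical):

```python
# Hypothetical tenant conversion data: (conversions, sessions)
tenants = {"big_tenant": (480, 10_000), "new_tenant": (3, 20)}

# A platform-wide rate (assumed ~5%) encoded as a Beta(alpha0, beta0) prior
# with a pseudo-count strength equivalent to 100 sessions
prior_rate, prior_strength = 0.05, 100
alpha0 = prior_rate * prior_strength
beta0 = (1 - prior_rate) * prior_strength

# Posterior mean conversion rate per tenant under the conjugate Beta prior
posterior_means = {
    name: (conv + alpha0) / (sess + alpha0 + beta0)
    for name, (conv, sess) in tenants.items()
}
```

The low-data tenant's noisy 15% raw rate is shrunk sharply toward the prior, while the high-volume tenant's estimate barely moves; the same idea extends to full hierarchical models.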
Time-series analysis & behavioral pattern recognition:
Time-series decomposition and trend detection over experiment lifecycle data
Behavioral sequence analysis — identifying recurring customer interaction patterns across experience types
Temporal aggregation strategies for cross-customer comparison (controlling for seasonality, campaign cycles)
Anomaly detection in experiment performance over time (sudden drops, delayed effects)
Experience lifecycle patterns — ramp-up, plateau, decay detection
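The anomaly-detection bullet above can be sketched with a trailing rolling z-score (synthetic data and an illustrative 4-sigma threshold, not a prescribed method):

```python
import numpy as np
import pandas as pd

# Synthetic daily experiment metric with one injected sudden drop
dates = pd.date_range("2024-01-01", periods=60, freq="D")
values = 100.0 + np.random.default_rng(0).normal(0, 2, 60)
values[45] -= 30  # sudden drop

series = pd.Series(values, index=dates)

# Rolling z-score against the trailing 14-day window, shifted by one day so
# the current point does not contaminate its own baseline
baseline = series.rolling(14)
z = (series - baseline.mean().shift(1)) / baseline.std().shift(1)
anomalies = series.index[z.abs() > 4]
```

Detecting delayed effects or lifecycle decay would build on the same decomposition-plus-baseline idea rather than a fixed threshold.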
Python data science stack:
pandas, numpy, scikit-learn, statsmodels, scipy — fluent, production-level
Jupyter + Markdown — notebook-first; this is the deliverable format
SQL:
Complex queries directly against Snowflake or similar cloud warehouses
Window functions and analytical SQL
Retrieval evaluation (RAG):
Retrieval metrics — recall@k, MRR, nDCG
Golden-set design — constructing evaluation datasets
Hands-on RAG eval experience — Ragas, DeepEval, or custom harnesses
Embedding quality evaluation — similarity distribution analysis, retrieval failure diagnosis
Corpus-specific retrieval challenges — chunking strategy for structured experimentation data, hybrid search (keyword + semantic)
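The retrieval metrics above are small enough to sketch in plain Python (toy golden set and retrieved rankings, purely illustrative):

```python
# Toy golden set: query -> set of relevant doc ids; retrieved lists are ranked
golden = {"q1": {"d1", "d3"}, "q2": {"d7"}}
retrieved = {"q1": ["d3", "d9", "d1", "d2"], "q2": ["d5", "d7", "d8"]}

def recall_at_k(golden, retrieved, k):
    # Fraction of relevant docs appearing in the top-k, averaged over queries
    scores = [len(set(retrieved[q][:k]) & rel) / len(rel)
              for q, rel in golden.items()]
    return sum(scores) / len(scores)

def mrr(golden, retrieved):
    # Reciprocal rank of the first relevant hit, averaged over queries
    total = 0.0
    for q, rel in golden.items():
        for rank, doc in enumerate(retrieved[q], start=1):
            if doc in rel:
                total += 1.0 / rank
                break
    return total / len(golden)
```

Frameworks like Ragas or DeepEval wrap these (and nDCG, faithfulness, etc.), but the role requires being able to define and defend the underlying metric choices.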
Communication and spec-writing:
Writing methodology documents engineers can implement without reinterpretation
Translating technical work for non-technical audiences
Client-facing discovery — comfortable leading SME conversations
Presenting and defending one's own work under expert pushback
Responsibilities:
Design the pattern-detection methodology for cross-customer intelligence mining
Define peer-group taxonomy and benchmark calibration (with client SMEs)
Specify statistical metadata preservation rules for cross-customer outcome comparison
Design the confidence-scoring methodology for AI-generated suggestions
Design retrieval-quality evaluation methodology for RAG (jointly with AI/LLM Engineer)
Handle cross-customer statistical challenges: multiple testing, publication bias, treatment-effect heterogeneity
Write methodology specifications the Data Engineer can implement without reinterpretation
Participate in client-facing discovery sessions (~2 hrs/week) on methodology decisions
Prototype in Jupyter + Markdown; hand off to engineering for productionization
Project Technology Stack:
Python (pandas, numpy, scipy, statsmodels, scikit-learn)
Jupyter Notebooks + Markdown (core deliverable format)
SQL (Snowflake or similar cloud data warehouse, strong analytical querying)
Process Flow:
HR pre-screen + English check (0.5 h)
Professional interview (1 h)
Intro call with Project Manager (0.5 h)