Data Analyst Portfolio

Analytics that answer
the question behind the question.

4+ years embedded in logistics and SaaS analytics. Six in-depth case studies covering real business problems, structured analytical approaches, working code, and measurable commercial outcomes.

4+
Years experience
$300K+
AUD revenue product owned
3
Countries worked across
30+
Live dashboards built
Work
Six case studies
Problem → approach → analysis methodology → findings → business recommendation.
01
SQL · Snowflake | dbt | Looker
Why are 23% of deliveries missing SLA? A carrier performance deep-dive across 200K monthly shipments.
Logistics SaaS · APAC carrier network · 12 carrier partners · 200K+ monthly shipments · enterprise customer SLA penalties at risk
+12%
SLA adherence improvement
−30%
reporting time saved
Business Problem
Operations leadership flagged a 23% SLA miss rate but had no visibility into which carriers, lanes, or parcel types were responsible. Every missed SLA triggered penalty clauses and damaged enterprise customer trust. Existing reporting showed only aggregate on-time rates, which was too blunt to act on. The question was: where exactly is this failing, and is it fixable without renegotiating all 12 carrier contracts?
Analytical Approach
Built a shipment-level SLA model in Snowflake joining delivery events to contracted SLA deadlines and carrier metadata. Segmented failures across four dimensions simultaneously: carrier × region × weight band × day-of-week. Used P95 transit time alongside miss rate to distinguish systemic underperformance from occasional spikes. Automated a daily refresh via a dbt model and surfaced the results in a Looker dashboard for ops leadership.
Key Findings
Two carriers drove 71% of all SLA breaches, both concentrated in last-mile delivery within a single metro region on Fridays. Weight bands above 10kg had a 41% miss rate vs. 9% for sub-5kg parcels. The problem was structurally concentrated, not systemic.
Carrier A metro
78% miss
Carrier B regional
61% miss
Carrier C
24% miss
Carrier D
9% miss
Recommendation
Presented to ops and commercial leads: reroute heavy parcels (>10kg) away from Carrier A in metro on Fridays and trigger a contractual SLA review with Carrier B. Routing change implemented in 3 weeks. 90-day follow-up confirmed a 12% overall SLA improvement and reduced penalty-clause payouts. The Looker dashboard became the standard weekly ops review fixture used by leadership every Monday.
SLA segmentation model: multi-dimensional failure analysis
SQL · Snowflake
WITH shipment_sla AS (
  SELECT
    s.shipment_id,
    s.carrier_id,
    s.region,
    WIDTH_BUCKET(s.weight_kg, 0, 30, 6)       AS weight_band,
    DAYNAME(s.shipped_at)                     AS day_of_week,
    DATEDIFF('hour', s.shipped_at,
             s.delivered_at)                 AS transit_hours,
    c.sla_hours,
    CASE WHEN s.delivered_at <=
              DATEADD('hour', c.sla_hours, s.shipped_at)
         THEN 1 ELSE 0 END                    AS met_sla
  FROM shipments s
  JOIN carrier_sla_contracts c
    ON  s.carrier_id = c.carrier_id
   AND  s.region    = c.region
  WHERE s.shipped_at >= DATEADD(day, -90, CURRENT_DATE())
    AND s.status    = 'delivered'
),
-- P95 transit time surfaces chronic underperformers vs. one-off spikes
segmented AS (
  SELECT
    carrier_id, region, weight_band, day_of_week,
    COUNT(*)                                    AS shipments,
    ROUND(1 - AVG(met_sla), 3)                AS miss_rate,
    AVG(transit_hours)                          AS avg_transit_h,
    PERCENTILE_CONT(0.95) WITHIN GROUP
      (ORDER BY transit_hours)                  AS p95_transit_h,
    SUM(1 - met_sla)                           AS total_missed
  FROM  shipment_sla
  GROUP BY 1,2,3,4
  HAVING shipments > 50
)
SELECT
  *,
  -- Contribution to overall miss volume, not just local rate
  ROUND(total_missed / SUM(total_missed) OVER () * 100, 1)
    AS pct_of_total_misses
FROM  segmented
ORDER BY total_missed DESC
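To read the query output as a concentration story, the per-segment miss counts can be rolled up to carrier level and accumulated. The sketch below does that in pandas. The column names mirror the query above, while `miss_concentration` and the numbers in `toy` are hypothetical and for illustration only, not the project's data.

```python
import pandas as pd

def miss_concentration(segments: pd.DataFrame) -> pd.DataFrame:
    """Roll segment-level misses up to carrier level and rank by share."""
    by_carrier = (
        segments.groupby("carrier_id", as_index=False)["total_missed"].sum()
        .sort_values("total_missed", ascending=False)
        .reset_index(drop=True)
    )
    total = by_carrier["total_missed"].sum()
    by_carrier["pct_of_total_misses"] = (
        by_carrier["total_missed"] / total * 100
    ).round(1)
    # Running share: how few carriers account for most of the misses
    by_carrier["cumulative_pct"] = by_carrier["pct_of_total_misses"].cumsum().round(1)
    return by_carrier

# Illustrative numbers only
toy = pd.DataFrame({
    "carrier_id":   ["A", "A", "B", "C", "D"],
    "weight_band":  [3, 1, 3, 2, 1],
    "total_missed": [300, 150, 250, 200, 100],
})
ranked = miss_concentration(toy)
print(ranked)
```

Ranking by share of total misses rather than by local miss rate keeps small, noisy segments from crowding out the carriers that actually drive penalty volume.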
02
Python · sklearn | SQL · Snowflake | Deepnote
Predicting B2B SaaS churn 8 weeks before the renewal conversation and intervening at scale.
SaaS platform · 4,200 enterprise accounts · 12 months of product usage, billing, and support data · 8.4% monthly churn rate · 3-person CS team unable to monitor all accounts
0.81
AUC-ROC score
~$200K
est. ARR retained
Business Problem
The CS team was losing accounts they never saw coming. By the time a customer raised cancellation intent, the relationship was already broken. With 4,200 accounts and only 3 CS managers, reactive outreach didn't scale. The business needed a weekly risk score per account so the team could prioritise their time on accounts most likely to churn rather than accounts that were already churning.
Analytical Approach
Pulled 12 months of product usage logs, billing events, and support tickets from Snowflake. Ran correlation and exploratory analysis to separate leading from lagging indicators, which proved a critical distinction. Feature adoption breadth was a leading indicator, moving about 8 weeks ahead of churn; login frequency was a lagging indicator that shifted only after the decision had already been made. Used time-based cross-validation (not a random split) to prevent data leakage across temporal cohorts.
Key Findings
Feature adoption breadth was the single strongest predictor. Accounts using fewer than 3 core modules had 4.2× higher churn probability. Billing failures were the second strongest signal. Login frequency, which was the metric the CS team had been tracking manually, was actually the weakest predictor.
Feature adoption
coeff 0.91
Billing failures
coeff 0.73
Support tickets
coeff 0.55
Days since login
coeff 0.31
Recommendation
Recommended a feature adoption campaign for accounts below 3 modules, specifically an in-app onboarding checklist and a CS call at day 45. Weekly risk scores were embedded into the CS team's Deepnote report, ranked by tier. Within two quarters the team attributed approximately $200K ARR saved to early intervention. The CS team stopped tracking login frequency and started tracking adoption breadth, which became a permanent change in how they measured account health.
Churn prediction pipeline: feature engineering and time-safe cross-validation
Python · sklearn · pandas
import pandas as pd
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import TimeSeriesSplit
from sklearn.metrics import roc_auc_score, classification_report

# ── Feature engineering ───────────────────────────────────────
def build_features(usage_df, billing_df, support_df) -> pd.DataFrame:
    features = usage_df.groupby('account_id').agg(
        features_adopted = ('feature_key', 'nunique'),
        days_since_login  = ('last_login_date',
                             lambda x: (pd.Timestamp.today()
                                        - x.max()).days),
        sessions_30d      = ('session_id', 'count')
    )
    billing_feats = billing_df.groupby('account_id').agg(
        billing_failures  = ('status',
                             lambda x: (x == 'failed').sum())
    )
    support_feats = support_df.groupby('account_id').agg(
        support_tickets_30d = ('ticket_id', 'count')
    )
    return features.join([billing_feats, support_feats], how='left').fillna(0)

# ── Model pipeline ────────────────────────────────────────────
FEATURES = ['features_adopted', 'billing_failures',
            'support_tickets_30d', 'days_since_login']

pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('model',  LogisticRegression(
        C=0.5, class_weight='balanced', max_iter=1000
    ))
])

# TimeSeriesSplit prevents future data leaking into training window.
# Assumes X (training features) and y (churn labels) are built upstream
# and sorted oldest to newest, since the split is purely positional.
tscv   = TimeSeriesSplit(n_splits=5)
scores = []
for tr, te in tscv.split(X):
    pipeline.fit(X.iloc[tr], y.iloc[tr])
    p = pipeline.predict_proba(X.iloc[te])[:, 1]
    scores.append(roc_auc_score(y.iloc[te], p))
print(f"Mean AUC: {np.mean(scores):.3f} ± {np.std(scores):.3f}")

# Refit on the full history so scoring does not rely on the last fold's partial fit
pipeline.fit(X, y)

# ── Weekly scoring output for CS team ───────────────────────
df['churn_prob'] = pipeline.predict_proba(df[FEATURES])[:, 1]
df['risk_tier']  = pd.cut(df['churn_prob'],
    bins=[0, .30, .60, 1],
    labels=['low', 'medium', 'high'],
    include_lowest=True)  # keep a probability of exactly 0 in 'low'
# Export top 50 high-risk accounts to Deepnote report
top_risk = df[df['risk_tier'] == 'high'].sort_values(
    'churn_prob', ascending=False).head(50)
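To make the time-safe split concrete, here is a small sketch of what `TimeSeriesSplit` does to row positions. The twelve weekly snapshots are hypothetical; the point is only that every training window ends before its test window begins, which holds only when rows are already sorted by time.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Twelve hypothetical weekly snapshots, already sorted oldest to newest
weeks = np.arange(12)

folds = []
for train_idx, test_idx in TimeSeriesSplit(n_splits=3).split(weeks):
    folds.append((train_idx.tolist(), test_idx.tolist()))
    print(f"train weeks {train_idx.min()}-{train_idx.max()} -> "
          f"test weeks {test_idx.min()}-{test_idx.max()}")
```

A random split would mix later weeks into training and let the model see behaviour that had not yet happened at prediction time, which tends to inflate AUC.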
03
SQL · Snowflake | Metabase
Month-2 retention at 19% against a 35% industry benchmark: diagnosing why customers don't come back.
Southeast Asian e-commerce marketplace · 240K registered customers · 18-month cohort window · strong acquisition growth masking a retention crisis
+15%
M2 retention uplift
Redirected
product investment decision
Business Problem
Revenue was plateauing despite an 18% YoY increase in new customers. Leadership's hypothesis was that the product needed more SKUs. The data told a different story. Month-2 retention was sitting at 19% against a 35% category benchmark, meaning roughly 4 in every 5 newly acquired customers did not return for a second order in month 2. Spending more on acquisition while retention was broken was effectively filling a leaking bucket.
Analytical Approach
Built an 18-month cohort retention matrix in SQL, grouping customers by first-order month and tracking re-purchase behaviour across 18 subsequent periods. Broke retention rates down by four variables: email re-engagement timing post-order, promotional discount usage, product category of first purchase, and delivery experience (on-time vs. late). Each variable was analysed in isolation before being combined, to avoid confounding one effect with another.
Key Findings
Email timing was the dominant variable. Not content, not discounts. Customers emailed within 3–5 days of their first order showed 31% M2 retention vs. 10% for those who received no email. Late delivery had a compounding negative effect: customers who experienced late delivery had 2.3× lower M2 retention regardless of email timing.
Email day 3–5
31% M2 retention
Email day 6–9
22%
Email day 10–14
14%
No email
10%
Recommendation
Two interventions: (1) shift the first re-engagement email from day 7 to day 4 post-delivery, and (2) create a separate recovery sequence for late-delivery customers with a discount offer. A/B tested across 3 cohorts over 6 weeks. The day-4 group showed a 15% M2 retention lift. The SKU expansion budget was redirected toward delivery experience improvements. This was the outcome the analysis made obvious once the cohort view existed.
18-month cohort retention matrix + email timing join
SQL · Snowflake
WITH first_orders AS (
  SELECT
    customer_id,
    DATE_TRUNC('month', MIN(order_date)) AS cohort_month,
    MIN(order_date)                         AS first_order_dt,
    MIN(delivered_at)                       AS first_delivery_dt
  FROM orders WHERE status = 'completed'
  GROUP BY customer_id
),
email_timing AS (
  -- First re-engagement email after first delivery
  SELECT
    e.customer_id,
    DATEDIFF('day', f.first_delivery_dt,
             MIN(e.sent_at))               AS days_to_first_email
  FROM email_sends e
  JOIN first_orders f USING (customer_id)
  WHERE e.sent_at > f.first_delivery_dt
    AND e.campaign_type = 're_engagement'
  GROUP BY e.customer_id, f.first_delivery_dt
),
cohort_activity AS (
  SELECT
    o.customer_id,
    f.cohort_month,
    DATEDIFF('month', f.first_order_dt,
             o.order_date)                  AS month_number
  FROM orders o
  JOIN first_orders f USING (customer_id)
  WHERE o.status = 'completed'
    AND o.order_date > f.first_order_dt
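),
customer_buckets AS (
  -- One row per customer: cohort month plus email-timing bucket
  SELECT
    f.customer_id,
    f.cohort_month,
    CASE
      WHEN et.days_to_first_email IS NULL          THEN 'no_email'
      WHEN et.days_to_first_email <= 2             THEN 'day_0_2'
      WHEN et.days_to_first_email BETWEEN 3 AND 5  THEN 'day_3_5'
      WHEN et.days_to_first_email BETWEEN 6 AND 9  THEN 'day_6_9'
      ELSE 'day_10+'
    END                                           AS email_bucket
  FROM first_orders f
  LEFT JOIN email_timing et ON et.customer_id = f.customer_id
),
cohort_sizes AS (
  -- Denominator is fixed per cohort and bucket, independent of month_number
  SELECT cohort_month, email_bucket,
         COUNT(*)                                 AS cohort_size
  FROM customer_buckets
  GROUP BY 1,2
)
SELECT
  cb.cohort_month,
  cb.email_bucket,
  ca.month_number,
  cs.cohort_size,
  COUNT(DISTINCT ca.customer_id)                  AS retained,
  ROUND(COUNT(DISTINCT ca.customer_id) * 100.0
        / NULLIF(cs.cohort_size, 0), 1)           AS retention_pct
FROM customer_buckets cb
JOIN cohort_sizes cs
  ON  cs.cohort_month = cb.cohort_month
 AND  cs.email_bucket = cb.email_bucket
JOIN cohort_activity ca
  ON  ca.customer_id = cb.customer_id
GROUP BY 1,2,3,4
ORDER BY 1,3,2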
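The same cohort logic can be sanity-checked in pandas on a handful of rows. This is a minimal sketch with hypothetical orders; the column names follow the SQL above, and the key detail is that the cohort size is computed once per cohort rather than inside each month bucket.

```python
import pandas as pd

# Hypothetical completed orders; names mirror the SQL above
orders = pd.DataFrame({
    "customer_id": [1, 1, 2, 3, 3, 4],
    "order_date": pd.to_datetime([
        "2024-01-05", "2024-02-10",   # customer 1 returns in month 1
        "2024-01-20",                 # customer 2 does not return
        "2024-02-03", "2024-04-15",   # customer 3 returns in month 2
        "2024-02-11",                 # customer 4 does not return
    ]),
})

# Calendar-month index, so differences count month boundaries crossed
orders["month_idx"] = orders["order_date"].dt.year * 12 + orders["order_date"].dt.month
orders["cohort_idx"] = orders.groupby("customer_id")["month_idx"].transform("min")
orders["month_number"] = orders["month_idx"] - orders["cohort_idx"]
orders["cohort_month"] = (
    orders.groupby("customer_id")["order_date"].transform("min").dt.strftime("%Y-%m")
)

# Denominator: customers per cohort, fixed across all month_numbers
cohort_size = orders.groupby("cohort_month")["customer_id"].nunique()

# Numerator: distinct customers active in each later month
active = (
    orders[orders["month_number"] > 0]
    .groupby(["cohort_month", "month_number"])["customer_id"].nunique()
    .unstack(fill_value=0)
)
retention_pct = active.div(cohort_size, axis=0).mul(100).round(1)
print(retention_pct)
```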
04
Python · pandas | SQL | Looker Studio
A billing error hiding in $2.4M of monthly freight invoices. Building an automated anomaly detection system.
3PL logistics company · 8 active trade lanes · weekly invoice processing · $2.4M monthly freight spend · 2-day manual month-end reconciliation process
$18K
billing error recovered
−2 days
reconciliation eliminated
Business Problem
Freight invoices from 3PL partners arrived weekly with hundreds of line items across 8 trade lanes. Finance was reconciling them manually at month-end, a 2-day process, with no mechanism to catch abnormal charges in real time. A carrier rate miscoding had gone undetected for 6 weeks before surfacing at quarter close. The business needed automated, real-time anomaly detection that could flag billing issues within days, not weeks.
Analytical Approach
Designed a rolling Z-score model per lane per weight band by computing a 12-week baseline cost-per-kg, then flagging any week where actual cost deviated more than 2.5σ from the baseline. Used a shifted rolling window (excluding the current week) to prevent the anomaly from contaminating its own baseline. Pipeline scheduled weekly in Deepnote via Buildkite; flagged line items pushed into a Looker Studio Monday review dashboard.
Key Findings
In week 3 of operation, the model flagged a 3.1σ cost spike on the SG→MY lane. Investigation revealed a carrier had applied a surcharge rate from a lapsed contract. The error was caught and disputed within 4 days of the invoice date, not 6 weeks. Total recovery: $18,400. Three additional minor anomalies were flagged in subsequent months.
SG→MY ⚑
+3.1σ flagged
ID→SG
+0.4σ
TH→VN
+0.2σ
MY→PH
−0.1σ
Recommendation
Embedded the pipeline into a weekly Monday finance review with email alerts above 3σ. The 2-day month-end reconciliation was retired entirely. Beyond the $18K immediate recovery, the system instilled real-time freight cost governance. Finance now disputes small overcharges that previously compounded unnoticed across quarters. Estimated annual savings from the discipline change outweigh the initial recovery.
Rolling Z-score freight anomaly detector: per lane, per weight band
Python · pandas · numpy
import pandas as pd
import numpy  as np
from dataclasses import dataclass

@dataclass
class AnomalyConfig:
    window: int          = 12    # rolling weeks for baseline
    min_periods: int     = 4     # minimum data before flagging
    sigma_threshold: float = 2.5 # standard deviations to flag
    group_cols: list     = None   # e.g. ['lane', 'weight_band']

def detect_anomalies(
    df: pd.DataFrame,
    cost_col: str,
    date_col: str,
    cfg: AnomalyConfig
) -> pd.DataFrame:
    # sort_values returns a copy, so the caller's frame is left untouched
    df = df.sort_values(cfg.group_cols + [date_col])

    def baseline(series: pd.Series):
        # shift(1) excludes current week from its own baseline
        return series.shift(1).rolling(
            cfg.window, min_periods=cfg.min_periods)

    grp = df.groupby(cfg.group_cols)[cost_col]
    # transform returns one aligned column per call, so the mean and
    # std are computed in two passes rather than unpacked from one
    df['rolling_mean'] = grp.transform(lambda x: baseline(x).mean())
    df['rolling_std']  = grp.transform(lambda x: baseline(x).std())

    # A perfectly flat baseline has zero std; treat that z-score as missing
    df['z_score']   = (df[cost_col] - df['rolling_mean']) \
                       / df['rolling_std'].replace(0, np.nan)
    df['is_anomaly'] = df['z_score'].abs() > cfg.sigma_threshold
    df['deviation_pct'] = (
        (df[cost_col] - df['rolling_mean']) / df['rolling_mean']
        * 100).round(1)
    return df

cfg = AnomalyConfig(group_cols=['lane', 'weight_band'])
result = detect_anomalies(invoice_df, 'cost_per_kg', 'week', cfg)
alerts = result[result['is_anomaly']][[
    'lane', 'week', 'cost_per_kg',
    'rolling_mean', 'z_score', 'deviation_pct'
]]
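The value of the shifted window is easiest to see on a toy series. The sketch below scores the same spike twice, once against a baseline that excludes the current week and once against a baseline that includes it. The `cost` values, the 8-week window, and the `zscore` helper are hypothetical and chosen only to keep the example small.

```python
import pandas as pd

# Hypothetical cost-per-kg series: a stable lane, then one miscoded week
cost = pd.Series([2.00, 2.05, 1.95, 2.02, 1.98, 2.01, 1.99, 2.03, 3.00])

def zscore(series: pd.Series, window: int, shift: int) -> pd.Series:
    """Z-score of each point against a rolling baseline shifted by `shift` rows."""
    base = series.shift(shift).rolling(window, min_periods=4)
    return (series - base.mean()) / base.std()

z_shifted = zscore(cost, window=8, shift=1)    # baseline excludes current week
z_unshifted = zscore(cost, window=8, shift=0)  # baseline includes current week

print(f"spike z-score, shifted baseline:   {z_shifted.iloc[-1]:.1f}")
print(f"spike z-score, unshifted baseline: {z_unshifted.iloc[-1]:.1f}")
```

In this toy series the unshifted score stays under a 2.5σ threshold while the shifted one clears it comfortably, because the spike inflates both the mean and the standard deviation of any window that contains it.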
05
Python · statsmodels | SQL
$1.2M tied up in overstock. Building a segmented demand forecasting system for 12,000 SKUs.
FMCG retail distributor · 12,000 active SKUs · 3 years of weekly sales history · ~15% overstock rate · simultaneous stockouts during seasonal peaks
−8%
overstock reduction
−11%
seasonal stockout events
Business Problem
The buying team was placing orders on intuition and prior-year averages, which created two simultaneous problems: $1.2M in working capital locked in overstock on predictable slow movers, and persistent stockouts on seasonal items during peak periods. A single forecasting model for all 12,000 SKUs wouldn't work because demand profiles ranged from clockwork-stable to highly erratic. The challenge was to treat different SKUs differently.
Analytical Approach
Segmented all 12,000 SKUs into four demand profiles using coefficient of variation (CV) and zero-demand frequency: stable (CV < 0.3), seasonal (CV ≥ 0.3 with detectable seasonality), intermittent (>40% zero-demand weeks), and new SKUs (<26 weeks history). Each profile received its own forecasting method. Stable SKUs used linear trend regression. Seasonal SKUs used multiplicative STL decomposition. Intermittent used Croston's method. New SKUs used category-level analogue curves.
Key Findings
62% of overstock was concentrated in just 8% of SKUs, all stable-demand items where the buying team was systematically over-buffering by 20–30%. Seasonal stockouts were occurring on SKUs where demand signals were visible in the data 6 weeks before the buying team acted. The gap between signal and action was process latency, not data quality.
Stable (MAPE)
8.2%
Seasonal (MAPE)
13.4%
Intermittent
28.1%
New SKUs
34.7%
Recommendation
Recommended a tiered buying process: automate reorder triggers for stable SKUs to remove human over-buffering, and introduce a 6-week-forward seasonal alert surfaced in a weekly buying dashboard. Rolled out first to the top 500 SKUs by working capital value. Within two quarters: overstock on stable SKUs fell 8%, seasonal stockout events dropped 11%, and the buying team's weekly planning time was cut from 3 days to half a day.
SKU segmentation + per-profile forecasting pipeline
Python · statsmodels · pandas
from statsmodels.tsa.seasonal import STL
import pandas as pd
import numpy  as np

def classify_sku(series: pd.Series) -> str:
    if len(series) < 26:
        return 'new'
    cv = series.std() / series.mean() if series.mean() > 0 else 99
    zero_pct = (series == 0).mean()
    if zero_pct > 0.40: return 'intermittent'
    if cv < 0.30:     return 'stable'
    return 'seasonal'

def forecast_stable(series, horizon=12):
    # Linear trend projection for clockwork-demand SKUs
    x = np.arange(len(series))
    slope, intercept = np.polyfit(x, series, 1)
    return slope * np.arange(
        len(series), len(series) + horizon) + intercept

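def forecast_seasonal(series, horizon=12):
    # STL decomposition for seasonal SKUs: separates
    # trend, seasonality, and noise cleanly
    stl  = STL(series, period=52, robust=True).fit()
    trend = np.asarray(stl.trend)
    trend_slope = np.polyfit(np.arange(len(trend)), trend, 1)
    future_trend = np.polyval(
        trend_slope, np.arange(len(trend), len(trend) + horizon))
    # Reuse the seasonal pattern from one year earlier
    # (needs horizon <= 52 and at least 52 weeks of history).
    # Components are recombined additively; a multiplicative variant
    # would fit STL on log-transformed demand instead.
    start = len(trend) - 52
    seasonal_idx = np.asarray(stl.seasonal)[start:start + horizon]
    return np.maximum(future_trend + seasonal_idx, 0)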

# ── Run across all SKUs ──────────────────────────────────────
results = {}
for sku_id, grp in df.groupby('sku_id'):
    s   = grp.set_index('week')['units_sold']
    seg = classify_sku(s)
    if seg == 'stable':
        fc = forecast_stable(s)
    elif seg == 'seasonal':
        fc = forecast_seasonal(s)
    else:
        # Flat mean-demand fallback for intermittent and new SKUs;
        # full Croston and category analogue curves are not shown here
        fc = np.full(12, s.mean())
    results[sku_id] = {'segment': seg, 'forecast': fc}
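For the intermittent segment, a fuller version of Croston's method might look like the sketch below. It smooths non-zero demand sizes and the intervals between them separately, then forecasts their ratio. The function name, the `alpha=0.1` default, and the sample series are assumptions for illustration, not the production configuration.

```python
import numpy as np

def croston_forecast(demand, alpha: float = 0.1, horizon: int = 12) -> np.ndarray:
    """Classic Croston: exponentially smooth demand size and inter-demand
    interval separately, then forecast size / interval per period."""
    demand = np.asarray(demand, dtype=float)
    nonzero_idx = np.flatnonzero(demand > 0)
    if nonzero_idx.size == 0:
        return np.zeros(horizon)          # no demand observed yet

    size = demand[nonzero_idx[0]]         # initial demand-size estimate
    interval = float(nonzero_idx[0] + 1)  # periods until the first demand
    periods_since = 0
    for value in demand[nonzero_idx[0] + 1:]:
        periods_since += 1
        if value > 0:
            # Update both estimates only when demand actually occurs
            size += alpha * (value - size)
            interval += alpha * (periods_since - interval)
            periods_since = 0
    return np.full(horizon, size / interval)

weekly_units = [0, 0, 4, 0, 0, 4, 0, 0, 4]   # hypothetical intermittent SKU
print(croston_forecast(weekly_units)[:3])
```

Dividing by the smoothed interval is what stops the forecast from treating every week as a demand week, which a plain mean of non-zero weeks would do.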
06
dbt · Snowflake | SQL | Looker
Three teams claiming credit for the same revenue. Building a multi-touch attribution model that resolved it with data.
B2B SaaS · $8M ARR · sales, marketing, and partnerships all using different attribution logic · 15+ ad-hoc data requests per month from revenue leaders · no single source of truth
−15
ad-hoc requests/mo
+1 hire
partnerships headcount approved
Business Problem
Marketing, sales, and partnerships each claimed credit for the same closed-won deals, and each was technically correct under their own logic. The absence of a shared attribution model meant revenue forecasting was unreliable, budget allocation was argued rather than decided, and the data team was fielding 15+ Slack requests per month for custom cuts. Answers took 2–3 days, were inconsistent, and eroded leadership trust in analytics. The business needed a modelled, agreed definition of revenue attribution baked into the warehouse.
Analytical Approach
Before writing SQL, ran a stakeholder alignment session with heads of sales, marketing, and partnerships to agree on methodology. Landed on linear multi-touch as the default (equal credit per touch), with first-touch and last-touch views available as alternates. Built the model in dbt, staging CRM touchpoints, normalising channel taxonomy, and joining to Salesforce opportunity data, then surfaced it as a Looker Explore with embedded documentation so each team could self-serve.
Key Findings
Once live, partnerships emerged as contributing to 34% of closed-won ARR through referral touchpoints that had never been captured in the CRM before. Marketing's claimed 60% attribution adjusted to 28% under linear model. The model didn't create conflict. It replaced argument with shared reference. Revenue leaders stopped debating attribution and started discussing what to do with the information.
Sales outbound
38% ARR
Partnerships
34% ARR
Marketing
28% ARR
Recommendation
Delivered the Looker Explore with a 1-hour enablement session for revenue leads. Ad-hoc attribution requests dropped from 15+ to near-zero within 4 weeks. The partnerships finding directly drove a headcount decision: the team was approved to hire a second partnerships manager based on data showing their outsized but previously uncaptured ARR contribution. This is what good analytics looks like: not just answering the question, but changing the decision.
dbt multi-touch revenue attribution: linear, first-touch, last-touch, and time-decay in one model
dbt · SQL · Snowflake
-- models/revenue/fct_revenue_attribution.sql
{{ config(materialized='table', tags=['revenue','weekly']) }}

WITH normalised_touches AS (
  -- Standardise channel taxonomy before attribution
  SELECT
    t.opportunity_id,
    t.touched_at,
    CASE
      WHEN t.source ILIKE '%linkedin%'         THEN 'marketing'
      WHEN t.source ILIKE '%outbound%'         THEN 'sales'
      WHEN t.source ILIKE '%referral%'         THEN 'partnerships'
      ELSE 'other'
    END                                         AS channel
  FROM {{ ref('stg_crm_touchpoints') }} t
  WHERE t.opportunity_id IS NOT NULL
),
touch_windows AS (
  SELECT
    opportunity_id,
    channel,
    touched_at,
    ROW_NUMBER() OVER
      (PARTITION BY opportunity_id
       ORDER BY     touched_at)                AS touch_seq,
    COUNT(*) OVER
      (PARTITION BY opportunity_id)            AS total_touches
  FROM normalised_touches
),
won_opps AS (
  SELECT opportunity_id, arr, close_date,
         owner_team, contract_months
  FROM  {{ ref('stg_opportunities') }}
  WHERE stage = 'Closed Won'
    AND arr   > 0
)
SELECT
  tw.opportunity_id,
  tw.channel,
  o.close_date,
  o.arr,
  o.contract_months,

  -- Linear: equal share to every touchpoint
  o.arr / tw.total_touches                   AS linear_arr,

  -- First-touch: full credit to the opening touchpoint
  CASE WHEN tw.touch_seq = 1
    THEN o.arr ELSE 0 END                    AS first_touch_arr,

  -- Last-touch: full credit to the closing touchpoint
  CASE WHEN tw.touch_seq = tw.total_touches
    THEN o.arr ELSE 0 END                    AS last_touch_arr,

  -- Time-decay (position-based): later touches get linearly higher weight; weights sum to 1
  o.arr * (tw.touch_seq / (
    tw.total_touches * (tw.total_touches + 1) / 2.0))
                                              AS time_decay_arr

FROM  touch_windows tw
JOIN  won_opps o USING (opportunity_id)
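As a quick check on the weighting logic, the four schemes can be reproduced for a single deal in plain Python. This is a sketch under assumed inputs: the `Touch` class, the `attribute` function, and the 60,000 ARR deal are hypothetical. Each scheme should hand out exactly the deal's ARR, no more and no less.

```python
from dataclasses import dataclass

@dataclass
class Touch:
    channel: str
    seq: int   # 1-based position in the opportunity's touch sequence

def attribute(arr: float, touches: list[Touch]) -> dict[str, dict[str, float]]:
    """Split one deal's ARR across channels under four weighting schemes."""
    n = len(touches)
    weight_total = n * (n + 1) / 2          # 1 + 2 + ... + n
    out: dict[str, dict[str, float]] = {}
    for t in touches:
        row = out.setdefault(t.channel, {"linear": 0.0, "first": 0.0,
                                         "last": 0.0, "position": 0.0})
        row["linear"] += arr / n                          # equal share
        row["first"] += arr if t.seq == 1 else 0.0        # opening touch
        row["last"] += arr if t.seq == n else 0.0         # closing touch
        row["position"] += arr * t.seq / weight_total     # later = heavier
    return out

# Hypothetical deal: 60,000 ARR, three touches in order
deal = [Touch("marketing", 1), Touch("partnerships", 2), Touch("sales", 3)]
credit = attribute(60_000, deal)
for channel, row in credit.items():
    print(channel, row)
```

Summing any one scheme across channels should return the deal's ARR, which is a useful invariant to test in the warehouse model as well.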
Technical stack
Tools & methods
The full toolkit across analytics, engineering, and visualisation.
Analytics & BI
Looker | Looker Studio | Metabase | Deepnote | Snowflake
Programming
SQL (Advanced) | Python | R | SPSS | pandas | scikit-learn
Data Engineering
dbt | Estuary Flow | Rudderstack | Buildkite
Analytical methods
Cohort analysis | Logistic regression | Anomaly detection | STL decomposition | A/B testing | Multi-touch attribution
Business tools
Excel (advanced) | SAP
Certifications
Advanced SQL · Kaggle | Python · Kaggle | Data Viz · Kaggle
Background
About me

I'm a Data Analyst based in South Jakarta, Indonesia, with four years embedded in logistics and SaaS analytics across Southeast Asia and Australia. My background in Industrial Engineering shaped how I think about problems. Every dataset is a system, and the real work is understanding the feedback loops before reaching for a model.

The case studies in this portfolio reflect how I work in practice: start with a clear business question, validate assumptions before building, keep methodology explainable to stakeholders who don't speak SQL, and always end with a recommendation someone can act on. Data that doesn't change a decision isn't worth generating.

I'm open to Data Analyst and Analytics roles where data drives product and commercial decisions, not just answers ad-hoc requests.

Jul 2024 – Present
Data Analyst
Shippit · Australia (Remote)
Sep 2022 – Jul 2024
Strategy Analyst
Luwjistik · Singapore (Remote)
Aug – Sep 2022
Account Operations Intern
ORBIZ · Jakarta
2018 – 2022
Industrial Engineering
Universitas Katolik Parahyangan
Open to new opportunities

Open to full-time roles across a range of industries and locations.