What AI Engineering Looks Like in Practice

Four representative engagements across healthcare, manufacturing, finance, and defence. Client names are withheld where contracts require it. The work is real.

Healthcare Python · PyTorch · scikit-learn · peer-reviewed validation

Predicting Disease Onset From Behavioural Data

Problem

A clinical research group needed to flag children at risk of Autism Spectrum Disorder months earlier than standard paediatric screening allowed. Routine screening relies on the Q-Chat-10 questionnaire interpreted by trained clinicians, but specialist capacity was limited and assessment waiting lists ran 3 to 6 months. Many at-risk children passed the early-intervention window before they reached a diagnosis.

Approach

We built a deep-learning classifier that ingests Q-Chat-10 responses and outputs a calibrated probability of ASD diagnosis, benchmarked against the same expert-clinician ground truth used in routine practice. Multiple architectures (an MLP, gradient-boosted trees, an attention-based model) were compared on held-out validation cohorts; the winner was packaged behind a clinician-facing tool with a confidence interval on every prediction and a one-click "I disagree" feedback path that flowed back into the training pipeline.
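
As a rough illustration of how a prediction can carry an interval, the sketch below trains a small bootstrap ensemble and reports the spread of its outputs. Everything in it is an assumption made for illustration: the source does not say how the interval was computed, the data is synthetic (ten binary answers with a made-up labelling rule), and a plain logistic regression stands in for the deep-learning model. It does not cover calibration or the architecture comparison.

```python
import math
import random
import statistics

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, labels, epochs=100, lr=0.5):
    """Batch gradient-descent logistic regression (stand-in for the real model)."""
    n, d = len(rows), len(rows[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            grad_b += err
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(model, x):
    w, b = model
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def bootstrap_models(rows, labels, n_models=15, seed=0):
    """Fit one model per bootstrap resample of the training set."""
    rng = random.Random(seed)
    n = len(rows)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_logistic([rows[i] for i in idx], [labels[i] for i in idx]))
    return models

def predict_with_interval(models, x, level=0.9):
    """Return (median probability, lower bound, upper bound) across the ensemble."""
    probs = sorted(predict(m, x) for m in models)
    last = len(probs) - 1
    lower = probs[int((1 - level) / 2 * last)]
    upper = probs[math.ceil((1 + level) / 2 * last)]
    return statistics.median(probs), lower, upper

# Synthetic questionnaire data: 10 yes/no items, hypothetical labelling rule.
rng = random.Random(42)
rows = [[int(rng.random() < 0.3) for _ in range(10)] for _ in range(150)]
labels = [int(sum(r) >= 4) for r in rows]

models = bootstrap_models(rows, labels)
p, lo, hi = predict_with_interval(models, [1, 1, 1, 1, 1, 0, 0, 0, 0, 0])
print(f"risk={p:.2f} (interval {lo:.2f} to {hi:.2f})")
```

A wide interval is a useful signal in its own right: it marks the cases where the ensemble disagrees and a clinician's judgement matters most.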

Outcome

Earlier triage of at-risk children, validated against expert diagnoses and published in a peer-reviewed venue (DOI: 10.61643/c478960). The model is not a substitute for clinical diagnosis; it is a triage tool that gets the right children in front of a specialist sooner.

Manufacturing Time-series fusion · self-supervised pre-training · CMMS integration

Predictive Maintenance on Legacy Equipment

Problem

A mid-size manufacturer was losing six-figure sums per quarter to unplanned downtime on production lines instrumented with patchwork sensor coverage that had grown over the years. Different machines reported to different historians, the data formats varied by vendor, and the maintenance team operated reactively: a fault occurred, then a technician was paged. Each unplanned stop disrupted upstream and downstream work centres.

Approach

We fused vibration, temperature, motor-current, and acoustic streams from heterogeneous sensors into a unified time-series store, then trained per-asset failure-prediction models on the resulting feature set. Where labels were sparse (true faults are rare events) we used self-supervised pre-training on healthy-operation data and fine-tuned on the small set of labelled failure windows the historian had captured. Predictions surfaced in a Slack channel and inside the maintenance team's existing CMMS, hours before the failure event.
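
The idea of learning from healthy-operation data first and using the scarce failure labels last can be sketched in a few lines. This is a deliberately simplified stand-in, not the production models: the self-supervised task here is next-sample prediction with a least-squares linear fit rather than a neural network, the "fine-tuning" step is reduced to choosing an alert threshold from a handful of labelled windows, and the vibration signal is synthetic. All function names are hypothetical.

```python
import math
import random

def fit_next_step(series):
    """Self-supervised pretext task: least-squares fit of x[t] ~ slope * x[t-1] + intercept.
    The targets come from the signal itself, so no failure labels are needed."""
    xs, ys = series[:-1], series[1:]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / var_x
    return slope, mean_y - slope * mean_x

def residual_rms(model, window):
    """RMS next-step prediction error; large values mean the window looks unlike healthy data."""
    slope, intercept = model
    sq = [(y - (slope * x + intercept)) ** 2 for x, y in zip(window[:-1], window[1:])]
    return math.sqrt(sum(sq) / len(sq))

def pick_threshold(model, healthy_windows, failure_windows):
    """Use the few labelled windows to place a cut-off midway between the two classes."""
    worst_healthy = max(residual_rms(model, w) for w in healthy_windows)
    mildest_failure = min(residual_rms(model, w) for w in failure_windows)
    return (worst_healthy + mildest_failure) / 2

def synthetic_vibration(rng, n, noise):
    """Hypothetical sensor trace: a steady oscillation plus Gaussian noise."""
    phase = rng.uniform(0, 2 * math.pi)
    return [math.sin(0.2 * t + phase) + rng.gauss(0, noise) for t in range(n)]

rng = random.Random(7)

# Step 1: learn normal behaviour from plentiful unlabelled healthy data.
model = fit_next_step(synthetic_vibration(rng, 2000, noise=0.05))

# Step 2: calibrate on the small labelled set (faulty windows are noisier here).
labelled_healthy = [synthetic_vibration(rng, 200, noise=0.05) for _ in range(5)]
labelled_failures = [synthetic_vibration(rng, 200, noise=0.5) for _ in range(3)]
threshold = pick_threshold(model, labelled_healthy, labelled_failures)

def is_alert(window):
    return residual_rms(model, window) > threshold

print(f"alert threshold on residual RMS: {threshold:.3f}")
```

The design point carries over to larger models: most of the learning happens on data that is cheap to collect, and the rare labelled failures are spent only on the final decision boundary.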

Outcome

Several hours of warning on developing faults, long enough to fold maintenance into a planned changeover instead of paging a technician at 2 a.m. The team moved from reactive to predictive maintenance across the instrumented asset set; the next phase extends coverage to assets still on patchwork sensing.

Finance Gradient-boosted trees · Kafka · SHAP explanations · sub-100ms p99

Real-Time Fraud Detection at Transaction Scale

Problem

A payments operator processing high-volume card-not-present transactions needed sub-100ms risk decisions on every transaction, plus a per-decision explanation that would hold up in front of a regulator and in a chargeback dispute. Their existing rules engine had become a brittle accretion of patches, and false positives were eating into legitimate revenue.

Approach

We deployed a gradient-boosted tree model behind a low-latency streaming pipeline (Kafka into a Flink-flavoured inference service, sub-50ms p99 at peak load). Every decision is paired with a SHAP-based explanation showing the top contributing features, retained in an immutable audit log for the regulatory review trail. A kill-switch in the operator dashboard reverts to the prior rules engine in seconds if model behaviour ever looks wrong. Drift monitoring on input features and on decision distributions runs continuously.
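
The audit trail and kill-switch described above can be sketched as follows. This is an illustrative design under stated assumptions, not the deployed system: the log is made tamper-evident with a SHA-256 hash chain (one common way to approximate immutability; the source does not name the mechanism), the per-feature contributions are supplied by a toy function standing in for a real SHAP explainer, and every class and function name is hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64

def _digest(prev_hash, record):
    payload = json.dumps(record, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()

class AuditLog:
    """Append-only log in which each entry commits to the hash of the previous one."""
    def __init__(self):
        self.entries = []

    def append(self, record):
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS
        self.entries.append(
            {"record": record, "prev": prev_hash, "hash": _digest(prev_hash, record)}
        )

    def verify(self):
        """Recompute the chain; any edited or reordered entry breaks it."""
        prev_hash = GENESIS
        for entry in self.entries:
            if entry["prev"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["record"]):
                return False
            prev_hash = entry["hash"]
        return True

def top_features(contributions, k=3):
    """Keep the k contributions with the largest absolute value."""
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return [[name, value] for name, value in ranked[:k]]

class DecisionService:
    def __init__(self, model_fn, rules_fn, log):
        self.model_fn = model_fn
        self.rules_fn = rules_fn
        self.log = log
        self.model_enabled = True  # the kill-switch: set False to fall back to rules

    def decide(self, txn):
        if self.model_enabled:
            score, contributions = self.model_fn(txn)
            record = {"txn_id": txn["id"], "engine": "model", "score": score,
                      "top_features": top_features(contributions)}
        else:
            score = self.rules_fn(txn)
            record = {"txn_id": txn["id"], "engine": "rules", "score": score,
                      "top_features": []}
        record["decision"] = "block" if score >= 0.5 else "allow"
        self.log.append(record)
        return record["decision"]

def toy_model(txn):
    """Stand-in for model plus explainer: base value plus additive contributions."""
    contributions = {
        "amount": 0.3 if txn["amount"] > 1000 else -0.1,
        "new_device": 0.25 if txn["new_device"] else -0.05,
        "country_mismatch": 0.2 if txn["country_mismatch"] else -0.05,
    }
    return 0.2 + sum(contributions.values()), contributions

def toy_rules(txn):
    """Stand-in for the legacy rules engine."""
    return 1.0 if txn["amount"] > 5000 else 0.0
```

Recording which engine made each decision matters as much as the explanation itself: after a rollback, the audit trail still shows exactly which transactions were scored by the model and which by the rules.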

Outcome

Sub-100ms decision latency at peak load with explainability per decision and a documented rollback path. The compliance team gets the audit log they need without operators writing one-off queries against production logs.

Defence Multi-source NLP · semantic clustering · analyst-feedback learning

Open-Source Threat Intelligence at Scale

Problem

A defence customer was drowning in open-source signal: news wires, social posts, technical reports, and leak dump sites, far more than an analyst team could read in a day. The items that mattered were buried in a torrent of routine noise, and analysts were spending the day on triage instead of analysis.

Approach

We built a multi-source NLP pipeline that ingests open-source feeds, deduplicates near-identical content, classifies by topic and threat type, and clusters semantically related items. A relevance ranking model trained on analyst feedback surfaces the handful of items that fit the customer's standing intelligence requirements; everything else flows into a searchable archive. Every surfaced item links back to its source document with the path the ranking decision took, so analysts can audit and correct the model.
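
The deduplication step can be illustrated with word shingles and Jaccard similarity. This is a minimal sketch of one standard technique; the source does not say which method the pipeline uses, and the threshold, shingle size, and sample headlines are assumptions chosen for the example.

```python
def shingles(text, k=3):
    """Set of overlapping k-word sequences from lower-cased text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def deduplicate(docs, threshold=0.6):
    """Keep a document only if it is not too similar to any document already kept."""
    kept = []  # list of (document, shingle set) pairs
    for doc in docs:
        candidate = shingles(doc)
        if all(jaccard(candidate, seen) < threshold for _, seen in kept):
            kept.append((doc, candidate))
    return [doc for doc, _ in kept]

# Hypothetical feed items: the second is a lightly edited copy of the first.
feed = [
    "port authority reports outage at container terminal after cyber incident",
    "port authority reports outage at container terminal after cyber incident overnight",
    "new firmware advisory published for industrial routers used in substations",
]
for item in deduplicate(feed):
    print(item)
```

This pairwise comparison is quadratic in the number of kept documents, so it suits a small batch only; at feed scale, approximate methods such as MinHash with locality-sensitive hashing are the usual replacement.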

Outcome

Analyst hours redirected from triage to analysis. The pipeline now ships routinely curated daily briefings while the analyst team refines the standing intelligence requirements and the ranking model retrains on their feedback.

Have a Decision Worth Automating?

Tell us the outcome you're after, not the AI you think you need. We'll come back with a clear path or a clear “no, this isn't an AI problem.” Either is more useful than a sales pitch.