rfguri.
AI · Marketplace · Consumer · Jun. 2018 → Aug. 2020

In-house learning to rank that lifted marketplace conversion.

Built an in-house learning-to-rank system that lifted conversion 3.02%, then led App Experience and launched a secondhand-car vertical.

Letgo
Role
Product Lead, Search and Discovery
Team
Cross-functional · reports to CPO
Timeframe
Jun 2018 to Aug 2020
Stack
Python · TensorFlow · Elasticsearch

01 · The problem

What Letgo actually needed.

Letgo was a high-volume consumer secondhand marketplace where the entire economy ran through one surface: search and the feed. When a buyer couldn't surface the right item fast, the listing didn't sell, the seller didn't come back, and liquidity decayed on both sides. The ranking that decided what millions of buyers saw was rules-driven and recency-weighted. It ordered listings but never learned from what people actually clicked, messaged, and bought. At marketplace scale, even a one-percent relevance gap leaves a large amount of GMV on the table.

02 · Context and insight

The reframe that set the direction.

I joined to lead Search and Discovery, focused on the AI/ML side: information retrieval and classification. The framing insight: relevance is a learnable function, not a static rule set. A secondhand marketplace generates enormous, honest signal, where every click, chat, and sale tells you whether a result deserved its position. We were ranking with heuristics while sitting on the exact behavioral data that should have trained the ranker. Six months in, the work earned a promotion to Product Lead for the App Experience tribe, where the same evidence-first instinct shaped the end-to-end buyer and seller journey and, post-acquisition, the launch of a new car vertical.

03 · The approach

The decisions that mattered.

Treat relevance as a learnable function, and build it in-house

Instead of buying an off-the-shelf ranking service, we built a learning-to-rank (LTR) system in-house. The tradeoff was deliberate: more upfront engineering cost in exchange for a ranker trained on our own behavioral signal, full control of the feature pipeline, and the ability to iterate on relevance as a product surface rather than a vendor black box. For a marketplace where ranking is the product, owning that loop was worth the build.
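To make "trained on our own behavioral signal" concrete, here is a minimal sketch of how logged events could become graded relevance labels for a ranker. It is an illustration, not the production pipeline: the event names, the 0 to 3 grading, and the `build_labels` helper are all assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical grading (an assumption for this sketch): stronger buyer
# intent earns a higher relevance label.
EVENT_GRADE = {"impression": 0, "click": 1, "chat": 2, "sale": 3}


def build_labels(events):
    """Map each (query_id, listing_id) pair to its strongest observed grade.

    `events` is an iterable of (query_id, listing_id, event_type) tuples.
    """
    labels = defaultdict(int)
    for query_id, listing_id, event_type in events:
        key = (query_id, listing_id)
        # Keep the highest grade seen for this query/listing pair.
        labels[key] = max(labels[key], EVENT_GRADE[event_type])
    return dict(labels)


# Made-up log rows, for illustration only.
example_events = [
    ("q1", "listing_a", "impression"),
    ("q1", "listing_a", "click"),
    ("q1", "listing_b", "impression"),
    ("q1", "listing_a", "chat"),
]
example_labels = build_labels(example_events)
```

Labels like these are what a learning-to-rank model would train on, and what an offline metric such as NDCG would score against.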

NDCG as the offline gate, live conversion as the only verdict

We anchored on NDCG to evaluate model candidates against logged behavior before shipping: a fast, cheap inner loop in which a candidate had to clear the bar offline before earning an A/B slot. NDCG climbed 48%, but that was only the leading indicator. The business decision lived in controlled experiments on live traffic, held to a marketplace-wide conversion bar. That is why the 3.02% lift is defensible: it was measured against control across the whole marketplace, not on a cherry-picked cohort.
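For readers who want the metric spelled out, the sketch below computes NDCG@k with the common exponential-gain formulation and wraps it in a simple offline gate. It is illustrative rather than the code we ran: the graded labels, the cutoff `k`, the `min_relative_gain` threshold, and the `passes_offline_gate` helper are assumptions.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k labels, in ranked order."""
    return sum(
        (2 ** rel - 1) / math.log2(pos + 2)
        for pos, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(relevances, k):
    """DCG normalized by the DCG of the ideal ordering of the same labels."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


def mean_ndcg(rankings, k):
    """Average NDCG@k across queries; each ranking is a list of labels."""
    return sum(ndcg_at_k(r, k) for r in rankings) / len(rankings)


def passes_offline_gate(baseline, candidate, k=10, min_relative_gain=0.0):
    """Hypothetical gate: the candidate must beat the baseline's mean NDCG@k
    by more than `min_relative_gain` before it earns an A/B slot."""
    base = mean_ndcg(baseline, k)
    cand = mean_ndcg(candidate, k)
    if base == 0:
        return cand > 0
    return (cand - base) / base > min_relative_gain
```

Each ranking here is the list of relevance labels in the order a ranker returned them, so the baseline and the candidate are scored on the same logged queries.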

Launch a new vertical without breaking the core marketplace

Post-acquisition, the strategic bet was a secondhand-car vertical. Cars aren't phones: higher consideration, different trust requirements, different discovery needs. I managed the reuse-vs-rebuild tradeoff, leaning on existing marketplace, chat, and ranking infrastructure where it transferred while building the car-specific pieces the vertical demanded. The goal: a high-value new vertical that strengthened the consumer experience rather than diluting it.

04 · How it's built

Close to the stack, not above it.

Build-vs-buy was the load-bearing decision. We built the learning-to-rank stack in-house rather than adopting a managed product, so relevance could be iterated as a product instead of gated behind a vendor roadmap. I led the product side: the relevance objective, the NDCG eval gate, the experiment design, and the rollout bar, partnering with ML and platform engineering. On the vertical launch, the central tradeoff was maximizing leverage from existing infrastructure and building net-new only where cars genuinely differed.

Impact
+3.02%
Marketplace conversion lift vs control
+48%
NDCG relevance gain from LTR
100M+
App downloads across US classifieds

The Search and Discovery work proved a thesis cleanly: a marketplace ranking itself with its own behavioral data beats one ranked by rules, and you can prove it offline before risking live traffic. A 48% relevance gain converting into a 3.02% marketplace-wide conversion lift compounds every session, every day. That credibility moved me from owning one surface to leading the App Experience tribe and, after acquisition, standing up the company's bet on a new high-value vertical.

What I’d carry forward

Offline metrics earn you the right to experiment, not the win. NDCG told us where to look, but only the A/B result let us claim it, and keeping those two honest against each other is the whole discipline. Building LTR in-house was right for a marketplace where ranking is the product, but it's a real cost I'd only pay again when the ranking loop is core. Launching cars taught the limit of reuse: shared infrastructure accelerates a vertical right up until its trust and discovery needs diverge.
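The "only the A/B result let us claim it" step can be sketched as a relative-lift calculation plus a two-proportion z-test against control. This is a generic illustration, not the experiment tooling we used. The session and conversion counts below are invented so that the arithmetic lands on a 3.02% relative lift; they are not Letgo data.

```python
import math


def relative_lift(control_conv, control_n, treat_conv, treat_n):
    """Relative change in conversion rate, treatment vs control."""
    p_c = control_conv / control_n
    p_t = treat_conv / treat_n
    return (p_t - p_c) / p_c


def two_proportion_z_test(control_conv, control_n, treat_conv, treat_n):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_c = control_conv / control_n
    p_t = treat_conv / treat_n
    p_pool = (control_conv + treat_conv) / (control_n + treat_n)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / control_n + 1 / treat_n))
    z = (p_t - p_c) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts, for illustration only (not real experiment data).
lift = relative_lift(100_000, 1_000_000, 103_020, 1_000_000)
z, p_value = two_proportion_z_test(100_000, 1_000_000, 103_020, 1_000_000)
```

A lift is only worth claiming when the test on live traffic backs it, which is the point of keeping the offline metric and the experiment honest against each other.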