Endigest AI Core Summary
Pinterest Search presents a methodology for scaling search relevance assessment using fine-tuned LLMs to replace costly human annotation.
•A cross-encoder architecture fine-tunes open-source multilingual LLMs on a 5-level relevance scale using human-annotated data; XLM-RoBERTa-large was selected for its balance of quality and speed
•Pin representation combines titles, descriptions, BLIP image captions, board titles, and engaged query tokens as textual features
•A stratified query sampling design replaces simple random sampling, using a query-to-interest model and popularity segments to define strata
•LLM labeling reduced Minimum Detectable Effects (MDE) from 1.3–1.5% down to ≤0.25%, primarily through variance reduction via stratification
•XLM-RoBERTa-large labels 150,000 rows in 30 minutes on a single A10G GPU; LLM labels achieve 73.7% exact match and 91.7% within-1-point agreement with human annotators
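The link between stratification and a smaller MDE can be illustrated with a short calculation. The sketch below is not Pinterest's implementation: the strata weights, means, and variances are made-up numbers, the function names are hypothetical, proportional allocation is assumed, and the MDE uses the textbook two-arm formula with 5% two-sided significance and 80% power. It only shows why removing between-stratum variance shrinks the detectable effect at a fixed sample size.

```python
import math

# Hypothetical strata: (share of query traffic, mean relevance, within-stratum variance).
# These values are illustrative assumptions, not figures from the article.
STRATA = [
    (0.5, 0.80, 0.02),
    (0.3, 0.50, 0.03),
    (0.2, 0.20, 0.02),
]


def srs_variance(strata, n):
    """Variance of the sample mean under simple random sampling of n queries."""
    overall_mean = sum(w * m for w, m, _ in strata)
    # Law of total variance: within-stratum part plus between-stratum part.
    total_var = sum(w * (v + (m - overall_mean) ** 2) for w, m, v in strata)
    return total_var / n


def stratified_variance(strata, n):
    """Variance of the stratified mean with proportional allocation (n_h = w_h * n).

    sum_h w_h^2 * v_h / n_h simplifies to sum_h w_h * v_h / n, so only the
    within-stratum variance remains.
    """
    return sum(w * v for w, _, v in strata) / n


def mde(variance, z_alpha=1.96, z_power=0.84):
    """Absolute MDE when comparing two independent arms with equal estimator variance."""
    return (z_alpha + z_power) * math.sqrt(2 * variance)


if __name__ == "__main__":
    n = 1000  # queries per arm (assumed)
    print("SRS MDE:       ", mde(srs_variance(STRATA, n)))
    print("Stratified MDE:", mde(stratified_variance(STRATA, n)))
```

In this toy setup most of the variance comes from differences between strata, so stratifying removes it. The much larger labeling volume that an LLM makes affordable would lower the MDE further by increasing n.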
This summary was automatically generated by AI based on the original article and may not be fully accurate.