Client Success Story

Retail Image Annotation: Helping a Competitive Intelligence Firm Deliver Faster, More Reliable Insights

250K+

Annotations Delivered Monthly

98.5%

Annotation Accuracy

Service

  • Image Annotation
  • Data Annotation

Platform

  • Client’s Proprietary Data Annotation Tool

THE CLIENT

A Leading US-Based Competitive Intelligence Company

The client is a well-established competitive intelligence firm based in the United States. They specialize in tracking and analyzing competitors' direct marketing campaigns across email, social media, and online/offline promotions. Their services span multiple sectors, including banking, insurance, retail, credit cards, energy, and telecom — helping brands understand what their competitors are doing and how to respond.

PROJECT REQUIREMENTS

Annotating Retail Promotions at Scale to Calculate Promotional Value

Following a successful data management project, the client reached out to SunTec India for additional support with their annotation workflow: labeling a large volume of retail promotional content at a consistent monthly scale. Their core objective was to identify, annotate, and classify promotions by retail category, such as Entertainment, Food Services, Health & Beauty, Clothing & Accessories, and others, within PDF documents derived from HTML-based email campaigns.

Each identified category had to be marked with bounding boxes using the client's proprietary data annotation tool. In addition to the bounding box, annotators were required to capture additional metadata for each marked element, including the promotional value, brand name, and parent company name. This metadata was fed directly into the client's system to calculate the Relative Promotional Value (RPV) — a metric based on the proportion of the page area each advertisement occupies within the PDF.
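
The area-based idea behind RPV can be illustrated with a minimal sketch. The client's actual formula is not described here, so the code below only computes the share of the page area that a single bounding box covers. The `Annotation` fields, the `area_share` function, and the page dimensions are all hypothetical names and values chosen for illustration, not the client's schema.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    # Illustrative fields only; the client's tool schema is not documented here.
    x: float
    y: float
    width: float
    height: float
    category: str
    brand: str
    parent_company: str
    promotional_value: str


def area_share(ann: Annotation, page_width: float, page_height: float) -> float:
    """Return the fraction of the page area covered by one bounding box."""
    page_area = page_width * page_height
    if page_area <= 0:
        raise ValueError("page dimensions must be positive")
    return (ann.width * ann.height) / page_area


# Hypothetical example: an ad block covering a quarter of a 612 x 792 pt page.
ad = Annotation(
    x=0, y=0, width=306, height=396,
    category="Food Services",
    brand="Example Brand",
    parent_company="Example Holdings",
    promotional_value="20% off",
)
share = area_share(ad, page_width=612, page_height=792)
```

Because the metric depends on box area, a bounding box drawn slightly too large or too small shifts the result directly, which is why box precision mattered as much as correct categorization.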

In essence, the project required a combination of image annotation services, structured metadata tagging, and brand-entity attribution, with a monthly volume target of 250,000 annotations.

PROJECT CHALLENGES

High Volumes, Diverse Retail Categories, and Metadata Precision at Scale

At first glance, the project appeared to be a straightforward PDF labeling task — draw bounding boxes, tag metadata, and deliver. However, the combination of scale, source inconsistency, and the analytical weight the client placed on annotation accuracy made this far more demanding than a typical image labeling engagement. Every annotation directly influenced the client's RPV calculations, which meant even small errors could distort the competitive intelligence their end customers relied on.

  • Category Ambiguity in Multi-Product Advertisements

    Promotional intent is subjective and context-dependent, unlike object detection, where a car is a car. A single ad block might feature a wellness brand promoting beauty products alongside health supplements, or a department store bundling clothing with home décor. Standard annotation guidelines couldn't account for every overlap. Annotators had to interpret the dominant promotional intent of each block and classify it accurately—a task that required understanding of the retail domain, not just image labeling proficiency.
  • Inconsistent Source Quality

    The PDFs were converted from HTML email files received from various sources. This resulted in wildly inconsistent layouts, broken rendering, overlapping content blocks, and varying resolutions across batches. An annotator couldn't build muscle memory around a predictable format because the format changed with almost every new source. This forced the team to treat each batch as a fresh visual puzzle, significantly slowing initial processing speeds until adaptive workflows were established.
  • Brand-Entity Attribution for Emerging and Regional Brands

    While identifying well-known global brands was straightforward, a significant portion of the retail promotions featured regional chains, newer DTC brands, or sub-brands whose parent company relationships were not immediately obvious. Incorrectly attributing a brand to the wrong parent entity would corrupt the client's competitive benchmarking data. This required annotators to verify relationships through external research, adding a brand-entity validation layer that most annotation projects do not require.

OUR SOLUTION

A Dedicated Data Labeling Team Delivering Retail Annotation at Scale

We deployed a dedicated team of 23 members, including image annotation specialists, quality analysts, a domain SME (subject matter expert), and a project manager, all trained on the client's data annotation tool. Our approach combined visual annotation with structured quality control workflows to deliver consistent, high-quality outputs month over month.

1

Tool Training and Onboarding

Our annotators were trained on the client's image annotation platform during an intensive onboarding phase. This included hands-on sessions covering bounding box placement conventions, metadata tagging fields, category classification rules, and the RPV calculation logic. A detailed annotation guideline document was co-developed with the client to serve as the team's operational reference throughout the project.

2

Adaptive Processing for Inconsistent Source Formats

At the start of each new batch, a senior annotator assessed the incoming files for layout patterns, rendering issues, content block overlap, and resolution quality. Based on this assessment, the team received a batch-level format briefing that flagged specific quirks to watch for, such as ad blocks bleeding into adjacent content, broken image rendering, or non-standard page structures. We also created visual reference sheets for recurring formats that new team members could consult during onboarding and existing annotators could refer to when labeling unfamiliar batch formats.

3

Category Identification and Bounding Box Annotation

Our annotators processed each PDF document systematically, scanning the layout, identifying distinct promotional blocks, and drawing precise bounding boxes around each one. Each bounding box was classified into the correct retail category (Entertainment, Food Services, Health & Beauty, Clothing & Accessories, etc.) based on visual and textual cues in the content.

4

Metadata Tagging and Brand-Entity Attribution

Once the bounding boxes were in place, annotators enriched each annotation with the required metadata — including brand name, parent company, and promotional value. We also maintained and regularly updated an internal reference database of retail brands and their parent companies to ensure consistency. When new or unfamiliar brands appeared in the promotional content, annotators performed targeted web research to verify brand-parent relationships before tagging, reducing the risk of misattribution.
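
A brand reference of this kind could be sketched as a simple normalized lookup. This is an assumption about how such a reference might work, not a description of the actual internal database. The `BrandReference` class, its methods, and the brand names are hypothetical. An unknown brand returns `None`, which stands in for "needs manual research before tagging".

```python
from typing import Dict, Optional


def normalize(brand: str) -> str:
    """Fold case and collapse whitespace so spelling variants share one key."""
    return " ".join(brand.casefold().split())


class BrandReference:
    """Hypothetical in-memory map from brand name to parent company."""

    def __init__(self) -> None:
        self._parents: Dict[str, str] = {}

    def record(self, brand: str, parent_company: str) -> None:
        # Store a verified brand-parent relationship.
        self._parents[normalize(brand)] = parent_company

    def parent_of(self, brand: str) -> Optional[str]:
        # None signals that the brand has not been verified yet.
        return self._parents.get(normalize(brand))


ref = BrandReference()
ref.record("Example Brand", "Example Holdings")

known = ref.parent_of("  example   BRAND ")  # variant spelling of a known brand
unknown = ref.parent_of("Unseen Brand")      # not yet researched
```

Normalizing the key before lookup keeps two annotators from recording the same brand twice under slightly different spellings.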

5

Escalating Ambiguous Cases to SMEs

For ambiguous advertisements that straddled multiple categories, we established a formal escalation protocol. Senior annotators and subject matter experts reviewed edge cases and made classification decisions based on predefined rules agreed upon with the client.

Over time, these escalated decisions were documented and fed back into the annotation guidelines, building an expanding library of precedents that reduced ambiguity for the broader team and improved classification speed on similar cases in future batches.

6

Multi-Tier Quality Assurance

We implemented a three-tier quality assurance process to maintain annotation accuracy at scale:

  • Each annotator reviewed their own work for obvious errors before submission.
  • A random sample of annotations from each batch was reviewed by a peer annotator for consistency in category classification and metadata accuracy.
  • Senior QA analysts conducted a final review of each delivery batch, checking for bounding box precision, metadata completeness, and adherence to the client's annotation guidelines.
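
The peer-review tier relies on drawing a random sample from each batch. A minimal sketch of that step follows, assuming a fixed sampling rate. The 10% rate, the `draw_peer_review_sample` function, and the annotation IDs are illustrative assumptions, since the actual sampling rate is not stated.

```python
import random
from typing import List, Optional, Sequence


def draw_peer_review_sample(
    annotation_ids: Sequence[str],
    sample_rate: float = 0.1,
    seed: Optional[int] = None,
) -> List[str]:
    """Pick a random subset of a batch for peer review, without replacement."""
    if not 0 < sample_rate <= 1:
        raise ValueError("sample_rate must be in (0, 1]")
    if not annotation_ids:
        return []
    # Always review at least one annotation from a non-empty batch.
    sample_size = max(1, round(len(annotation_ids) * sample_rate))
    rng = random.Random(seed)  # a seed makes the draw reproducible for audits
    return rng.sample(list(annotation_ids), sample_size)


# Hypothetical batch of 200 annotations, sampled at 10%.
batch = [f"ann-{i}" for i in range(200)]
sample = draw_peer_review_sample(batch, sample_rate=0.1, seed=42)
```

Sampling without replacement ensures the reviewer never checks the same annotation twice within one batch.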
7

Iterative Feedback and Process Refinement

We maintained a continuous feedback loop with the client. Weekly calibration calls addressed evolving classification rules, new retail categories, and changes to the client's data requirements. This iterative process allowed us to refine our annotation criteria over time and stay aligned with the client's analytics objectives, which was critical for a project spanning multiple years.

How We Maintained Annotation Consistency across a 23-Member Team for 3+ Years

In any long-running data annotation project with a large team, interpretation drift is inevitable. Over months and years, different annotators gradually develop slightly different classification habits that erode consistency if left unchecked.

We engineered three interlocking safeguards to prevent this:

  • QA as a Drift Detector: The multi-tier review process didn't just catch errors. It surfaced patterns where individual annotators were trending away from the team's classification baseline.
  • Reference Library as a Drift Corrector: Every escalated edge-case decision was documented and fed back into the guidelines, giving the entire team a shared, evolving reference point rather than relying on individual judgment.
  • Calibration Calls as a Drift Preventer: Weekly sessions with the client ensured that classification standards evolved deliberately and were communicated to the full team simultaneously.
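
One way to picture QA acting as a drift detector is to compare each annotator's reviewer-disagreement rate with the team-wide rate. The sketch below is an illustration under stated assumptions, not the team's actual method. The `flag_drifting_annotators` function, the 5-point margin, and the sample review records are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def flag_drifting_annotators(
    reviews: Iterable[Tuple[str, bool]],
    margin: float = 0.05,
) -> Dict[str, float]:
    """Flag annotators whose disagreement rate exceeds the team rate by `margin`.

    `reviews` holds (annotator_id, reviewer_agreed) pairs from QA checks.
    """
    totals: Dict[str, int] = defaultdict(int)
    disagreements: Dict[str, int] = defaultdict(int)
    for annotator_id, reviewer_agreed in reviews:
        totals[annotator_id] += 1
        if not reviewer_agreed:
            disagreements[annotator_id] += 1

    total_reviews = sum(totals.values())
    if total_reviews == 0:
        return {}

    team_rate = sum(disagreements.values()) / total_reviews
    flagged: Dict[str, float] = {}
    for annotator_id, count in totals.items():
        rate = disagreements[annotator_id] / count
        if rate > team_rate + margin:
            flagged[annotator_id] = rate
    return flagged


# Hypothetical QA results: a1 has 5 disagreements in 100 reviews, a2 has 20 in 100.
reviews = (
    [("a1", True)] * 95 + [("a1", False)] * 5
    + [("a2", True)] * 80 + [("a2", False)] * 20
)
flagged = flag_drifting_annotators(reviews)
```

In this example the team-wide disagreement rate is 12.5%, so only the annotator at 20% is flagged for a calibration check.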

This governance layer is what made it possible to sustain 98.5% accuracy not just in month one, but across 36+ consecutive months of delivery.

PROJECT OUTCOMES

The structured approach to retail annotation and data labeling delivered measurable improvements to the client's promotional analytics pipeline over a project that ran for a little more than 3 years.

250K+ Annotations Delivered Monthly With no missed deadlines or delays

98.5% Annotation Accuracy Maintained throughout the project

50% Shorter Report Lifecycle Improved time-to-insight for end customers

The team adapted and consistently delivered month after month. This became a long-term partnership because the results were so consistent.

- Project Lead

CONTACT US

Scale Your Data Annotation Pipeline with SunTec India

Whether you need image annotation services for object detection, metadata tagging for competitive intelligence, or specialized retail annotation services (like video annotation for in-store customer support), our team delivers precision at scale. With ISO-certified data security practices and 25+ years of domain experience, we help businesses build smarter AI models with high-quality training data. Additionally, we offer a broad range of AI training data services to help enterprise AI initiatives succeed. Request a free consultation to learn how we can support your annotation needs.