The client is a technology company that helps businesses digitize and manage physical assets, including communication towers, wind turbines, pipelines, storage tanks, and other industrial structures. Their platform utilizes data capture technologies (such as drones, cameras, sensors, 3D scanning tools) and artificial intelligence (AI) to create digital replicas of real-world assets, enabling easier inspection, monitoring, and maintenance. This approach enables organizations to enhance safety, minimize maintenance costs, and make informed, data-driven decisions about their infrastructure.
The client required detailed pixel-level image segmentation to identify and label visible corrosion on telecommunication towers and their components. The annotated data was intended to train an AI-based corrosion detection system, which would ultimately support infrastructure management.
Our image annotation team had to label all rust-affected areas on the towers, including the main structure, supporting elements, and base attachments, while excluding any background objects or irrelevant surfaces. All types of corrosion were to be treated equally, with a preference for slightly over-labeling rather than missing small patches.
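The "over-label slightly rather than miss small patches" preference can also be expressed programmatically. The sketch below is a hypothetical illustration, not part of the client's actual pipeline: `dilate_mask` is our own name, and it simply expands a binary corrosion mask outward by one pixel per pass using a 4-connected NumPy dilation.

```python
import numpy as np

def dilate_mask(mask: np.ndarray, iterations: int = 1) -> np.ndarray:
    """Expand a binary corrosion mask by one pixel per iteration
    (4-connected neighborhood), encoding a slight over-labeling bias."""
    out = mask.astype(bool).copy()
    for _ in range(iterations):
        padded = np.pad(out, 1)  # zero-pad so edge pixels have neighbors
        out = (padded[1:-1, 1:-1]          # original pixels
               | padded[:-2, 1:-1]         # neighbor above
               | padded[2:, 1:-1]          # neighbor below
               | padded[1:-1, :-2]         # neighbor left
               | padded[1:-1, 2:])         # neighbor right
    return out
```

A single dilation pass turns an isolated labeled pixel into a 5-pixel cross, so faint rust at a region boundary is included rather than dropped.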
The output needed to be highly precise and consistent, ensuring even minor rust spots were captured clearly to support the client’s corrosion detection model and AI-based infrastructure assessment models.
The project involved several technical and visual challenges that required careful handling to maintain accuracy and consistency across all image annotations.
We approached the challenge by building a foundation of clarity and control: defining exactly what corrosion looks like, using the right tools to capture it with pixel-level precision, and validating every labeled image through a series of quality checks. The result was a dataset the client could trust: consistent, verifiable, and suitable for training an AI model.
Subjectivity was a significant challenge in this image labeling project. Corrosion appeared differently across tower materials (steel, galvanized metal) and environmental conditions (sunlight, shade, moisture). Additionally, what appeared to be rust to one annotator might have been perceived as dirt, paint wear, or shadow by another.
To eliminate this ambiguity and ensure a uniform understanding across the team, we developed a guide as a single, reliable reference for identifying corrosion under various conditions.
The guide contained:
This reference enabled faster onboarding of new team members without compromising quality, simplified reviews for QA personnel, and ultimately improved annotation consistency.
The client’s AI corrosion detection system required extremely detailed training data. Every corroded section of the tower had to be captured with high spatial accuracy across hundreds of images. Even minor boundary errors — such as under-labeling or missing faint rust — could affect model performance.
We used Label Studio, an open-source data labeling platform. Although CVAT also supports pixel-level segmentation, Label Studio was selected for its greater flexibility and collaboration features. Its customizable labeling templates, real-time feedback tools, and user-friendly interface allowed multiple annotators and reviewers to work efficiently while maintaining consistent labeling standards across the dataset.
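For context, Label Studio projects are driven by an XML labeling configuration. The fragment below is a minimal brush-segmentation template of the kind such a project would use; the label name and color are illustrative, since the project's actual template is not shown here.

```xml
<View>
  <Image name="image" value="$image" zoom="true"/>
  <BrushLabels name="corrosion" toName="image">
    <Label value="Corrosion" background="#D4380D"/>
  </BrushLabels>
</View>
```

Because the template is declarative, every annotator sees the same label set and the same brush tool, which helps keep labeling standards consistent across the team.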
Here’s how we proceeded with this data annotation project:
Given the visual complexity of the images and the involvement of multiple annotators, we designed a layered QC process to identify and correct labeling inconsistencies before final delivery.
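A layered QC process like this can be supported by simple automated agreement checks. The sketch below is illustrative only: `mask_iou`, `flag_for_review`, and the 0.85 threshold are our assumptions, not figures from the project. It compares two annotators' binary masks for the same image and flags low-agreement cases for senior review.

```python
import numpy as np

def mask_iou(a: np.ndarray, b: np.ndarray) -> float:
    """Intersection-over-union between two binary segmentation masks."""
    a, b = a.astype(bool), b.astype(bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0  # both masks empty: perfect agreement
    return float(np.logical_and(a, b).sum() / union)

def flag_for_review(a: np.ndarray, b: np.ndarray,
                    threshold: float = 0.85) -> bool:
    """Flag an image for senior review when annotator agreement is low."""
    return mask_iou(a, b) < threshold
```

Running such a check before final delivery surfaces exactly the images where annotators disagreed, so reviewer time is spent where it matters most.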
By the final review, all corrosion labels were consistent and highly accurate, making the dataset ready for use. Additionally, each QC finding was logged in a shared feedback tracker. Corrections and examples of common errors were discussed during daily review sessions, allowing the team to continuously improve labeling accuracy throughout the project.
Raw image vs. annotated image (pixel-level corrosion labels)
- Indicating highly uniform labeling across the dataset.
- Compared to training on previous, non-standardized data.
- Verified through a multi-stage QA process involving self-review, peer checks, and final audits.
“SunTec maintained clear communication, met deadlines, and delivered high-quality labeled data that integrated smoothly with our AI pipeline.”
- Director of Data Operations
No two datasets are alike, which is exactly why we design data labeling solutions that fit your specific AI use case. From multi-class segmentation to object detection, we help computer vision teams turn difficult AI training problems into clean, usable training data.
Try a free sample, or reach out to us to learn more.