Case Study / Architecture & Construction

AI-Powered Building Intelligence

How we combined YOLO, SAM, and CLIP to build a computer vision platform that analyses architectural floorplans in seconds, not hours.

The Challenge

Manual review was the bottleneck

Architectural compliance checking is one of the most time-consuming stages in building design. Every floorplan needs to be reviewed against hundreds of regulations covering fire safety, accessibility, structural clearances, and spatial requirements.

For Arcus, this meant teams of reviewers spending hours on each plan, manually measuring corridors, counting fire exits, and cross-referencing room layouts against building codes. Errors were common. Feedback cycles were slow. Projects stalled waiting for sign-off.

They needed a system that could read a floorplan the way an experienced architect does, but at machine speed and with consistent accuracy.

The Solution

A vision pipeline built for precision

We designed and built a multi-stage computer vision pipeline that combines three specialist AI models, each handling the part of the problem it does best.

YOLO handles fast, accurate object detection across 40+ element types. SAM delivers pixel-precise segmentation for room boundaries and spatial analysis. CLIP provides zero-shot classification, understanding what each space is without needing labelled examples for every room type.

The result is a system that ingests a floorplan, identifies every structural element, understands the spatial layout, and checks compliance against building regulations - in under eight seconds per plan on average.

Architecture

The analysis pipeline

Five stages transform a raw floorplan into structured compliance data. Each stage builds on the previous, creating a progressively richer understanding of the document.
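
As a rough illustration of how five stages can each enrich a shared record, here is a minimal sketch. The PlanAnalysis record, the run_pipeline helper, and the placeholder stage functions are all hypothetical names invented for this example; in the real system each stage would wrap a model or a rules engine rather than write fixed values.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class PlanAnalysis:
    """Accumulates what each stage learns about one floorplan."""
    source: str
    image: str = ""
    detections: list = field(default_factory=list)
    regions: list = field(default_factory=list)
    room_labels: dict = field(default_factory=dict)
    violations: list = field(default_factory=list)
    completed: list = field(default_factory=list)


Stage = Callable[[PlanAnalysis], PlanAnalysis]


def run_pipeline(analysis: PlanAnalysis, stages: list[tuple[str, Stage]]) -> PlanAnalysis:
    """Run the stages in order; each one adds to the same record."""
    for name, stage in stages:
        analysis = stage(analysis)
        analysis.completed.append(name)
    return analysis


# Placeholder stages (hypothetical): each stands in for a real model or engine.
def ingest(a: PlanAnalysis) -> PlanAnalysis:
    a.image = f"normalised:{a.source}"
    return a


def detect(a: PlanAnalysis) -> PlanAnalysis:
    a.detections.append({"id": "door-1", "type": "door"})
    return a


def segment(a: PlanAnalysis) -> PlanAnalysis:
    a.regions.append({"id": "room-1", "area_m2": 12.5})
    return a


def classify(a: PlanAnalysis) -> PlanAnalysis:
    a.room_labels["room-1"] = "kitchen"
    return a


def validate(a: PlanAnalysis) -> PlanAnalysis:
    # Nothing to flag in this toy plan.
    return a


STAGES = [
    ("ingest", ingest),
    ("detect", detect),
    ("segment", segment),
    ("classify", classify),
    ("validate", validate),
]

result = run_pipeline(PlanAnalysis(source="plan-001.pdf"), STAGES)
print(result.completed)
```

Keeping the stages as plain functions over one record makes it straightforward to test or swap a single stage without touching the others.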

01 Ingest

Document intake

Floorplans, CAD exports, and scanned drawings are normalised into a consistent format. OpenCV handles skew correction, noise reduction, and adaptive thresholding to produce clean binary images ready for detection.
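
To show why adaptive thresholding suits unevenly lit scans, the sketch below implements the mean-based variant in plain Python. It is an illustration of the idea only, not the production code: the real pipeline uses OpenCV, and the function name, the block size, the offset c, and the toy image here are assumptions. Neighbourhoods are simply clipped at the image border, which is a simplification.

```python
def adaptive_threshold(gray, block_size=3, c=10):
    """Binarise a greyscale image given as a list of rows (values 0-255).

    A pixel becomes 255 when it is brighter than the mean of its
    block_size x block_size neighbourhood minus c, otherwise 0.
    Neighbourhoods are clipped at the image border.
    """
    if block_size < 3 or block_size % 2 == 0:
        raise ValueError("block_size must be an odd number >= 3")
    height, width = len(gray), len(gray[0])
    radius = block_size // 2
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [
                gray[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            local_threshold = sum(window) / len(window) - c
            row.append(255 if gray[y][x] > local_threshold else 0)
        out.append(row)
    return out


# Toy scan: background brightens from left to right, with one dark wall line.
page = [[100, 110, 120, 20, 140, 150, 160] for _ in range(3)]

adaptive = adaptive_threshold(page)
global_row = [255 if v > 128 else 0 for v in page[0]]

print(adaptive[1])   # [255, 255, 255, 0, 255, 255, 255]
print(global_row)    # [0, 0, 0, 0, 255, 255, 255]
```

On this toy image a single global cut-off at 128 turns the darker left side of the page black, while the local threshold isolates only the wall line.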

02 Detect

Object detection with YOLO

A fine-tuned YOLOv8 model identifies structural elements - walls, doors, windows, columns, staircases, and fire exits - in real time. The model was trained on thousands of annotated floorplans across residential, commercial, and industrial building types.
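
Detectors in the YOLO family typically produce overlapping candidate boxes that are pruned by non-maximum suppression before the results are used. The case study does not describe its post-processing, so the following is only a minimal sketch of that general technique; the Detection type, the thresholds, and the sample boxes are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    label: str
    score: float
    box: tuple  # (x1, y1, x2, y2) in pixels


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(detections, iou_threshold=0.5, min_score=0.25):
    """Keep the highest-scoring box among same-label boxes that overlap."""
    candidates = sorted(
        (d for d in detections if d.score >= min_score),
        key=lambda d: d.score,
        reverse=True,
    )
    kept = []
    for det in candidates:
        overlaps_kept = any(
            det.label == k.label and iou(det.box, k.box) >= iou_threshold
            for k in kept
        )
        if not overlaps_kept:
            kept.append(det)
    return kept


raw = [
    Detection("door", 0.90, (10, 10, 50, 50)),
    Detection("door", 0.60, (12, 12, 52, 52)),    # duplicate of the first door
    Detection("window", 0.80, (12, 12, 52, 52)),  # different class, so it stays
    Detection("column", 0.10, (200, 200, 220, 220)),  # below min_score
]
kept = non_max_suppression(raw)
print([(d.label, d.score) for d in kept])  # [('door', 0.9), ('window', 0.8)]
```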

03 Segment

Precision segmentation with SAM

Meta's Segment Anything Model (SAM) extracts pixel-precise boundaries for each detected element. Room boundaries, corridors, and open-plan areas are isolated with pixel-level accuracy, enabling reliable area calculations and spatial reasoning.
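
Once a room has been isolated as a mask, an area calculation comes down to counting pixels and applying the drawing scale. A minimal sketch follows, assuming the mask is a grid of 0/1 values and that the scale (metres per pixel) is already known from the plan; the function name and the sample values are illustrative only.

```python
def mask_area_m2(mask, metres_per_pixel):
    """Area covered by a binary mask, in square metres.

    mask is a list of rows where a truthy value marks a pixel inside
    the region. metres_per_pixel is the drawing scale (an assumption
    here; in practice it would come from the plan's scale information).
    """
    if metres_per_pixel <= 0:
        raise ValueError("metres_per_pixel must be positive")
    pixel_count = sum(1 for row in mask for value in row if value)
    return pixel_count * metres_per_pixel ** 2


# Toy mask: a 3 x 4 pixel room inside a 5 x 6 pixel crop.
room_mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

area = mask_area_m2(room_mask, metres_per_pixel=0.5)
print(area)  # 3.0
```

Because area scales with the square of the pixel size, a small error in the assumed scale has a larger effect on the result than a few misclassified boundary pixels.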

04 Classify

Semantic understanding with CLIP

CLIP's zero-shot classification identifies room types, labels, and annotations without manual tagging. By matching visual regions against natural language descriptions, the system understands context: distinguishing a kitchen from a bathroom, or a fire escape from a standard exit.
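
The matching step can be pictured as comparing one image embedding against the embeddings of several text prompts and picking the closest. The sketch below shows that comparison only: the three-dimensional vectors are made-up stand-ins for real CLIP embeddings, the prompts and function names are hypothetical, and the temperature of 0.01 is an assumption chosen to resemble the sharp scaling commonly used with CLIP-style similarity scores.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def zero_shot_classify(region_embedding, prompt_embeddings, temperature=0.01):
    """Return the best-matching prompt and a probability for each prompt."""
    sims = {
        prompt: cosine(region_embedding, emb)
        for prompt, emb in prompt_embeddings.items()
    }
    top = max(sims.values())
    # Softmax over scaled similarities; subtracting the max keeps exp() stable.
    exps = {p: math.exp((s - top) / temperature) for p, s in sims.items()}
    total = sum(exps.values())
    probs = {p: e / total for p, e in exps.items()}
    best = max(probs, key=probs.get)
    return best, probs


# Made-up embeddings for illustration; a real system would obtain these
# from the CLIP text and image encoders.
prompts = {
    "a floorplan drawing of a kitchen": [0.9, 0.1, 0.2],
    "a floorplan drawing of a bathroom": [0.1, 0.9, 0.2],
    "a floorplan drawing of a corridor": [0.2, 0.2, 0.9],
}
region = [0.8, 0.2, 0.3]

best, probs = zero_shot_classify(region, prompts)
print(best)  # a floorplan drawing of a kitchen
```

Adding a new room type in this scheme means adding a new prompt, not collecting and labelling new training images.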

05 Validate

Compliance engine

Detected elements and spatial relationships are evaluated against building regulations. Corridor widths, fire exit distances, accessibility clearances, and room proportions are checked automatically, with violations flagged and localised on the original plan.
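
A rule check of this kind can be sketched as a comparison between measured values and configured limits, with each failure recorded alongside its position on the plan. Everything specific below is an assumption for illustration: the rule names, the numeric limits (which are not taken from any building code), the element fields, and the JSON shape.

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical limits for illustration only; real values come from the
# applicable building regulations.
RULES = {"corridor_min_width_m": 1.2, "exit_max_distance_m": 30.0}


@dataclass
class Violation:
    rule: str
    element_id: str
    measured: float
    limit: float
    location_px: tuple  # where to draw the flag on the original plan


def check_plan(elements, rules=RULES):
    """Compare measured values against the limits and collect failures."""
    violations = []
    for el in elements:
        if el["type"] == "corridor":
            limit = rules["corridor_min_width_m"]
            if el["width_m"] < limit:
                violations.append(Violation(
                    "corridor_min_width_m", el["id"],
                    el["width_m"], limit, el["centroid_px"]))
        elif el["type"] == "room":
            limit = rules["exit_max_distance_m"]
            if el["exit_distance_m"] > limit:
                violations.append(Violation(
                    "exit_max_distance_m", el["id"],
                    el["exit_distance_m"], limit, el["centroid_px"]))
    return violations


elements = [
    {"id": "corridor-1", "type": "corridor", "width_m": 1.05, "centroid_px": (412, 230)},
    {"id": "corridor-2", "type": "corridor", "width_m": 1.50, "centroid_px": (412, 610)},
    {"id": "room-7", "type": "room", "exit_distance_m": 34.0, "centroid_px": (880, 145)},
]

violations = check_plan(elements)
report = json.dumps([asdict(v) for v in violations], indent=2)
print(report)
```

Keeping the limits in a plain mapping, separate from the checking logic, would let the same checks run against different regulatory regimes by swapping the rule set.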

Before & After

From manual to machine-assisted

Before Arcus AI
  • Manual measurement of every corridor, doorway, and room
  • Hours per floorplan for a single compliance review
  • Inconsistent results depending on the reviewer
  • Feedback delays slowing down project timelines
  • Errors caught late, often during construction
  • No structured data output for downstream systems

After Arcus AI
  • Automated detection of 40+ element types in seconds
  • Sub-8-second analysis time per floorplan
  • Consistent, repeatable results on every review
  • Instant feedback with violations localised on the plan
  • Issues caught at design stage, before construction begins
  • Structured JSON output feeding BIM and project management tools

Technology

The stack behind the intelligence

Each tool was chosen for a specific role in the pipeline. No bloat, no unnecessary abstraction - just the right model for each job.

YOLOv8: Object Detection

Real-time detection of 40+ architectural element classes. Optimised for floorplan line art with augmentation strategies tuned for technical drawings.

SAM: Instance Segmentation

Pixel-precise boundary extraction for rooms and structural elements. Prompt-based segmentation allows operators to refine results interactively when edge cases arise.

CLIP: Zero-Shot Classification

Natural language-driven classification eliminates the need for labelled training data when new room types or annotations appear. Handles multilingual plans without retraining.

OpenCV: Image Processing

Pre-processing pipeline for document normalisation, adaptive thresholding, contour detection, and geometric measurement. The backbone that ensures consistent input quality across scan types.

PyTorch: Training Infrastructure

Custom training loops with mixed-precision training and distributed data parallel for efficient fine-tuning. Model versioning and A/B evaluation pipelines for continuous improvement.

FastAPI: API Layer

Async Python API serving model predictions with sub-second latency. Batch processing endpoints for bulk plan analysis and webhook-based notifications for long-running jobs.

95%

Detection accuracy across element types

< 8s

Average analysis time per floorplan

40+

Architectural element classes recognised

73%

Reduction in manual review time

Results

Proof that AI, when trained and tuned properly, can transform an industry

"We went from spending half a day on a single compliance review to getting results in seconds. The system catches things that even experienced reviewers miss, and the structured output feeds directly into our BIM workflows."

Arcus Engineering Team

Approach

What made this work

Specialist models, not one-size-fits-all

Rather than forcing a single model to handle detection, segmentation, and classification, we used three purpose-built models and orchestrated them into a pipeline. Each model does what it does best.

Domain-specific training data

Off-the-shelf models struggle with technical drawings. We built custom training datasets from real architectural plans, annotated by people who understand building design - not generic crowdsourced labelling.

Human-in-the-loop refinement

SAM's prompt-based interface means operators can correct edge cases interactively. Those corrections feed back into the training pipeline, so the system improves with every plan it processes.

Outcome

Safer, smarter design decisions

Arcus now processes hundreds of floorplans per week with consistent accuracy. Compliance issues that used to surface during construction are caught at the design stage, saving time, cost, and risk.

The structured output integrates with BIM platforms and project management tools, giving architects and safety teams a shared, data-driven view of every project.

Most importantly, the system keeps learning. Every plan it processes, every correction an operator makes, feeds back into the training pipeline. The models get better with use, not worse.
