How to 3D Scan a Heritage Building: Our Field-to-Web Workflow
How to 3D scan a heritage building: a step-by-step workflow covering site survey, lighting, capture pattern, scale references, hard surfaces, processing and web delivery.
To 3D scan a heritage building, you survey the site first, control your light, shoot a dense overlapping capture pattern with physical scale references in frame, treat reflective and dark surfaces as special cases, then process into a textured mesh or Gaussian splat and optimise it hard for the web. The capture is the easy part. Everything before and after is where projects succeed or quietly fail.
We have done this on real sites — the Convento do Beato in Lisbon and the Castelo de Castro Marim in the Algarve — and the lessons below are the ones we paid for with re-shoots and reprocessing, not the ones you read in a marketing brochure.
How to 3D scan a heritage building, end to end
The short version, in the order you actually do it:
- Site survey — walk it, photograph it, find the problems before you commit a capture day.
- Lighting — get it flat and consistent, or schedule around the sun.
- Capture pattern — high overlap, multiple elevations, no gaps.
- Scale references — physical markers so the model is metrically true.
- Hard surfaces — a deliberate plan for glass, gold, stone and shadow.
- Processing — alignment, mesh or splat, texture bake, cleanup.
- Web delivery — decimate, compress, stream, and make it load on a phone.
Two terms we use throughout. Photogrammetry reconstructs 3D geometry from many overlapping 2D photographs by matching features across frames. A Gaussian splat is a newer radiance-field representation that stores the scene as millions of coloured, view-dependent ellipsoids rather than a triangle mesh — it captures reflective and intricate detail beautifully but is heavier and harder to edit. We use both, and we choose per surface, not per project.
1. Site survey: find the failures before capture day
The most expensive mistake in heritage capture is discovering a problem on a tripod, on the clock, with a client watching. The survey exists to move every surprise earlier and cheaper.
We walk the entire structure with a phone and a notebook and record four things: scale (do we need a drone, a pole, scaffolding?), light (where does direct sun fall and at what hour?), access (locked rooms, fragile floors, visitor schedules), and problem surfaces (glass, gilding, polished stone, deep shadow). At the Convento do Beato — a former monastery with enormous interior volumes — the survey told us a single handheld pass would never resolve the upper vaults, so we planned elevated capture from the start instead of improvising it badly later.
Heritage adds a constraint software cannot fix: you usually cannot touch anything. No markers screwed into walls, no lights clamped to carvings, sometimes no tripod legs on a protected floor. The survey is where you negotiate what is allowed with the site custodian and design the capture around those rules.
2. Lighting: flat, consistent, or scheduled
Photogrammetry hates two things: moving shadows and baked-in highlights. If the lighting changes between photos, feature matching degrades and your texture ends up with shadows fused into the stone that no amount of grading will remove.
Indoors, we aim for flat, even light and shoot with white balance and exposure set manually and held fixed, never on auto. Outdoors on a castle like Castro Marim, the ramparts are fully exposed to a hard Algarve sun, so we treat the sun as a scheduling problem, not a lighting one. Overcast days are a gift; a bright midday sun is the enemy. When the sky is clear we shoot in the softer, lower-contrast light of early morning or late afternoon, and we avoid capturing a surface while the shadow line is crossing it.
Practical rules we hold to:
- Shoot RAW, fixed manual exposure, fixed white balance, low ISO.
- Lock focus per zone — autofocus hunting between frames wrecks alignment.
- If you must add light, light the whole space evenly; never spotlight one wall.
- Capture a colour and grey card once per lighting condition for honest texture colour.
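One way to enforce the fixed-settings rule is to check the frames of a zone before processing. The sketch below is a minimal illustration, not a description of any particular tool: it assumes each frame's settings have already been read from its metadata into a plain dictionary, and the key names (`exposure_time`, `f_number`, `iso`, `white_balance`) and file names are hypothetical.

```python
from collections import Counter

# Settings that should stay constant within one lighting condition.
# The key names are assumptions for this sketch, not a real metadata schema.
LOCKED_KEYS = ("exposure_time", "f_number", "iso", "white_balance")


def find_inconsistent_frames(frames):
    """Return names of frames whose locked settings differ from the most common combination."""
    if not frames:
        return []
    combos = Counter(tuple(f[k] for k in LOCKED_KEYS) for f in frames)
    reference = combos.most_common(1)[0][0]
    return [f["name"] for f in frames
            if tuple(f[k] for k in LOCKED_KEYS) != reference]


# Hypothetical frames from one zone: the third was shot at a different shutter speed.
frames = [
    {"name": "IMG_0001", "exposure_time": "1/60", "f_number": 8.0, "iso": 100, "white_balance": 5600},
    {"name": "IMG_0002", "exposure_time": "1/60", "f_number": 8.0, "iso": 100, "white_balance": 5600},
    {"name": "IMG_0003", "exposure_time": "1/125", "f_number": 8.0, "iso": 100, "white_balance": 5600},
]

print(find_inconsistent_frames(frames))  # ['IMG_0003']
```

Frames flagged this way are candidates for a re-shoot or for exclusion, depending on how much coverage they carry.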
3. Capture pattern: overlap, elevations, no gaps
This is the part everyone thinks is the whole job. The rule that matters most: every point on the surface must appear in enough photos, from enough angles, with enough overlap.
We aim for roughly 70–80% overlap between adjacent frames, and we shoot each region from at least three elevations — low, eye-level, high — so geometry is triangulated rather than guessed. For a façade we move in a grid: orbit horizontally at one height, step up, orbit again. For interiors we treat each room as its own capture and deliberately overlap the doorways so the rooms align into one coherent model during processing.
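To turn an overlap target into a walking distance, a simple pinhole-camera estimate is usually enough for planning. The sketch below assumes the camera faces a flat wall squarely; the function names and the example numbers (a 36 mm wide sensor, a 24 mm lens, 5 m from a 30 m façade) are illustrative assumptions, not measurements from either site.

```python
import math


def shot_spacing(distance_m, focal_length_mm, sensor_width_mm, overlap):
    """Sideways step between adjacent frames for a target overlap (pinhole model)."""
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1)")
    # Width of wall covered by one frame at this distance.
    footprint_m = distance_m * sensor_width_mm / focal_length_mm
    return footprint_m * (1 - overlap)


def frames_per_row(wall_length_m, distance_m, focal_length_mm, sensor_width_mm, overlap):
    """Number of frames needed to cover one horizontal pass along a wall."""
    footprint_m = distance_m * sensor_width_mm / focal_length_mm
    step_m = shot_spacing(distance_m, focal_length_mm, sensor_width_mm, overlap)
    remaining_m = max(0.0, wall_length_m - footprint_m)
    return math.ceil(remaining_m / step_m) + 1


step = shot_spacing(distance_m=5, focal_length_mm=24, sensor_width_mm=36, overlap=0.75)
count = frames_per_row(30, 5, 24, 36, 0.75)
print(f"step {step:.3f} m, {count} frames per pass")  # step 1.875 m, 13 frames per pass
```

Multiply the per-pass count by the number of elevations to get a rough frame budget for the façade; recesses and undercuts add to that figure.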
Corners, recesses and undercuts are where gaps hide. Carved capitals, vaulting ribs, machicolations on the castle walls — these need extra frames taken specifically to see into the geometry, not just across it. We keep a mental checklist: convex surfaces are easy, concave surfaces need attention, and anything you can stick a finger behind needs three times the coverage.
For broad context we combine ground photogrammetry with elevated and aerial passes. The handheld camera resolves detail; the higher passes give the roofs, the tower tops and the overall massing that you physically cannot reach from the ground.
4. Scale references: make the model metrically true
A photogrammetry reconstruction is scaleless by default — the software knows the shape but not the size. Without a reference, your castle could be a dollhouse. Scale references turn a pretty model into a measurable record.
The clean solution is coded targets or a scale bar of known length placed in the scene. On heritage sites where we cannot place anything intrusive, we fall back to laying a calibrated bar or a measured baton on the floor, photographing it in several frames, and removing it in cleanup. We also take a handful of real tape measurements between fixed, permanent points (a doorway width, a step rise) and verify the finished model against them. If the model says the doorway is 1.04 m and the tape says 0.98 m, the model is about 6% too large (1.04 / 0.98 ≈ 1.06), so we rescale the whole reconstruction by 0.98 / 1.04 ≈ 0.94 before anything else.
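The rescale arithmetic can be sketched in a few lines. With several tape measurements, one reasonable choice (an assumption here, not a prescribed method) is the least-squares scale that best maps model lengths onto tape lengths; with a single pair it reduces to tape divided by model. The function names are hypothetical.

```python
def scale_factor(pairs):
    """Least-squares scale s minimising sum((s * model - tape) ** 2).

    pairs: iterable of (model_length_m, tape_length_m).
    """
    pairs = list(pairs)
    numerator = sum(model * tape for model, tape in pairs)
    denominator = sum(model * model for model, _ in pairs)
    if denominator == 0:
        raise ValueError("need at least one non-zero model length")
    return numerator / denominator


def percent_error(model_length_m, tape_length_m):
    """How far the model length is from the tape length, in percent of the tape."""
    return (model_length_m / tape_length_m - 1) * 100


# The doorway example from the text: model says 1.04 m, tape says 0.98 m.
s = scale_factor([(1.04, 0.98)])
print(round(s, 4))                          # 0.9423
print(round(percent_error(1.04, 0.98), 1))  # 6.1
```

After applying the factor, re-measuring the other reference distances in the model shows whether the error was a uniform scale problem or something worse, such as a distorted alignment.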
5. Reflective and dark surfaces: the special cases
Photogrammetry assumes surfaces look the same from every angle. Anything shiny, transparent, or featureless breaks that assumption. Heritage buildings are full of all three: leaded glass, gilded altarpieces, polished marble, and the deep matte black of unlit stone.
Our playbook by surface type:
- Glass and mirrors — photogrammetry will fail outright; the reflection moves as you move. We either mask it out and model clear glass as a clean plane in post, or we capture that element as a Gaussian splat, which represents view-dependent appearance natively and keeps the reflections believable.
- Gold and polished metal — shoot under the flattest light possible, cross-polarise if we can, and accept that the most specular highlights may need hand-correction in the texture.
- Dark stone and shadow — these starve the algorithm of features. We lift exposure for those passes, bracket, and add diffuse fill so the matcher has texture to grab. Underexposed black surfaces are the single most common reason an interior fails to align.
- Repetitive patterns — identical tiles or balusters confuse feature matching because every frame looks the same. We rely on the wider geometric context and overlap to disambiguate them.
The honest tradeoff: a mesh is editable, lightweight and easy to put on the web; a splat is gorgeous on reflective detail but heavier and harder to clean. On reflective-heavy interiors we increasingly capture both and composite — splat where the eye lingers on shine, mesh everywhere else.
6. Processing: alignment, mesh or splat, texture, cleanup
Processing is where a good capture is honoured or wasted. The pipeline, in order:
- Alignment — the software matches features across all photos and solves camera positions. If photos do not align, you have a capture problem, not a software problem; we go back to the frames rather than fight the solver.
- Scale — apply the scale bar or known measurements before anything downstream.
- Dense reconstruction — build the point cloud, then a triangle mesh (for the photogrammetry path) or train the splat (for the radiance-field path).
- Texture bake — project the photos back onto the mesh to produce the colour, ideally with shadows removed so lighting reads as neutral.
- Cleanup — delete floating artefacts, fill small holes, remove the scale bar and any people or equipment, and retopologise the worst density spikes.
We process in stages and review at each one. A two-hour alignment that produces garbage is cheaper to catch than a twelve-hour mesh build on bad data. Keep the raw photos forever — reprocessing with a better algorithm in two years is free; re-shooting a monastery is not.
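The stage-and-review habit can be expressed as a small gate runner. This is a minimal sketch with stub stages standing in for the real processing steps; the stage names, the 95% alignment threshold, and the result dictionaries are all hypothetical.

```python
def run_with_gates(stages):
    """Run (name, run, review) stages in order and stop at the first failed review.

    Returns (completed_stage_names, failed_stage_name_or_None).
    """
    completed = []
    for name, run, review in stages:
        result = run()
        if not review(result):
            return completed, name
        completed.append(name)
    return completed, None


calls = []  # records which stub stages actually ran


def fake_alignment():
    calls.append("alignment")
    # Stub result: only 300 of 430 photos aligned.
    return {"aligned": 300, "total": 430}


def fake_mesh():
    calls.append("mesh")
    return {"triangles": 40_000_000}


stages = [
    # Assumed gate: at least 95% of photos must align before meshing.
    ("alignment", fake_alignment, lambda r: r["aligned"] / r["total"] >= 0.95),
    ("mesh", fake_mesh, lambda r: r["triangles"] > 0),
]

completed, failed = run_with_gates(stages)
print(completed, failed, calls)  # [] alignment ['alignment']
```

The expensive mesh stage never runs because the alignment review failed first, which is the point of reviewing between stages.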
7. Web delivery: make it load on a phone
A 40-million-triangle mesh is a beautiful archival asset and a terrible web experience. Web delivery is an exercise in throwing away detail intelligently. Our defaults:
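- Decimate the mesh to a budget, typically a few hundred thousand triangles for a hero model, and bake the lost detail into a normal map so fine surface relief still reads under lighting. A normal map does not change the silhouette, so spend the remaining triangles on outlines and edges.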
- Compress geometry with Draco and ship as glTF/GLB; compress textures with KTX2/Basis so they are transcoded to a GPU-native compressed format and stay compressed in video memory instead of expanding to full-size bitmaps.
- Stream with levels of detail, so distant geometry loads coarse and refines as the visitor approaches.
- Render with Three.js / WebGL inside a server-rendered page, with no heavy framework, consistent with how we build the rest of the site.
- Test on a mid-range phone on cellular, not your workstation. If it does not load in a few seconds and hold a usable frame rate, it is not finished.
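The budgeting behind these defaults is simple arithmetic, sketched below. The function names and all example figures (a 300,000-triangle target, a 10 Mbit/s cellular link, a 5-second limit) are assumptions for illustration; the estimate ignores latency, parsing, and GPU upload, so a real device test is still the final word.

```python
def decimation_ratio(source_triangles, target_triangles):
    """Fraction of triangles to keep to hit the target budget (never above 1.0)."""
    if source_triangles <= 0 or target_triangles <= 0:
        raise ValueError("triangle counts must be positive")
    return min(1.0, target_triangles / source_triangles)


def download_seconds(file_size_mb, bandwidth_mbps):
    """Transfer time for a file of file_size_mb megabytes over bandwidth_mbps megabits/s."""
    if bandwidth_mbps <= 0:
        raise ValueError("bandwidth must be positive")
    return file_size_mb * 8 / bandwidth_mbps


def within_load_budget(file_size_mb, bandwidth_mbps, max_seconds):
    """True if the raw transfer alone fits inside the load-time budget."""
    return download_seconds(file_size_mb, bandwidth_mbps) <= max_seconds


ratio = decimation_ratio(40_000_000, 300_000)
print(f"keep {ratio:.2%} of triangles")   # keep 0.75% of triangles
print(within_load_budget(12, 10, 5))      # False: about 9.6 s on a 10 Mbit/s link
print(within_load_budget(4, 10, 5))       # True: about 3.2 s
```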
Gaussian splats follow the same logic with different tools: prune low-contribution splats, quantise, and stream by importance. The principle is identical — the model that ships is the one that loads.
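Pruning and quantising can be sketched on toy data. The contribution measure used here (opacity multiplied by the product of the three scale values) is a stand-in chosen for illustration; real splat tools may rank by other criteria such as rendered contribution. The dictionary layout and function names are hypothetical.

```python
import math


def contribution(splat):
    """Assumed proxy for visual contribution: opacity times a volume-like term."""
    sx, sy, sz = splat["scale"]
    return splat["opacity"] * sx * sy * sz


def prune_splats(splats, keep_fraction):
    """Keep the highest-contribution fraction of splats."""
    if not 0 < keep_fraction <= 1:
        raise ValueError("keep_fraction must be in (0, 1]")
    ranked = sorted(splats, key=contribution, reverse=True)
    keep = max(1, math.floor(len(ranked) * keep_fraction + 0.5))
    return ranked[:keep]


def quantise(value, lo, hi, bits=16):
    """Map a float in [lo, hi] to an unsigned integer with the given bit depth."""
    levels = (1 << bits) - 1
    clamped = min(max(value, lo), hi)
    return math.floor((clamped - lo) / (hi - lo) * levels + 0.5)


splats = [
    {"id": "a", "opacity": 0.9, "scale": (0.2, 0.2, 0.2)},
    {"id": "b", "opacity": 0.05, "scale": (0.1, 0.1, 0.1)},
    {"id": "c", "opacity": 0.8, "scale": (0.3, 0.1, 0.1)},
    {"id": "d", "opacity": 0.1, "scale": (0.05, 0.05, 0.05)},
]

kept = prune_splats(splats, keep_fraction=0.5)
print([s["id"] for s in kept])        # ['a', 'c']
print(quantise(10.0, -10.0, 10.0))    # 65535
```

Streaming by importance then amounts to sending the splats in the same ranked order, so the first bytes to arrive carry the most visible parts of the scene.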
The short version
Survey to find failures early. Control light or schedule around it. Shoot high-overlap, multi-elevation patterns with no gaps. Put a scale reference in frame. Plan glass, gold, dark stone and reflections as special cases — splat where mesh fails. Process in reviewable stages and keep every raw photo. Then decimate, compress and stream so it loads on a phone.
Heritage capture rewards patience and punishes shortcuts. The Convento do Beato and Castelo de Castro Marim taught us that the best capture day is a boring one — because every interesting problem was already solved on the survey. If you are weighing a 3D capture of a building that matters, that is the standard worth holding to.