Teaching AI to Read Property Boundaries

Building computational geometry systems that parse historical property descriptions

Andrei Vasilcoi

January 1, 2026

Property legal descriptions are a domain where historical surveying conventions meet modern computational geometry. Building an automated system for extracting and visualising property boundaries from US legal documents presents a unique set of technical challenges that span natural language processing, spatial computing, and geometric reasoning.

The Problem: Multiple Property Description Systems

The US uses various property description systems that evolved over time, reflecting different regional and historical surveying practices. We focused our initial work on the two most prevalent systems, which together account for the majority of legal descriptions in the country:

Public Land Survey System (PLSS): A grid-based system introduced in the 1780s that describes properties using townships, ranges, and section subdivisions. For example: “The NE/4 of the NW/4 of Section 12, Township 3 North, Range 2 East.” While conceptually systematic, PLSS presents computational challenges—sections aren’t geometrically perfect squares, principal meridians vary by region, and historical surveys have inherent precision limitations.

PLSS coordinate system showing townships, ranges, and sections
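To make the subdivision notation concrete, here is a minimal sketch that resolves a chain of quarter designations into a bounding box. It assumes an idealised unit-square section, which real sections are not (as noted above), and the helper name aliquot_bounds is hypothetical rather than part of our system.

```python
import re

# Offset of each quarter inside its parent square, as a fraction of the
# parent's side length. x grows east, y grows north.
QUARTER_OFFSETS = {
    "NE": (0.5, 0.5), "NW": (0.0, 0.5),
    "SE": (0.5, 0.0), "SW": (0.0, 0.0),
}

def aliquot_bounds(description):
    """Return (min_x, min_y, max_x, max_y) inside an idealised unit-square section."""
    quarters = re.findall(r"\b(NE|NW|SE|SW)/4\b", description.upper())
    if not quarters:
        raise ValueError("no quarter designations found")
    x, y, size = 0.0, 0.0, 1.0
    # The text names the smallest parcel first, so subdivide in reverse order.
    for quarter in reversed(quarters):
        dx, dy = QUARTER_OFFSETS[quarter]
        x += dx * size
        y += dy * size
        size /= 2
    return (x, y, x + size, y + size)

bounds = aliquot_bounds(
    "The NE/4 of the NW/4 of Section 12, Township 3 North, Range 2 East"
)
print(bounds)  # (0.25, 0.75, 0.5, 1.0)
```

In practice the fractional box would be mapped onto the surveyed corners of the actual section rather than a perfect square.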

Metes and Bounds: A narrative description system that traces property boundaries using bearings and distances: “Beginning at the oak tree, thence North 45° 30’ East for 200 feet, thence following a curve to the right with a radius of 50 feet…” This approach requires converting textual survey instructions into precise geometric coordinates.

Metes and bounds description with curves and bearings
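The core of that conversion is simple trigonometry. The sketch below assumes quadrant bearings have already been parsed into their parts, and that coordinates are planar with x pointing east and y pointing north; the function names are illustrative only.

```python
import math

def azimuth_deg(ns, degrees, minutes, ew):
    """Convert a quadrant bearing such as N 45 deg 30 min E to a clockwise azimuth from north."""
    angle = degrees + minutes / 60.0
    if ns == "N":
        azimuth = angle if ew == "E" else 360.0 - angle
    else:
        azimuth = 180.0 - angle if ew == "E" else 180.0 + angle
    return azimuth % 360.0

def traverse(start, calls):
    """Walk a list of (ns, degrees, minutes, ew, distance) calls from a start point."""
    x, y = start
    points = [(x, y)]
    for ns, degrees, minutes, ew, distance in calls:
        az = math.radians(azimuth_deg(ns, degrees, minutes, ew))
        x += distance * math.sin(az)  # east component
        y += distance * math.cos(az)  # north component
        points.append((x, y))
    return points

# "thence North 45 deg 30 min East for 200 feet"
path = traverse((0.0, 0.0), [("N", 45, 30, "E", 200.0)])
```

Distances stay in whatever unit the description uses (feet here); unit conversion is a separate step.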

Real-world legal descriptions often mix both systems. A single description might read: “Beginning at the SW corner of the NE/4 of Section 5 (PLSS reference), thence N 22° 15’ E for 312.4 feet (metes and bounds), EXCEPT that portion described in Deed Book 42, Page 17…” Resolving it becomes a geometric puzzle involving unions, differences, and spatial reasoning.
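One way to picture that puzzle is as a small expression tree of set operations. The sketch below is an illustration only: it uses sets of grid cells as a stand-in for real polygons, and the component names and the nested-tuple plan format are assumptions made for the example. A production system would apply the same tree to polygon geometries through a geometry library.

```python
import operator

SET_OPS = {
    "union": operator.or_,
    "difference": operator.sub,
    "intersection": operator.and_,
}

def rect_cells(x0, y0, x1, y1):
    """Stand-in geometry: the set of unit grid cells covering a rectangle."""
    return frozenset((x, y) for x in range(x0, x1) for y in range(y0, y1))

def evaluate(plan, geometries):
    """Evaluate a plan: either a component name or (op, operand, operand, ...)."""
    if isinstance(plan, str):
        return geometries[plan]
    op_name, *operands = plan
    parts = [evaluate(operand, geometries) for operand in operands]
    result = parts[0]
    for part in parts[1:]:
        result = SET_OPS[op_name](result, part)
    return result

geometries = {
    "quarter_section": rect_cells(0, 0, 8, 8),   # 64 cells
    "adjoining_strip": rect_cells(8, 0, 10, 8),  # 16 cells
    "deed_exception": rect_cells(6, 0, 9, 2),    # 6 cells, all inside the union
}
plan = ("difference", ("union", "quarter_section", "adjoining_strip"), "deed_exception")
parcel = evaluate(plan, geometries)
print(len(parcel))  # 74
```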

The Curve Challenge: Beyond Simple Circular Arcs

One of the most technically demanding aspects of this work is curve geometry. Metes and bounds descriptions frequently include curves—property boundaries that follow roads, rivers, or natural terrain features. However, surveyors describe these curves using various geometric specifications, far beyond simple circular arcs.

Survey documents contain non-tangent curves (curves that don’t smoothly connect to the previous segment), compound curves (multiple connected arcs with different radii), and reverse curves (back-to-back curves turning opposite directions). A curve might be specified using:

Radius and arc length
Radius and chord length
Delta angle and chord direction
Tangent directions with radial bearings
Various combinations of the above parameters
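For circular arcs, these parameters are tied together by standard identities: arc length equals radius times the central (delta) angle in radians, and chord length equals twice the radius times the sine of half the delta angle. A minimal sketch of reducing a few of the combinations to radius plus delta follows; the Arc type and normalise_arc function are hypothetical names, and only three input combinations are covered.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Arc:
    radius: float
    delta: float  # central angle in radians

def normalise_arc(radius=None, arc_length=None, chord=None, delta_deg=None):
    """Reduce a curve specification to (radius, delta)."""
    if radius is not None and arc_length is not None:
        return Arc(radius, arc_length / radius)
    if radius is not None and chord is not None:
        if chord > 2 * radius:
            raise ValueError("chord cannot exceed the diameter")
        # Assumes the minor arc (delta <= 180 degrees); a chord alone
        # cannot distinguish it from the major arc.
        return Arc(radius, 2 * math.asin(chord / (2 * radius)))
    if delta_deg is not None and chord is not None:
        delta = math.radians(delta_deg)
        return Arc(chord / (2 * math.sin(delta / 2)), delta)
    raise ValueError("unsupported combination of curve parameters")

quarter_turn = normalise_arc(radius=50.0, arc_length=50.0 * math.pi / 2)
```

The direction of the curve (left or right, tangent or non-tangent) is separate information that a full representation would carry alongside these two values.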

This variability, compounded by terrain and the limits of historical surveying instruments, requires a robust curve approximation system that can:

Normalise multiple input formats to a standard representation
Approximate complex curves with straight line segments for rendering and geometric operations
Maintain accuracy through careful segment density (we use a 4° maximum segment angle to balance visual quality with computational efficiency)
Support advanced curve specifications including non-tangent connections

The solution involves a normalisation layer that converts all curve specifications to a standard format, followed by approximation with calculated intermediate points. Accuracy here is critical—errors manifest as gaps or distortions in property boundaries that could have legal significance.
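As a rough sketch of the approximation step, the function below samples a tangent circular arc so that no segment spans more than 4° of the curve. It assumes planar coordinates, a heading given as a clockwise azimuth from north, and a curve that begins tangent to that heading; non-tangent curves would supply a radial bearing instead. The function name and signature are illustrative.

```python
import math

def approximate_arc(start, heading_deg, radius, delta_deg,
                    turn="right", max_step_deg=4.0):
    """Return points along a tangent arc, including both endpoints."""
    sign = 1.0 if turn == "right" else -1.0
    heading = math.radians(heading_deg)

    # The centre lies perpendicular to the heading, on the side of the turn.
    to_centre = heading + sign * math.pi / 2
    cx = start[0] + radius * math.sin(to_centre)
    cy = start[1] + radius * math.cos(to_centre)

    # Enough segments that each one covers at most max_step_deg of arc.
    steps = max(1, math.ceil(abs(delta_deg) / max_step_deg))

    # Azimuth of the radial line from the centre back to the start point.
    radial_start = heading - sign * math.pi / 2
    points = []
    for i in range(steps + 1):
        radial = radial_start + sign * math.radians(delta_deg) * i / steps
        points.append((cx + radius * math.sin(radial),
                       cy + radius * math.cos(radial)))
    return points

# Heading north, curving right through 90 degrees on a 50-foot radius.
arc = approximate_arc((0.0, 0.0), 0.0, 50.0, 90.0)
```

With a 90° delta this yields 23 segments (24 points), each spanning a little under 4°.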

Architecture: Graph-Based Orchestration with AI Agents

Legal descriptions require structured geometric output with high accuracy. A pure LLM approach, while promising, introduces reliability challenges—models can misinterpret bearings, drop precision, or hallucinate geometric relationships. Given the legal significance of property boundaries, we needed a more robust architecture.

Our solution uses deterministic graph orchestration with specialised AI agents—combining the flexibility of language models with structured computational geometry.

The Processing Pipeline

1. Decomposition: The system analyses the legal description text and decomposes it into processable components—PLSS references, metes and bounds sections, and exceptions. Importantly, this stage also generates an aggregation plan specifying how to combine results: union certain components, subtract exceptions, and produce a final boundary.

2. Parallel Component Processing: Each component is processed by a specialised agent that runs in isolation with its own conversation state. PLSS agents query our property database and apply geometric transformations. Metes and bounds agents convert bearing-and-distance instructions into coordinates, handling both relative and absolute positioning references.

3. Geometric Aggregation: The system executes the geometric set operations specified in the aggregation plan—unions, differences, and intersections. These operations handle real-world geometric complexities including irregular boundaries and topological edge cases.

4. Validation: Automated validation checks flag geometries that exceed reasonable bounds, contain topological errors, or show other indicators of extraction issues.
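To illustrate the kind of checks the validation step might run, here is a minimal sketch covering closure, area bounds, and self-intersection. The thresholds, messages, and function names are hypothetical, and the input is assumed to be a planar list of points whose last point should return to the first.

```python
import math

def shoelace_area(ring):
    """Unsigned polygon area from the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def _orient(a, b, c):
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def _segments_cross(p1, p2, p3, p4):
    """True when the two segments properly cross (touching endpoints do not count)."""
    return (_orient(p3, p4, p1) * _orient(p3, p4, p2) < 0
            and _orient(p1, p2, p3) * _orient(p1, p2, p4) < 0)

def validate_boundary(points, max_gap=0.5, max_area=1e9):
    """Return a list of human-readable issues; an empty list means no flags."""
    if len(points) < 4:
        return ["fewer than three distinct vertices"]
    issues = []
    gap = math.dist(points[0], points[-1])
    if gap > max_gap:
        issues.append(f"closure gap of {gap:.2f} exceeds {max_gap}")
    ring = list(points[:-1])
    area = shoelace_area(ring)
    if area == 0:
        issues.append("zero area")
    elif area > max_area:
        issues.append(f"area {area:.0f} exceeds {max_area:.0f}")
    n = len(ring)
    edges = [(ring[i], ring[(i + 1) % n]) for i in range(n)]
    # Compare every pair of non-adjacent edges.
    if any(_segments_cross(*edges[i], *edges[j])
           for i in range(n) for j in range(i + 2, n)
           if not (i == 0 and j == n - 1)):
        issues.append("self-intersection")
    return issues

square = [(0, 0), (100, 0), (100, 100), (0, 100), (0, 0)]
print(validate_boundary(square))  # []
```

Returning a list of issues rather than raising on the first one lets a reviewer see every flag for a boundary at once.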

Graceful Degradation

A key architectural decision was prioritising partial results over all-or-nothing extraction. If a legal description contains three components and one fails to process—perhaps due to an ambiguous reference or OCR error in the source document—the system returns the successfully extracted components with clear error reporting for the failed portion.

This approach required careful consideration of failure modes and state management. Each component agent maintains isolated execution context, enabling robust error handling and detailed diagnostic information when issues occur.
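A minimal sketch of the pattern, assuming each component can be handled by an independent callable: every component runs in its own worker, exceptions are captured per component, and the caller receives both the successes and the failures. The ComponentResult type and the toy processor are hypothetical stand-ins for the real agents, which carry conversation state that this example does not model.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Optional

@dataclass
class ComponentResult:
    component_id: str
    geometry: Optional[list] = None
    error: Optional[str] = None

def run_component(component_id, text, processor):
    """Run one component and convert any exception into an error record."""
    try:
        return ComponentResult(component_id, geometry=processor(text))
    except Exception as exc:
        return ComponentResult(component_id, error=f"{type(exc).__name__}: {exc}")

def process_all(components, processor):
    """Process all components in parallel; results keep the input order."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(run_component, cid, text, processor)
                   for cid, text in components.items()]
        return [future.result() for future in futures]

def toy_processor(text):
    # Stand-in for an agent: fails on text it cannot interpret.
    if "illegible" in text:
        raise ValueError("ambiguous reference in source text")
    return [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]

results = process_all(
    {"c1": "NE/4 of Section 5", "c2": "thence [illegible]", "c3": "EXCEPT the north strip"},
    toy_processor,
)
succeeded = [r.component_id for r in results if r.error is None]
failed = {r.component_id: r.error for r in results if r.error is not None}
```

Here c1 and c3 come back with geometry while c2 carries an error message, so downstream steps can decide whether a partial boundary is still useful.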

Coordinate Transformation and Spatial Positioning

Metes and bounds descriptions work in local coordinate systems—they’re instructions for walking around a property boundary, typically starting from an origin point. However, to display these boundaries on modern web mapping interfaces, we need to transform them to standard geographic projections and position them accurately on the Earth’s surface.

The transformation process involves creating a local coordinate reference system centred at the property location, transforming the metre-based coordinates from the survey instructions, and then converting to the target projection. Projection errors can cause boundaries to appear at incorrect angles or with geometric distortions, so precision in this step is essential.
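For intuition, the sketch below places local east/north offsets in metres around an origin using a spherical, small-area approximation. This is a simplification chosen for the example, not the projection pipeline described above: a production system would go through a proper projection library and datum, and the function name and constant are assumptions.

```python
import math

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius; spherical approximation

def local_to_lonlat(points_m, origin_lon, origin_lat):
    """Convert (east, north) offsets in metres to (lon, lat) in degrees.

    Reasonable only for parcels small enough that Earth curvature and
    ellipsoid flattening can be ignored.
    """
    cos_lat0 = math.cos(math.radians(origin_lat))
    converted = []
    for east, north in points_m:
        lat = origin_lat + math.degrees(north / EARTH_RADIUS_M)
        lon = origin_lon + math.degrees(east / (EARTH_RADIUS_M * cos_lat0))
        converted.append((lon, lat))
    return converted

# A 100 m square with its south-west corner at the origin point.
corners = local_to_lonlat(
    [(0.0, 0.0), (100.0, 0.0), (100.0, 100.0), (0.0, 100.0)],
    origin_lon=-97.0, origin_lat=35.0,
)
```

Descriptions written in feet would first need converting to metres before this step.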

An additional complexity is absolute positioning. Some descriptions include reference points with known geographic locations—PLSS corner references or intersections of named roads. These anchor points provide both position and bearing corrections, allowing the system to place the boundary accurately and correct for magnetic declination or other systematic errors in the original survey.
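One simple way to derive such a correction, assuming two anchor points are known in both the local traverse and a common planar target frame, is to fit a rotation and translation from the anchor pair. The sketch below uses complex numbers for the 2D arithmetic; the function names are hypothetical, and scale is deliberately left unchanged.

```python
import cmath
import math

def fit_rigid(local_a, local_b, known_a, known_b):
    """Fit a rotation and shift mapping two local anchors onto known positions."""
    za, zb = complex(*local_a), complex(*local_b)
    wa, wb = complex(*known_a), complex(*known_b)
    if za == zb or wa == wb:
        raise ValueError("anchor points must be distinct")
    rotation = (wb - wa) / (zb - za)
    rotation /= abs(rotation)  # keep the rotation, discard any scale difference
    shift = wa - rotation * za
    return rotation, shift

def apply_fit(points, rotation, shift):
    moved = [rotation * complex(x, y) + shift for x, y in points]
    return [(z.real, z.imag) for z in moved]

# Local north (0, 100) actually points 5 degrees east of grid north.
az = math.radians(5.0)
rotation, shift = fit_rigid(
    (0.0, 0.0), (0.0, 100.0),
    (1000.0, 2000.0), (1000.0 + 100.0 * math.sin(az), 2000.0 + 100.0 * math.cos(az)),
)
# Counter-clockwise angle is positive, so a clockwise correction is negative.
correction_deg = math.degrees(cmath.phase(rotation))  # about -5.0
placed = apply_fit([(0.0, 0.0), (100.0, 0.0)], rotation, shift)
```

With more than two anchors, a least-squares fit over all of them would be the natural extension.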

System Capabilities and Design Considerations

The legal description extraction system integrates several technical domains:

Language model orchestration for text decomposition and structured extraction
Graph-based workflows for managing complex multi-step processing
Parallel execution with isolated state management per component
Computational geometry including coordinate transformations and curve approximation
Geometric set operations for combining and subtracting boundary components
Validation and error reporting with partial result handling

Beyond the technical implementation, the domain itself presents unique challenges:

Limited training data: Property description formats vary significantly by jurisdiction and era, with no comprehensive standardised dataset
Precision requirements: Geometric accuracy directly impacts legal interpretation, where small errors can have significant consequences
Input variability: Historical documents span multiple eras of surveying practice, regional terminology, and varying levels of preservation quality
Subtle failure modes: Silent errors in boundary extraction are more problematic than obvious failures, requiring comprehensive validation

These constraints inform fundamental design decisions: when to rely on AI interpretation versus deterministic computation, how to handle ambiguity, and when to surface uncertainty to users rather than making assumptions.

Looking Forward

The system continues to evolve as we encounter new description formats and edge cases. Current work focuses on expanding support for additional description types and improving agent reliability for complex mixed-format descriptions.

This problem space—applying AI and computational geometry to centuries-old legal documents—offers ongoing technical challenges at the intersection of natural language processing, spatial computing, and domain expertise. The work requires careful consideration of accuracy, reliability, and practical utility in a domain where errors have real-world legal implications.

We’re hiring

If you are interested in helping us solve some very hard problems while building the world’s first AI-powered real estate lawyer, see our open roles and get in touch via our Careers Page. If nothing quite matches your experience, please connect with me, Andrei Vasilcoi, on LinkedIn and message me directly. I’d be happy to have a chat with you.