Finding duplicate records by matching name and nearby distance

I wanted to find potentially duplicate records in my data, based on having the exact same name and being geographically located within 500 meters of each other.

This worked:

with potential_duplicates as (
  select
    a.id as one,
    b.id as two,
    ST_Distance(a.point, b.point) as distance_m
  from location a, location b 
    where a.name = b.name
    and a.id > b.id
    and ST_Distance(a.point, b.point) < 500
)
select * from potential_duplicates

I'm using a CTE here because it makes it easy to further customize the output with an additional query.

A few tricks in here:

Created 2021-05-19T20:52:04-07:00 · Edit