Niall's Data Blog

A Data Engineer / Architect writing about Tech, Data and the Community

Associative Grouping using Spark - Part 3

This is part of a series of posts about associative grouping:

Part 1 - Associative Grouping using tSQL Recursive CTE’s
Part 2 - Associative Grouping using tSQL Graph

In the first two parts of this series we looked at how we could use recursive CTEs and SQL Server’s graph functionality to find overlapping groups across two columns of a table, so that they can be combined into a new super group of associated groups.
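To make the problem concrete, here is a minimal sketch in plain Python (not Spark, and not the method used in the earlier parts) of what a "super group" means. It assumes the table is a list of two-column rows and uses a simple union-find; the function name `super_groups` and the sample rows are made up for illustration.

```python
def super_groups(pairs):
    """Give every (a, b) row a super group id. Rows that share a value in
    either column, directly or through a chain of other rows, get the same id."""
    parent = {}

    def find(x):
        # Return the representative of x's set, registering x if it is new.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps the trees shallow
            x = parent[x]
        return x

    def union(x, y):
        rx, ry = find(x), find(y)
        if rx != ry:
            parent[ry] = rx

    # Tag each value with its column so that a value in the first column is
    # never confused with an identical value in the second column.
    for a, b in pairs:
        union(("a", a), ("b", b))

    ids = {}
    result = []
    for a, b in pairs:
        root = find(("a", a))
        group_id = ids.setdefault(root, len(ids) + 1)  # number groups in first-seen order
        result.append((a, b, group_id))
    return result


# Hypothetical sample data: 1 and 2 share "x", and 2 also has "y",
# so the first three rows are associated; (3, "z") stands alone.
rows = [(1, "x"), (2, "x"), (2, "y"), (3, "z")]
grouped = super_groups(rows)
print(grouped)
```

Here the first three rows end up in one super group because they are linked through the shared values `"x"` and `2`, while `(3, "z")` gets a group of its own. This is the same connected-components idea that the recursive CTE and graph approaches solve inside the database.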