Niall's Data Blog

A Data Engineer / Architect writing about Tech, Data and the Community

Associative Grouping using Spark - Part 3

This is part of a series of posts about associative grouping: Part 1 - Associative Grouping using tSQL Recursive CTEs; Part 2 - Associative Grouping using tSQL Graph. In the first two parts of this series we looked at how recursive CTEs and SQL Server's graph functionality can be used to find overlapping groups across two columns of a table, so that they can be combined into a new super group of associated groups.

Associative Grouping using tSQL (Part 2)

This is part of a series of posts about associative grouping: Part 1 - Associative Grouping using tSQL Recursive CTEs; Part 3 - Associative Grouping using Spark. In part one of this series we looked at how recursive CTEs can be used to find overlapping groups across two columns of a table, so that they can be combined into a new super group of associated groups. Since I wrote that post, SQL Server 2019 CTP 3.

Associative Grouping using tSQL (Part 1)

This is part of a series of posts about associative grouping: Part 2 - Associative Grouping using tSQL Graph; Part 3 - Associative Grouping using Spark. Recently a friend asked me to have a look at an interesting query problem he had been working on. He was trying to find overlapping groups across two columns, so that they could be combined into a new super group of associated groups. The simplest way to describe the problem is with a demo, so off we go…
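As a rough illustration of the problem itself (not the tSQL or Spark approach from the posts), the sketch below treats each row as a link between a value in the first column and a value in the second, then merges everything that is linked, directly or indirectly, into one super group using a small union-find. The function name `associate_groups`, the sample rows and the numbering of super groups by first appearance are all assumptions made for this example.

```python
def associate_groups(pairs):
    """Return (a, b, super_group_id) for each row, linking rows that share a value."""
    parent = {}

    def find(x):
        # Walk up to the representative of x, shortening the path as we go.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(x, y):
        rx, ry = find(x), find(y)
        if rx != ry:
            parent[ry] = rx

    # Tag values with their column so that a value in column A is never
    # confused with an identical value in column B.
    for a, b in pairs:
        union(("a", a), ("b", b))

    # Number the super groups in order of first appearance (an assumption).
    ids = {}
    result = []
    for a, b in pairs:
        root = find(("a", a))
        super_id = ids.setdefault(root, len(ids) + 1)
        result.append((a, b, super_id))
    return result


# Hypothetical sample data: 1 and 2 overlap through "y", 3 and 4 through "z".
rows = [(1, "x"), (1, "y"), (2, "y"), (3, "z"), (4, "z"), (5, "w")]
grouped = associate_groups(rows)
for row in grouped:
    print(row)
```

In this sample the first three rows end up in one super group because 1 and 2 both pair with "y", the next two share "z", and the last row stands alone.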