Today is a pretty big day for advanced analytics on the Azure cloud, marked by a virtual event titled "Shape Your Future with Azure Data and Analytics[1]," which will feature Microsoft CEO Satya Nadella. During the event, the company will announce general availability (GA) of the latest version of its flagship cloud analytics service, Azure Synapse Analytics[2]. It will also announce the public preview of a new companion data governance service called Azure Purview.
I've reported on Azure Synapse before. It is both an evolution of the former Azure SQL Data Warehouse and a complete revamp of that service to include significant Apache Spark[3]-based data lake functionality. Synapse also sports integration with Azure Data Factory[4] for data prep/data engineering, Power BI[5] for business intelligence, Azure Machine Learning[6] for AI, Cosmos DB[7] and Azure Data Share[8]. Until today, the data lake features and these integrations were in public preview; starting today, they're GA.
Got governance?
But while the Synapse GA is significant, it makes the need for a solid first-party data governance solution on the Azure cloud even more acute. Yes, there was some semblance of this in Azure Data Catalog[9] (ADC), but that service was more focused on metadata management than true data governance. While ADC could inventory, search and tag data sources, data sets and the columns/fields within them, it lacked important data classification and other governance capabilities, which limited its ability to help customers comply with data protection regulations like the European Union's GDPR and California's CCPA.
To be fair, the first-party catalog offerings on Amazon Web Services (AWS) and Google Cloud have underwhelmed as well. Perhaps that's why