Analytical Data Processing in SQL
You want to be proficient with data processing in SQL, but you don't know where to start!
There are too many topics to learn to get proficient at analytical data processing in SQL, like optimizing queries, partitioning, parallelism, data modeling, best practices, etc. It is overwhelming to have so many topics to learn! And even if you understand them, you're not sure if you can consider yourself "proficient in SQL". You want to be skilled in SQL and progress in your data career, but you don't know what to start learning and, more importantly, how to apply what you learn.
The most common recommendation to get better at SQL is "go find a problem and solve it", but where do you find a good problem, and how do you know if you solved it correctly?
You complete online SQL tutorials, but they don't explain how to optimize SQL, consider tradeoffs between different approaches, understand the data, or convert business questions to SQL queries. You keep hearing, "spend a couple of years working with SQL, and you'll be proficient," but you don't have years to get good at SQL!
By being proficient in SQL, you can process data efficiently, write clean/easy-to-understand SQL code, help your colleagues speed up their slow queries, answer business questions with data, quickly solve SQL interviews, and be recognized by your colleagues for your SQL expertise. You don't know how to go about achieving SQL proficiency.
Imagine having knowledge of how SQL works internally, your SQL queries will be accurate, efficient and fast! and you'd be able to solve complex questions with ease!
But what if you could? What if you understood how an OLAP DB engine stores and processes data? You'll be able to process large amounts of data efficiently. Your colleagues will praise you for writing easy-to-understand and clean SQL code. Your stakeholders and boss will be thrilled to have you on their team.
Confidently go from data questions to solutions superfast!
You will learn how to understand your company's business processes. How much clearer would things be if you knew how data flows through your company's systems, where they originate from, at what frequency they land in your warehouse, their common issues, and how to join them across different datasets? You'll be able to easily convert a data question to efficient SQL query(s). You will be the go-to person to wrangle large amounts of data!
Take your data career to the next level with SQL mastery!
What if you knew how to use window functions to replicate a for loop in SQL? You will use the appropriate data encoding and partitioning to make the SQL engine process data efficiently. You will know how to use a query planner to make a query performant.
What if you knew how to use SQL concepts (Windows, query optimization, CTEs, data storage, parallelization, etc.) effectively? Your high-performant SQL queries for data pipelines can save the company thousands of dollars! Being proficient at SQL puts you at the top of the data engineering talent pool. Be in high demand, get higher pay, and work on challenging & exciting problems. Start learning this future-proof skill set that is crucial for any business - right away!
It's true that learning how to choose the proper technique, utilize data partition/storage formats, and balance code optimization with complexity in sql can be challenging, time-consuming, and requires a lot of work experience, but it doesn't have to be.
Master data processing in SQL with the help of my e-book!
Learn how to process large amounts of data efficiently in SQL with my book "Analytical data processing in SQL" Understand how to use partitions, columnar storage formats, CTEs, windows, group bys, and filter push-down to compose efficient SQL queries. Breeze through your SQL interviews by understanding the business processes & data flows, creating efficient solutions, and explaining tradeoffs between multiple approaches.
Write efficient SQL queries, produce correct and fast results, and understand data warehouse core concepts.
Write efficient SQL by understanding OLAP DB's core concepts
You will learn to think in SQL, efficiently process big data, understand the OLAP DB engine's core concepts, and quickly answer common business questions. Write efficient SQL queries with total confidence!
"The kind of stuff you pay college to teach you but is never taught" - Reader
Learn SQL with my e-book "Analytical data processing in SQL" You'll be writing efficient and easy-to-read SQL queries and advancing your data career immediately!
Purchase Analytical Data Processing in SQL
Analytical Data Processing in SQL
140 Pages, E-book Format
Here's what you will learn
The basics of analytical processing
How to reduce data movement, to speed up queries.
How to read query plans and tune them
How to use advanced analytical concepts like: Window functions (with rows, range, groups) and CTEs
Templates for common analytical questions, such as deduping, pivoting, period-over-period calculations
Every chapter comes with code examples and exercises (with solutions)!
Analytical Data Processing in SQL: Table of Contents
Preface
* How this book is structured
* How to ask a question
* Acknowledgments
Prerequisites
* Lab setup
* The data model used in this book
* Basic SQL knowledge
Chapter 1: Understand your data; it's the foundation for data processing
* Relate the data model to the business by understanding facts and dimensions
* Modifying granularity (aka roll up/Group by) is the basis of analytical data processing
* Understanding what the data represents and the pipeline that generates the data helps you answer most business questions
Chapter 2: Save time & money by storing data efficiently and reducing data movement(while querying)
* OLAP DB stores the large datasets as chunks of data and processes them in parallel
* Reduce data movement (data shuffle) by understanding narrow & wide transformations
* Hash joins are expensive & Broadcast joins are less expensive
* Examine how the OLAP DB will process your query with EXPLAIN & then optimize it
* Reduce the amount of data to be processed with column-oriented formatting (with Parquet), partitioning, & bucketing
Chapter 3: Calculate rolling metrics, rank rows & compare values across rows with window functions
* Window = A set of rows identified by values present in one or more column(s)
* A window definition has a function, partition(column(s) to group by), and order(order of rows within the Group)
* Calculate running metrics with Aggregate functions
* Rank rows based on column(s) with Ranking functions
* Compare column values across rows with Value functions
* Define the rows you want to work on within a Window frame using ROWS, RANGE, & GROUPS
* Use query plan to decide using window function, when performance matters
Chapter 4: Write easy-to-understand SQL with CTEs, and answer common business questions with sample templates
* Use CTEs to make hard-to-read queries easy-to-understand & prevent re-processing of data
* Templates for De-duping, Pivots, Period-over-Period(YoY, MoM), & GROUPing BY multiple column combinations in one query
Appendix: SQL Basics: Analyze data in your warehouse
* The hierarchy of data organization is a database, schema, table, columns, & data types
* Use SELECT...FROM, LIMIT, WHERE, & ORDER BY to read data from tables
* Create a table
* Change data types (CAST) and handle NULLS (COALESCE)
* Replicate IF..ELSE logic with CASE statements
* Combine data from multiple tables using JOINs (there are different types of JOINS)
* Stack tables on top of each other with UNION or UNION ALL, subtract tables with EXCEPT
* Generate metrics for your dimension(s) using GROUP BY
* Use the result of a query within a query using sub-queries
* Save queries as views for more straightforward reads
* Use inbuilt DB functions for String, Time, & Numeric data manipulation
FAQ
Is this book beginner-friendly? Yes, Read the appendix SQL Basics chapter before starting the main chapters.
Refund policy? If you are unsatisfied, we offer a 30-day money-back guarantee. Please reach out to help@startdataengineering.com.
What is an OLAP DB? It is a database designed for analytical querying. Read this article for context on why and where to use an OLAP DB.
Does content in this book only apply to OLAP DBs? Besides Chapter 4, the other chapters apply to most SQL databases.
Does the book contain code examples? Yes, the book uses Trino to illustrate OLAP concepts. The book also includes examples and exercises to demonstrate the concepts you learn in each chapter. Please make sure you can run docker containers following the instructions here.
I have another question. Please send an email to help@startdataengineering.com.
I want to purchase more than 5 copies, is there a discount? Yes, for purchases of 5 or more books there is a discount. Please reach out to help@startdataengineering.com.