MYSQL Dealing with sparse or missing data
Working with columns containg NULL values
In MySQL and other SQL dialects, NULL values have special properties.
Consider the following table containing job applicants, the companies they worked for, and the date they left the company. NULL indicates that an applicant still works at the company:
Your task is to compose a query that returns all rows after 2016-01-01, including any employees that are still working at a company (those with NULL end dates). This select statement:
fails to include any rows with NULL values:
Per the MySQL documentation, comparisons using the arithmetic operators <, >, =, and <> themselves return NULL instead of a boolean TRUE or FALSE. Thus a row with a NULL end_date is neither greater than 2016-01-01 nor less than 2016-01-01.
This can be solved by using the keywords IS NULL:
Working with NULLs becomes more complex when the task involves aggregation functions like MAX() and a GROUP BY clause. If your task were to select the most recent employed date for each applicant_id, the following query would seem a logical first attempt:
However, knowing that NULL indicates an applicant is still employed at a company, the first row of the result is inaccurate. Using CASE WHEN provides a workaround for the NULL issue:
This result can be joined back to the original example table to determine the company at which an applicant last worked:
These are just a few examples of working with NULL values in MySQL.