Databases and SQL for Data Science with Python | Coursera IBM |
Proficiency in SQL is essential for professionals in data-related roles such as Data Scientists, Data Analysts, and Data Engineers because databases store a vast amount of global data, and SQL serves as a robust means to interact with and extract data from these databases.
Throughout this course, you will gain comprehensive knowledge of SQL, starting from fundamental SELECT statements to advanced topics like JOINs. You will learn to write basic SQL commands including SELECT, INSERT, UPDATE, and DELETE, manipulate result sets using WHERE, COUNT, DISTINCT, and LIMIT clauses, and distinguish between Data Manipulation Language (DML) and Data Definition Language (DDL). Additionally, you will practice creating, altering, dropping, and loading tables, utilizing string patterns and ranges, and organizing result sets with ORDER and GROUP BY clauses along with database functions.
Moreover, the course covers building sub-queries, querying data from multiple tables, and accessing databases using Jupyter notebooks with SQL and Python for data science tasks. You will also delve into advanced SQL concepts like Stored Procedures, Views, ACID Transactions, and Inner & Outer JOINs through practical labs and projects.
Hands-on experience will be emphasized, involving constructing SQL queries, working with actual cloud-based databases, and utilizing real-world data science tools. The culmination of the course will involve analyzing multiple real-world datasets to showcase your SQL proficiency in a final project.
Notice!
Always refer to the module on your course for the most accurate and up-to-date information.
Attention!
If you have any questions that are not covered in this post, please feel free to leave them in the comments section below. Thank you for your engagement.
Graded Quiz: Basic SQL
SELECT DISTINCT FIRSTNAME FROM INSTRUCTOR
- LEON, PAUL, JOE
- LEON KATSNELSON, PAUL ZIKOPOLOUS, JOE SANTARCANGELO
- LEON, PAUL, LEON, JOE
- LEON, LEON, PAUL, PAUL
- Updates all rows in the table to have the last name 'Brewster.'
- Updates all rows in the table to have the last name 'Smith.'
- Changes the last name of all instructors named 'Smith' to 'Brewster.'
- Changes the last name of the instructor named 'Brewster' to 'Smith.'
- The command would result in an error.
- The command would only delete the first entry in the table.
- The command would remove all entries in the table, leaving it empty but still present in the database.
- The command would delete the table from the database.
SELECT COUNT(DISTINCT FIRSTNAME) FROM INSTRUCTOR
- The statement would throw an error.
- Only the distinct FIRSTNAME entries.
- The count of unique entries along with the distinct FIRSTNAME entries.
- The number of unique FIRSTNAME entries in the INSTRUCTOR table.
SELECT * FROM INSTRUCTOR WHERE LASTNAME='Smith' LIMIT 5
- The first 5 rows from the INSTRUCTOR table.
- The last 5 rows from the INSTRUCTOR table.
- The last 5 entries in the INSTRUCTOR table where LASTNAME is 'Smith.'
- The first 5 entries in the INSTRUCTOR table where LASTNAME is 'Smith.'
Relational DB Concepts and Tables
- A database is a logically coherent collection of data with some inherent meaning
- Data can only be added and queried from a database but not modified.
- Only SQL can be used to query data in a database.
- All of the above
- columns
- constraints
- rows
- keys
- DML statement
- DDL statement
- DQL statement
- All of the above
- ALTER table command
- CREATE command
- DROP table command
- TRUNCATE table command
- ALTER Employees TABLE ADD ID char
- ALTER TABLE COLUMN Employees ID char(7)
- ALTER TABLE Employees ADD ID char(7)
- ALTER Employees ADD COLUMN ID varchar(7)
Graded Quiz: Refining Your Results
- SELECT lastname from author where lastname like ‘J*’
- SELECT lastname from author where lastname like ‘J$’
- SELECT lastname from author where lastname like ‘J%’
- SELECT lastname from author where lastname like ‘J#’
- SELECT ID FROM TABLE_NAME ORDER BY ID
- SELECT * FROM TABLE_NAME ORDER BY ID
- SELECT ID FROM TABLE_NAME ORDER BY ID DESC
- SELECT * FROM TABLE_NAME ORDER BY ID DESC
- Acts as an alternative to WHERE clause in SQL queries.
- It may not necessarily organize the result set in a specific order.
- Check whether data records meet the specified condition is met or not.
- Restricts the result set for a query using GROUP BY clause.
SELECT * FROM employees ORDER BY emp_name LIMIT 5;
- Retrieves the entire contents of the table, sorted alphabetically based on emp_names
- Retrieves the top 5 emp_names ordered alphabetically.
- Retrieves all the columns of the top 5 rows of the table, sorted reverse alphabetically based on emp_names
- Retrieves all the columns of the top 5 rows of the table, sorted alphabetically based on emp_names
- SELECT COUNT(CustomerID), Country FROM Customers GROUP BY Country HAVING COUNT(Customers) > 5;
- SELECT COUNT(CustomerID), Country FROM Customers GROUP BY Country HAVING COUNT(CustomerID) < 5;
- SELECT COUNT(CustomerID), Country FROM Customers GROUP BY Country HAVING CustomerID > 5;
- SELECT COUNT(CustomerID), Country FROM Customers GROUP BY Country HAVING COUNT(CustomerID) > 5;
Graded Quiz: Functions, Multiple Tables, and Sub-queries
- SELECT * FROM EMPLOYEES WHERE DEP_ID = MAX(DEP_ID)
- SELECT * FROM EMPLOYEES WHERE DEP_ID = ( SELECT DEPT_ID_DEP FROM DEPARTMENTS WHERE DEPT_ID_DEP IS MAX )
- SELECT * FROM EMPLOYEES WHERE DEPT_ID_DEP = MAX ( SELECT DEPT_ID_DEP FROM DEPARTMENTS )
- SELECT * FROM EMPLOYEES WHERE DEP_ID = ( SELECT MAX(DEPT_ID_DEP) FROM DEPARTMENTS )
- SELECT D.F_NAME, E.DEP_NAME FROM EMPLOYEES E, DEPARTMENTS D WHERE D.DEPT_ID_DEP = E.DEP_ID
- SELECT E.F_NAME, D.DEP_NAME FROM EMPLOYEES, DEPARTMENTS
- SELECT F_NAME, DEP_NAME FROM EMPLOYEES, DEPARTMENTS WHERE DEPT_ID_DEP = DEP_ID
- SELECT F_NAME, DEP_NAME FROM EMPLOYEES E, DEPARTMENTS D WHERE E.DEPT_ID_DEP = D.DEP_ID
- SELECT SUM(Cost) FROM PetRescue
- SELECT SUM(Cost) AS Total_Cost FROM PetRescue
- SELECT SUM(Total_Cost) From PetRescue
- SELECT Total_Cost FROM PetRescue
- SELECT (CURRENT_DATE – DOB) FROM Employees
- SELECT DATEDIFF(CURRENT_DATE, DOB) FROM Employees
- SELECT FROM_DAYS(DATEDIFF(CURRENT_DATE, DOB)) FROM Employees
- SELECT FROM_DAYS(DATE_SUB(CURRENT_DATE, DOB) FROM Employees
- SELECT NAME, DATE_ADD(DOM, INTERVAL 1 YEARS) AS DOE FROM MEDS
- SELECT NAME, DATEADD(DOM, INTERVAL 1 YEAR) AS DOE FROM MEDS
- SELECT NAME, DATEADD(DOM, INTERVAL 1 YEAR) FROM MEDS
- SELECT NAME, DATE_ADD(DOM, INTERVAL 1 YEAR) AS DOE FROM MEDS
Graded Quiz: Accessing databases using Python
- %sql sqlite:///EMP.db
- %sql
- sqlite:///EMP.db
- %sql sqlite:/EMP.db
- %sql sqlite3://EMP.db
- Coding in Jupyter notebook using a programming language other than Python
- Converting Jupyter notebook’s default programming language to a desired one.
- Timing a complete cell block as per requirement.
- Load an SQL database to a jupyter notebook
import sqlite3import pandas as pdconn = sqlite3.connect(‘HR.db’)data = pd.read_csv(‘./employees.csv’)data.to_sql(‘Employees’, conn)
- The csv file is read and converted into an SQL table ‘Employees’ under the HR database
- The CSV file is converted to an SQL file
- The code throws a syntax error message.
- CSV file is saved to the HR.db file created by the code.
- out = pandas.read_sql(query_statement, connection_object)
- out = dataframe.read_sql(query_statement, connection_object)
- cursor = connection.execute(query_statement)
- out = cursor.fetchall()
- out = connection.execute(query_statement)
- df.describe()
- df.head()
- df.tail()
- df.info()
Graded Quiz on Assignment
- 53
- 533
- 433
- 555
- SELECT COMMUNITY AREA NUMBER, COMMUNITY_AREA_NAME FROM CENSUS_DATA WHERE PER_CAPITA_INCOME<11000
- SELECT COMMUNITY_AREA_NUMBER, COMMUNITY_AREA_NAME WHERE CENSUS_DATA FROM PER_CAPITA_INCOME<11000
- SELECT COMMUNITY_AREA_NUMBER, COMMUNITY_AREA_NAME FROM CENSUS_DATA WHERE PER_CAPITA_INCOME<11000
- SELECT COMMUNITY_AREA_NUMBER, COMMUNITY_AREA_NAME FROM CENSUS_DATA WHERE PER_CAPITA_INCOME>11000
- 3
- 2
- 1
- 4
- SELECT * FROM CHICAGO_CRIME_DATA WHERE PRIMARY_TYPE = "KIDNAPPING"
- SELECT * FROM CHICAGO_CRIME_DATA WHERE DESCRIPTION LIKE “%CHILD%”
- SELECT * FROM CHICAGO_CRIME_DATA WHERE PRIMARY_TYPE = "KIDNAPPING" AND DESCRIPTION = “%CHILD%”
- SELECT * FROM CHICAGO_CRIME_DATA WHERE PRIMARY_TYPE = "KIDNAPPING" AND DESCRIPTION LIKE “%CHILD%”
- COUNT
- AVERAGE
- DISTINCT
- LIKE
- 48.0
- 46.42
- 49.62
- 49.52
- ORDER BY PERCENT_HOUSEHOLDS_BELOW_POVERTY DESC LIMIT 5
- ORDER BY PERCENT_HOUSEHOLDS_BELOW_POVERTY DESC
- ORDER BY PERCENT_HOUSEHOLDS_BELOW_POVERTY DESC LIMIT
- ORDER BY PERCENT_HOUSEHOLDS_BELOW_POVERTY LIMIT 5
- 23.0
- 25.0
- 20.0
- 36.0
- SELECT COMMUNITY_AREA_NAME FROM CENSUS_DATA WHERE HARDSHIP_INDEX IN (SELECT MOST(HARDSHIP_INDEX) FROM CENSUS_DATA);
- SELECT COMMUNITY_AREA_NAME FROM CENSUS_DATA WHERE HARDSHIP_INDEX AS (SELECT MAX(HARDSHIP_INDEX) FROM CENSUS_DATA);
- SELECT COMMUNITY_AREA_NAME FROM CENSUS_DATA WHERE HARDSHIP_INDEX IN (SELECT MAX(HARDSHIP_INDEX) FROM CENSUS_DATA);
- SELECT COMMUNITY_AREA_NAME FROM CENSUS_DATA WHERE HARDSHIP_INDEX IN (SELECT HARDSHIP_INDEX FROM CENSUS_DATA;
- Riverdale
- Austin
- Englewood
- Fuller Park
Final Exam
- The table
- A results set
- The index
- The database
- DELETE CUSTOMER_ID IS (‘B8’, ‘B9’) FROM CUSTOMERS
- DELETE (‘B8’, ‘B9’) FROM CUSTOMERS
- DELETE FROM CUSTOMERS WHERE CUSTOMER_ID IN (‘B8’, ‘B9’)
- DELETE FROM CUSTOMERS WHERE CUSTOMER_ID IS (‘B8’, ‘B9’)
- The primary key uniquely identifies each column in a table.
- The primary key uniquely identifies each row in a table.
- The primary key uniquely identifies each attribute in a table.
- The primary key uniquely identifies each relation in a table.
- Data Update Language (DUL)
- Data Manipulation Language (DML)
- Data Entry Language (DEL)
- Data Input Language (DIL)
- SELECT State, COUNT(State) FROM Representative
- SELECT State, DISTINCT(State) FROM Representative GROUP BY State
- SELECT DISTINCT(State) FROM Representative
- SELECT State, COUNT(State) FROM Representative GROUP BY State
SELECT First_Name, Last_Name, Age FROM Company
- IF Age >=30 AND Age <=50
- WHERE Age >=30 AND Age <=50
- WHERE Age > 30
- WHERE Age < 30
- SELECT MIN(SALARY) FROM EMPLOYEES
- SELECT SALARY FROM EMPLOYEES WHERE MINIMUM(SALARY) = SALARY
- SELECT LOWEST(SALARY) FROM EMPLOYER
- SELECT MAX(SALARY) FROM EMPLOYEES
- SELECT PRODUCT_NAME FROM PRODUCTS WHERE UNIT_PRICE = (SELECT MIN(UNIT_PRICE) FROM PRODUCTS)
- SELECT PRODUCT_NAME FROM PRODUCTS WHERE UNIT_PRICE IS LOWEST
- SELECT MIN(UNIT_PRICE) FROM PRODUCTS
- SELECT PRODUCT_NAME FROM PRODUCTS WHERE UNIT_PRICE = MIN
- Import
- Primary key
- Database cursor
- Connection
- df – data frame; Sample – sample information; conn – connection variable
- df – data format; Sample – sample information; conn – data connector
- df – data frame; Sample – table name ; conn – connection variable
- df – data format; Sample – table name ; conn – data connector