Data Analysis with R | Coursera Quiz Answers

Data Analysis with R | Coursera IBM

The R programming language is specifically designed for data analysis. It serves as a crucial tool for bridging the gap between the data-related problems you aim to solve and the answers needed to achieve your goals. This course begins with a question and guides you through the process of answering it using data. Initially, you will learn essential techniques for preparing your data for analysis. Following this, you will explore your data through exploratory data analysis, which helps you summarize your data and identify significant relationships between variables that can provide insights. Once your data is ready, you will develop a model and learn how to evaluate and refine its performance. This structured approach ensures your data analysis meets your standards and gives you confidence in the results.

In this course, you will gain practical experience by acting as a data analyst working with airline departure and arrival data to predict flight delays. Utilizing the Airline Reporting Carrier On-Time Performance Dataset, you will practice reading data files, preprocessing data, creating and improving models, and evaluating them to select the best one.

Notice!
Always refer to the module on your course for the most accurate and up-to-date information.

Week 01 Quiz Answers

Graded Quiz Answers

1. What is the purpose of the Data Asset eXchange?

Provides data that you can explore to conduct data analysis.
Provides data that you can use for a small fee.
Helps you exchange data with others.
Provides data that is only useful for learning purposes.

2. In the Airline Performance dataset from the Data Asset eXchange, which of the following variables is a target for predicting on-time arrivals?

CarrierDelay
Distance
SecurityDelay
ArrDelay

3. What is the purpose of the pipe (%>%) operator?

Assigns a value to a variable.
Assigns a value to a global variable.
Combines two functions into a single operation.
Combines multiple functions into a single operation.
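
As a minimal sketch of how the pipe chains function calls, assuming the magrittr package is installed (the vector `values` is made-up example data):

```r
library(magrittr)  # provides the %>% operator (dplyr re-exports it as well)

values <- c(4, 9, 16)  # hypothetical example data

# Nested call: has to be read from the inside out
nested <- round(mean(sqrt(values)), 1)

# Piped call: each result becomes the first argument of the next function
piped <- values %>% sqrt() %>% mean() %>% round(1)

print(identical(nested, piped))
```

Both forms compute the same value; the piped form simply reads left to right.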

4. Which function can you use to read a text file that uses the “%” character as a delimiter?

read_delim()
read_tsv()
read_csv()
read_any()
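
A short sketch of reading a file with a custom delimiter, assuming the readr package (version 2.0 or later for `show_col_types`). The file contents and column names are invented for illustration:

```r
library(readr)

# Hypothetical "%"-delimited data, written to a temporary file
path <- tempfile(fileext = ".txt")
writeLines(c("carrier%dep_delay", "AA%12", "DL%-3"), path)

# read_delim() takes the delimiter as an argument, so any single
# character can separate the fields
flights <- read_delim(path, delim = "%", show_col_types = FALSE)
print(flights)
```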

5. What is the main similarity between the summarize() and group_by() functions?

Both return a statistical summary of the data.
Both group data by the specified variables.
Both compute summary statistics.
There is no similarity between the summarize() and group_by() functions.
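
To see how the two functions are typically used together, here is a small sketch assuming the dplyr package; the `delays` table is hypothetical:

```r
library(dplyr)

# Hypothetical arrival delays in minutes
delays <- tibble(
  carrier   = c("AA", "AA", "DL", "DL"),
  arr_delay = c(10, 20, 5, 15)
)

# group_by() only marks the groups; summarize() then collapses
# each group to a single row of summary statistics
by_carrier <- delays %>%
  group_by(carrier) %>%
  summarize(mean_delay = mean(arr_delay))

print(by_carrier)
```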

Week 02 Quiz Answers

Graded Quiz Answers

1. You want to access the “Date” column of a data frame called sales_data so you can perform an operation on it. What is the correct way to refer to this column?

sales_data%Date
sales_data$Date
sales_data.Date
sales_data#Date
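
A brief base R sketch of column access with `$`. The contents of `sales_data` are invented here; only the data frame and column names come from the question:

```r
# Hypothetical contents for the sales_data frame named in the question
sales_data <- data.frame(
  Date   = as.Date(c("2023-01-05", "2023-02-10")),
  Amount = c(100, 250)
)

# The $ operator extracts one column as a vector, which can then be
# passed to any function, here format() to pull out the month
sale_months <- format(sales_data$Date, "%m")
print(sale_months)
```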

2. Which function replaces missing values in a dataset?

drop_na()
replace_na()
is.na()
drop_columns()
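
The sketch below contrasts detecting and replacing missing values, assuming the tidyr package. Filling with the column mean is just one possible strategy, and the data is hypothetical:

```r
library(tidyr)

# Hypothetical data with one missing arrival delay
delays <- data.frame(
  carrier   = c("AA", "DL", "UA"),
  arr_delay = c(12, NA, 7)
)

# is.na() only reports where values are missing
print(is.na(delays$arr_delay))

# replace_na() fills them in; for a data frame it takes a named list
# with one replacement value per column
mean_delay <- mean(delays$arr_delay, na.rm = TRUE)
filled <- replace_na(delays, list(arr_delay = mean_delay))
print(filled)
```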

3. You have a variable called “Status” that contains a status code in the format “error_type-severity_level”, for example “10-07”, and you want to reformat the column so that the “error_type” and “severity_level” are in different columns. What is the correct function to do this?

dataframe %>% mutate_if(Status, sep = "-", into = c("error_type", "severity_level"))

dataframe %>% separate(Status, sep = "-", into = c("error_type", "severity_level"))

dataframe %>% mutate_all(Status, sep = "-", into = c("error_type", "severity_level"))

dataframe %>% sapply(Status, sep = "-", into = c("error_type", "severity_level"))

4. What are two benefits of data normalization?

Helps you better understand data distribution.
Brings data into a common standard of expression that allows you to make meaningful comparisons.
Minimizes the effect of outliers, which can otherwise have a disproportionate influence on the result.
Enables a fair comparison between different features by making sure they have the same impact.
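
Three common normalization approaches can be sketched in base R as follows. The `distance` values are made up, and which method fits best depends on the data:

```r
# Hypothetical flight distances in miles
distance <- c(100, 500, 1000, 2500)

# Simple feature scaling: divide by the maximum
simple_scaled <- distance / max(distance)

# Min-max scaling: map the values onto the range [0, 1]
min_max <- (distance - min(distance)) / (max(distance) - min(distance))

# Z-score standardization: mean 0, standard deviation 1
z_score <- (distance - mean(distance)) / sd(distance)

print(round(cbind(simple_scaled, min_max, z_score), 3))
```

After any of these transformations, features measured in different units sit on a comparable scale.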

5. To visualize its distribution, binned data is often plotted in which of the following type of chart?

Scatter plot
Histogram
Line chart
Bar chart
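
A minimal base R sketch of binning a numeric variable and plotting its distribution. The delay values, bin edges, and labels are all assumptions chosen for illustration:

```r
# Hypothetical arrival delays in minutes
arr_delay <- c(-5, 0, 12, 30, 45, 90, 120)

# cut() assigns each value to a labeled bin (intervals closed on the right)
bins <- cut(arr_delay,
            breaks = c(-Inf, 15, 60, Inf),
            labels = c("low", "medium", "high"))
print(table(bins))

# hist() bins the raw values itself and draws the resulting counts
hist(arr_delay, breaks = 4,
     main = "Arrival delay distribution",
     xlab = "Arrival delay (minutes)")
```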

6. Which of the following can you accomplish using the spread() function? Select two answers.

Reformat a categorical variable so that its contents are spread across two or more columns.
Convert categorical variables to dummy variables.
Convert categorical variables to dummy variables and assign the value of another variable to each category.
Reduce three variables to one.
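
The following sketch shows what `spread()` does to a small table, assuming the tidyr package. The `flights` data is hypothetical. Note that tidyr 1.0 and later mark `spread()` as superseded by `pivot_wider()`, though it still works:

```r
library(tidyr)

# Hypothetical long-format data: one row per flight
flights <- data.frame(
  flight_id = c(1, 2, 3),
  carrier   = c("AA", "DL", "AA"),
  arr_delay = c(10, 5, 20)
)

# Each distinct value of `carrier` becomes its own column, filled with
# the matching `arr_delay`; combinations that do not occur become NA
wide <- spread(flights, key = carrier, value = arr_delay)
print(wide)
```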

Week 03 Quiz Answers

Graded Quiz Answers

1. Which of the following forms of exploratory data analysis generates short summaries about the sample and measures of the data?

Correlation
Pearson correlation
Analysis of variance (ANOVA)
Descriptive statistics
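
For a quick look at such summaries in base R, a sketch with invented delay values:

```r
# Hypothetical arrival delays in minutes
arr_delay <- c(-5, 0, 12, 30, 45)

# summary() reports the minimum, quartiles, median, mean, and maximum
delay_summary <- summary(arr_delay)
print(delay_summary)

# Spread is not part of summary(), so compute it separately
print(sd(arr_delay))
```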

2. When conducting exploratory data analysis, which visualizations are particularly useful for plotting the target variable over multiple variables to get visual clues about the relationship between these variables and the target?

Scatter plots
Histograms
Heatmaps
Boxplots

3. Which of the following statements about the ANOVA F-test score are true? Select two answers.

A large F-test score implies a strong correlation between variable categories and the target variable.
A large F-test score implies a poor correlation between variable categories and the target variable.
A small F-test score implies a strong correlation between variable categories and the target variable.
A small F-test score implies a poor correlation between variable categories and the target variable.
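
One way to obtain an F-test score in base R is `aov()`. This sketch uses invented data in which the three carriers have clearly different average delays, so the F value comes out large:

```r
# Hypothetical delays: group means differ a lot, within-group spread is small
delays <- data.frame(
  carrier   = rep(c("AA", "DL", "UA"), each = 4),
  arr_delay = c(10, 12, 11, 13, 30, 32, 31, 29, 50, 49, 52, 51)
)

fit <- aov(arr_delay ~ carrier, data = delays)
anova_table <- summary(fit)[[1]]

# F compares variation between groups to variation within groups
f_value <- anova_table[["F value"]][1]
p_value <- anova_table[["Pr(>F)"]][1]
print(c(F = f_value, p = p_value))
```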

4. You can visualize the correlation between two variables by plotting them on a scatter plot and then doing which of the following?

Nothing. The scatter plot alone can show the correlation completely.
Add a correlation line.
You should not use a scatter plot for visualizing the correlation between two variables.
Add a regression line.
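
A base R sketch of a scatter plot with a fitted line on top. The data is hypothetical, and base graphics stand in here for whichever plotting package you normally use:

```r
# Hypothetical departure and arrival delays in minutes
flights <- data.frame(
  dep_delay = c(0, 5, 10, 20, 40),
  arr_delay = c(-2, 6, 9, 22, 38)
)

# Scatter plot of the two variables
plot(flights$dep_delay, flights$arr_delay,
     xlab = "Departure delay (minutes)",
     ylab = "Arrival delay (minutes)")

# Fit a simple linear model and draw its line over the points
fit <- lm(arr_delay ~ dep_delay, data = flights)
abline(fit, col = "red")
```

The slope of the line shows the direction of the relationship, and how tightly the points cluster around it hints at its strength.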

5. When using the Pearson method to evaluate the correlation between two variables, how do you know you can have strong certainty in the result?

The P value is greater than 0.1.
The P value is less than 0.05.
The P value is less than 0.1.
The P value is less than 0.001.
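
The Pearson coefficient and its p-value can be computed together with `cor.test()`. A sketch with invented, nearly linear data:

```r
# Hypothetical departure and arrival delays in minutes
dep_delay <- c(0, 5, 10, 20, 40, 60)
arr_delay <- c(-2, 6, 9, 22, 38, 63)

result <- cor.test(dep_delay, arr_delay, method = "pearson")

# estimate is the correlation coefficient; p.value measures how likely
# a correlation this strong would be if the true correlation were zero
print(result$estimate)
print(result$p.value)
```

The smaller the p-value, the more certainty you can place in the estimated correlation.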

Week 04 Quiz Answers

Graded Quiz Answers

1. In model development, you can develop more accurate models when you have which of the following?

Relevant data.
Larger quantities of data.
Fewer independent variables.
More dependent variables.

2. Assume you have a dataset called “new_dataset”, a predictor variable called X, and a target called Y, and you want to fit a simple linear regression model. Which command should you use?

linear_model <- predict(Y ~ Z, data = new_dataset)
linear_model <- lm(X ~ Y, data = new_dataset)
linear_model <- lm(Y ~ X, data = new_dataset)
linear_model <- predict(X ~ Y, data = new_dataset)
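
In an R model formula the target goes on the left of `~` and the predictor on the right. A minimal sketch, with invented contents for `new_dataset` that follow Y = 2X + 1 exactly:

```r
# Hypothetical data that lies exactly on the line Y = 2 * X + 1
new_dataset <- data.frame(
  X = c(1, 2, 3, 4, 5),
  Y = c(3, 5, 7, 9, 11)
)

# lm() fits the model; the formula reads "Y explained by X"
linear_model <- lm(Y ~ X, data = new_dataset)

# The intercept and slope recover the line used to build the data
print(coef(linear_model))
```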

3. When using the predict() function in R, what is the default confidence level?

95%
100%
85%
90%
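
For linear models, `predict()` takes a `level` argument when an interval is requested. The sketch below uses R's built-in `cars` dataset rather than the course data and checks that leaving `level` out matches setting it to 0.95:

```r
# Built-in dataset: stopping distance versus speed
fit <- lm(dist ~ speed, data = cars)
new_speed <- data.frame(speed = 15)

# Interval with the default level
default_ci <- predict(fit, newdata = new_speed, interval = "confidence")

# Same interval with the level spelled out
explicit_ci <- predict(fit, newdata = new_speed,
                       interval = "confidence", level = 0.95)

print(default_ci)
print(isTRUE(all.equal(default_ci, explicit_ci)))
```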

4. Which plot type helps you validate assumptions about normality?

Q-Q plot
Residual plot
Scale-location plot
Regression plots
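
A quick way to draw a Q-Q plot of model residuals in base R, sketched with the built-in `cars` dataset as a stand-in for the course data:

```r
fit <- lm(dist ~ speed, data = cars)
res <- residuals(fit)

# Points that follow the reference line suggest the residuals
# are roughly normally distributed
qqnorm(res)
qqline(res)

# plot(fit, which = 2) draws a similar plot from standardized residuals
```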

5. A third order polynomial regression model is described as which of the following?

Quadratic, meaning that the predictor variable in the model is squared.
Cubic, meaning that the predictor variable in the model is cubed.
Squared, meaning that the predictor variable in the model is squared.
Simple linear regression.
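
A third-order fit can be sketched with `poly()` inside `lm()`. The data below is generated from a made-up cubic, so the fitted coefficients should recover it:

```r
# Hypothetical data generated from y = 1 - 2x + x^3
x <- c(-2, -1, 0, 1, 2, 3)
y <- x^3 - 2 * x + 1

# poly(x, 3, raw = TRUE) adds the terms x, x^2, and x^3 to the model
cubic_model <- lm(y ~ poly(x, 3, raw = TRUE))

# Coefficients in order: intercept, x, x^2, x^3
print(round(unname(coef(cubic_model)), 6))
```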

6. How should you interpret an R-squared result of 0.89?

The X variable causes the Y variable to positively change 89% of the time.
89% of the response variable variation is explained by a linear model.
There is a strong negative correlation between the variables.
89% of the response variable variation is explained by a polynomial model.
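
R-squared can be read from `summary()` or computed by hand as one minus the ratio of residual to total variation. A sketch on the built-in `cars` dataset, which is not the course data:

```r
fit <- lm(dist ~ speed, data = cars)

# Value reported by summary()
r_squared <- summary(fit)$r.squared

# Same quantity from its definition
ss_residual <- sum(residuals(fit)^2)
ss_total    <- sum((cars$dist - mean(cars$dist))^2)
manual      <- 1 - ss_residual / ss_total

print(c(reported = r_squared, manual = manual))
```

Here the value is about 0.65, meaning the linear model explains roughly 65% of the variation in stopping distance.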

7. When comparing linear regression models, when will the mean squared error (MSE) be smaller?

When using a simple linear regression (SLR) model.
When using a polynomial regression model.
When using a multiple linear regression (MLR) model.
This depends on your data. The model that fits the data better has the smaller MSE.
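
To compare models by MSE, one option is to fit each on a training split and score it on held-out rows. This sketch uses the built-in `cars` dataset with an arbitrary odd/even row split; `mse()` is a helper defined here, not a library function:

```r
# Alternate rows into training and test sets (an arbitrary split)
train <- cars[seq(1, nrow(cars), by = 2), ]
test  <- cars[seq(2, nrow(cars), by = 2), ]

slr       <- lm(dist ~ speed, data = train)
quadratic <- lm(dist ~ poly(speed, 2), data = train)

# Helper: mean squared error of a model's predictions on a data set
mse <- function(model, data) {
  mean((data$dist - predict(model, newdata = data))^2)
}

test_mse <- c(slr = mse(slr, test), quadratic = mse(quadratic, test))
print(test_mse)
```

Whichever model produces the smaller value fits these held-out rows better; a different dataset could reverse the ranking.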

Week 05 Quiz Answers

Graded Quiz Answers

1. Which situations are helped by using the cross-validation method to train your model? Select two answers.

Working with models with small amounts of data.
Determining if a model can be generalized for a broader group.
Working with models with large amounts of data.
Working with models that are underfit.
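
The idea behind k-fold cross-validation can be sketched in base R. The built-in `cars` dataset, the choice of five folds, and the seed are all assumptions; the tidymodels ecosystem offers higher-level tools for the same task:

```r
k <- 5
set.seed(1)  # makes the random fold assignment reproducible

# Assign every row to one of k folds at random
folds <- sample(rep(1:k, length.out = nrow(cars)))

# Train on k - 1 folds, test on the remaining fold, and repeat
fold_mse <- sapply(1:k, function(i) {
  train <- cars[folds != i, ]
  test  <- cars[folds == i, ]
  fit   <- lm(dist ~ speed, data = train)
  mean((test$dist - predict(fit, newdata = test))^2)
})

# The average across folds estimates performance on unseen data
cv_mse <- mean(fold_mse)
print(cv_mse)
```

Because every row is used for both training and testing, this makes good use of a small dataset.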

Reduce model complexity.
Use regularization.
Increase model complexity.
Reduce the number of features in the training data.

3. What is the difference between Ridge and Lasso regression?

Ridge regression penalizes the sum of the absolute values of the coefficients while Lasso regression penalizes the sum of squared coefficients.
There is no major difference between Ridge and Lasso regression.
Lasso regression penalizes the sum of the absolute values of the coefficients while Ridge regression penalizes the sum of squared coefficients.
Lasso regression increases or decreases the value of Lambda to penalize complex models more or less.
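
The two penalty terms can be computed directly for comparison. The coefficient values and `lambda` below are invented, and real implementations add the penalty to the model's loss during fitting:

```r
# Hypothetical model coefficients (intercept excluded) and penalty strength
coefficients <- c(3, -2, 0.5)
lambda <- 0.1

# Lasso (L1): lambda times the sum of absolute coefficient values
lasso_penalty <- lambda * sum(abs(coefficients))

# Ridge (L2): lambda times the sum of squared coefficients
ridge_penalty <- lambda * sum(coefficients^2)

print(c(lasso = lasso_penalty, ridge = ridge_penalty))
```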

4. Which tidymodels function do you use to create the grid for a grid search?

tune()
grid_regular()
tune_grid()
add_model()
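
A minimal sketch of building a tuning grid, assuming the dials package from tidymodels is installed. Tuning `penalty()` with five levels is an arbitrary choice for illustration:

```r
library(dials)

# grid_regular() builds an evenly spaced grid of candidate values
# for each tuning parameter it is given
lambda_grid <- grid_regular(penalty(), levels = 5)
print(lambda_grid)
```

The resulting table can then be handed to a tuning function such as `tune_grid()`, which evaluates each candidate value.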
