2.4.2 Module 2 Quiz – Getting started with Data Gathering and Investigation Exam Answers Full 100%  Data Analytics Essentials 2023

What term describes the key characteristics that are observed or measured as part of an experiment or analysis?
 variable
 data set
 statistics
 population
Answers Explanation & Hints: The term that describes the key characteristics that are observed or measured as part of an experiment or analysis is “variable”. A variable is any characteristic or factor that can be measured or controlled in a study or experiment. Examples of variables include age, gender, height, weight, income, test scores, and many others. In data analysis, variables can be classified as either independent or dependent. The independent variable is the one that is being manipulated or controlled by the researcher, while the dependent variable is the one that is being measured or observed as the outcome or effect of the independent variable.

Match the variable with the description.
 nominal ==> qualitative values based on identity of the object
 ordinal ==> qualitative values in order of ranking
 discrete ==> quantitative and finite set of values
 continuous ==> quantitative values with infinite range

Answers Explanation & Hints:  Nominal variable: This is a type of variable that describes data that is qualitative or categorical, meaning it is based on the identity of an object. For example, colors, gender, or type of product are all examples of nominal variables. They do not have an inherent order or ranking.
 Ordinal variable: This is a type of variable that describes data that is also qualitative or categorical, but the values are ordered in some way. For example, letter grades such as A, B, C, D, or E, or customer satisfaction ratings such as “very satisfied,” “somewhat satisfied,” “neutral,” “somewhat dissatisfied,” or “very dissatisfied” are examples of ordinal variables.
 Discrete variable: This is a type of variable that describes data that is quantitative and has a finite set of values. Discrete variables usually represent counts of items, such as the number of students in a classroom, the number of cars in a parking lot, or the number of items sold by a business.
 Continuous variable: This is a type of variable that describes data that is also quantitative but has an infinite range of possible values. Continuous variables are usually measurements of some kind, such as height, weight, or temperature. They can take on any value within a range, and they are often represented on a scale with decimal points.

What category of variables includes nominal or ordinal variables?
 numerical
 categorical
 ratio
 interval

Answers Explanation & Hints: Categorical variables include nominal and ordinal variables. Nominal variables represent categories without any inherent order or ranking, while ordinal variables represent categories that have a natural order or ranking. These categories cannot be meaningfully measured in terms of magnitude or distance. On the other hand, numerical variables include interval and ratio variables that are measurable in terms of magnitude or distance between values. Interval variables have equal intervals between values, but do not have a true zero point, while ratio variables have both equal intervals and a true zero point.

Which statement is an accurate description of discrete variables?
 Discrete variables are qualitative and consist of two or more categories in which order matters.
 They are quantitative with a continuous range of values.
 They are quantitative with a finite set of values.
 They are qualitative and consist of two or more categories of values in which order does not matter.

Answers Explanation & Hints: The accurate statement about discrete variables is: They are quantitative with a finite set of values. Discrete variables are numeric variables that can only take on a countable number of values within a certain range. These values are often integers or whole numbers, and the variable cannot take on any values between these set points.

Match the variable with the type of value.
 continuous ==> sales volume
 discrete ==> number of users
 ordinal ==> student class rank
 nominal ==> eye color

Answers Explanation & Hints:  Nominal variables: These are categorical variables that don’t have any inherent order or ranking. The values are used to represent different categories or groups. In the given options, “eye color” is a nominal variable because the different colors don’t have any inherent order or ranking.
 Ordinal variables: These are categorical variables where the values can be ranked in a specific order. In the given options, “student class rank” is an ordinal variable because the rankings can be ordered from highest to lowest.
 Discrete variables: These are numerical variables that can only take specific values, typically whole numbers. In the given options, “number of users” is a discrete variable because it can only take certain whole number values, such as 0, 1, 2, 3, etc.
 Continuous variables: These are numerical variables that can take any value within a certain range. In the given options, “sales volume” is a continuous variable because it can take any value within a certain range (e.g., $0.00 to infinity).

Which statement describes SQL?
 It is a visual analytics platform for exploring, analyzing, and managing data.
 It is a scripting language for creating dynamic web pages based on social media.
 It is a Google opensource product suitable for small datasets and quick data analysis.
 It is a powerful data management language that allows data analysts to interact with data stored in relational databases.

Answers Explanation & Hints: The statement “It is a powerful data management language that allows data analysts to interact with data stored in relational databases.” describes SQL. SQL stands for Structured Query Language, and it is a domainspecific programming language that is designed for managing and manipulating structured data in relational database management systems (RDBMS). It is commonly used by data analysts and database developers to retrieve, manipulate, and analyze data stored in relational databases. SQL is a standard language used in the majority of commercial database systems, and it supports a wide range of operations such as creating tables, inserting, updating and deleting data, and querying and filtering data based on specific criteria.

Which statement describes Tableau?
 It is an AWS product suitable for small datasets and quick data analysis.
 It is a visual analytics platform for exploring, analyzing, and managing data.
 It is a scripting language for creating dynamic web pages based on data sets.
 It is an online platform for data professional to share projects about big data collection and analysis.

Answers Explanation & Hints: Tableau is a powerful data visualization and business intelligence software that allows users to easily connect, visualize, and share data in a way that is both intuitive and interactive. With Tableau, users can quickly create dynamic and visually appealing charts, graphs, and other visualizations that help them better understand their data and make informed decisions. It also provides features for data blending, data analysis, and data management, allowing users to combine data from different sources, conduct statistical analyses, and clean and transform data. Tableau is widely used in a variety of industries, including finance, healthcare, retail, and more, and is popular for its userfriendly interface and ability to create complex data visualizations with ease.

Which two statements describe the Kaggle web site? (Choose two.)
 It provides many data sets with minimum charges.
 It provides many coding samples for creating dynamic advertisement products.
 It offers community machine learning competitions among data science enthusiasts.
 It enables data analysis tools integration such as Excel, SQL database, and Tableau.
 It is an online community for data scientists and machine learning enthusiasts to share ideas.

Answers Explanation & Hints: The correct statements are:
 It offers community machine learning competitions among data science enthusiasts.
 It is an online community for data scientists and machine learning enthusiasts to share ideas.
Kaggle is a platform for data science competitions and collaboration, where data scientists and machine learning enthusiasts can participate in challenges to solve realworld problems and share their solutions and ideas with the community. It also provides a variety of datasets and tools for data exploration and analysis. However, it does not provide coding samples for creating dynamic advertisement products or enable data analysis tools integration such as Excel, SQL database, and Tableau.

A data analyst is performing data cleaning on an Excel spreadsheet that contains collected data. Which Excel function would the analyst use to find integer data that is larger than expected?
 MIN
 MAX
 Trim
 Sort

Answers Explanation & Hints: The Excel function that could be used to find integer data that is larger than expected is MAX. The MAX function returns the highest value in a range of cells or an array, and can be used to identify outliers or unexpected values in a data set. In this case, the data analyst would use the MAX function to identify the largest integer value in the column of data and compare it to the expected range of values. If the largest value is larger than expected, it may indicate an error or outlier that needs to be investigated further.

A data analyst is performing data cleaning on an Excel spreadsheet that contains collected data. The data records have been marked with mixed formatting. The analyst wants to remove all formatting and reformat data. Which operation should the analyst perform to remove current formatting?
 Highlight data range > Home tab > Clear > Clear Formats
 Home tab > Clear Formats > Highlight data range > Clear All
 Highlight data range > Home tab > Clear Formats > Clear All
 Home tab > Clear Formats > Clear All > Highlight data range

Answers Explanation & Hints: This will clear the formatting of the highlighted data range, removing any mixed formatting that may exist. The analyst can then apply the desired formatting to the data.