Detection and treatment of outliers.

Develop a better understanding of the
HBAT dataset: specifically, explore the characteristics of its customers and the
relationship between their perception of HBAT and their actions towards HBAT.
Make sure to address all of the following questions.
1. [2.5 points]. Run a thorough univariate and bivariate graphical and statistical
examination of your data. Do you notice any irregularities in your data? What does your
data look like? Normal, skewed?
Tips. Make sure to number and label each table and graph (e.g., Table 1. Summary
statistics for the HBAT missing dataset).
Provide a title and a detailed interpretation of each chart.
Remember the rule of thumb: for any table and/or graph, you can have at most 3
genuine findings. If you have more, you have probably made them up.
2. [2.0 points]. Missing values analysis. Do you have any missing values in your data? If
yes, determine the extent of missing values per variable and case. Are there any
variables/cases that you need to delete? Use ≥ 30% of missing values as the threshold
for deletion.
a. After deleting variables and/or cases with ≥ 30% missing data, compute
summary statistics for your data. Do you still observe any missing values? If yes,
decide on how to impute these missing values. Limit your imputation
technique to mean or median substitution. Justify your choice.
3. [2.0 points]. Detection and treatment of outliers.
Math and Statistics for Analytics Nov 2020
Are there any univariate outliers in your dataset? Use both Tukey's fences and the
z-score approach (with the z threshold set at 2.5, since you have a small sample size) to
identify them. Do you notice any discrepancy between the two methods? Explain.
How many values were detected as outliers? Will you keep these outliers or delete
them? Justify your decision. Discuss the impact of your decision on the remaining data
analysis.
4. [2.5 points]. After treating missing values and outliers, compute summary statistics for
your data. Compare and contrast your results with question 1. Develop two
hypothetical questions that you can answer using graphical and/or empirical
techniques. Provide correct answers.
5. [1.0 point]. If you did not treat missing values and/or outliers, what
would be the impact on subsequent data analysis?
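The "normal or skewed?" check in question 1 can be backed by a number as well as a histogram. The sketch below is illustrative only: it uses made-up values rather than the HBAT data, and the function name `sample_skewness` is a hypothetical helper. It applies the adjusted Fisher-Pearson formula, which is the one Excel's SKEW function uses.

```python
import statistics


def sample_skewness(values):
    """Adjusted Fisher-Pearson sample skewness (the formula behind Excel's SKEW)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 observations")
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (divides by n - 1)
    cubed = sum(((x - mean) / sd) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * cubed


# Made-up illustrative values, not taken from the HBAT file.
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
right_skewed = [1.0, 1.0, 2.0, 2.0, 3.0, 12.0]

print(sample_skewness(symmetric))     # zero for perfectly symmetric data
print(sample_skewness(right_skewed))  # positive: long right tail
```

A value near zero is consistent with a symmetric distribution, a positive value points to a right tail, and a negative value to a left tail. It complements the charts rather than replacing them.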
Dataset: HBAT Industries Dataset – HBAT missing. (You can access the dataset in Additional
Documentation / Assignment 1)
Context: HBAT is a manufacturer of paper products. The dataset is hypothetical, based on surveys of
HBAT customers completed on a secure website managed by an established marketing
research company.
Sample size: 70 observations on 14 separate variables based on a market segmentation study
of HBAT customers in the newsprint industry and the magazine industry.
Categories of data:
• Numerical variables: V1 to V9.
• Categorical variables: V10 to V14.
Additional information related to the variables is available in the Excel file (HBAT missing) in
the Metadata spreadsheet.
Prerequisite: before working on this assignment, you need to watch the video series Data
examination – Excel (Campus Online / Additional Documentation / Module 2 / Recorded
Videos)
For detecting outliers, you are already familiar with the boxplot method and Tukey's
fences (video). You can also use the z-score approach: to calculate the z value for each
observation, use Excel's built-in STANDARDIZE function.
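The two detection rules can be compared side by side outside Excel. The following is a minimal sketch on made-up values, not the HBAT data. It rests on some assumptions: the conventional fence multiplier k = 1.5, quartiles computed with the inclusive method (the interpolation used by Excel's QUARTILE.INC), and the sample standard deviation as the divisor for the z-score. The function names are hypothetical.

```python
import statistics


def tukey_outliers(values, k=1.5):
    """Values outside [Q1 - k*IQR, Q3 + k*IQR]; k = 1.5 is the usual convention."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [x for x in values if x < lower or x > upper]


def z_score_outliers(values, threshold=2.5):
    """Values whose |z| exceeds the threshold; z = (x - mean) / sample sd."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # assumption: sample (n - 1) standard deviation
    return [x for x in values if abs((x - mean) / sd) > threshold]


# Made-up illustrative values, not taken from the HBAT file.
data = [4.0, 4.5, 5.0, 5.0, 5.5, 6.0, 6.5, 7.0, 12.0, 13.0]

print("Tukey's fences:", tukey_outliers(data))
print("z-score > 2.5 :", z_score_outliers(data))
```

On these values the fences flag 12.0 and 13.0, while no observation has a z-score above 2.5. The extreme values inflate the mean and the standard deviation that the z-score relies on, whereas quartiles are barely affected. This is one common source of disagreement between the two methods.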
For missing value analysis, you can use the COUNT function, which counts the cells in a
range that contain numbers, to determine the extent of missing
values per case and per variable.
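The same bookkeeping (share of missing cells per variable and per case, the 30% deletion rule, then mean or median substitution) can be sketched in a few lines. The rows below are made up and much smaller than the HBAT file, `None` stands in for an empty cell, and all function names are hypothetical. Dropping sparse variables before sparse cases is one possible order, not one prescribed by the assignment.

```python
import statistics

THRESHOLD = 0.30  # deletion threshold stated in the assignment (>= 30% missing)


def share_missing_by_variable(rows):
    """Fraction of missing cells (None) in each column."""
    return {name: sum(r[name] is None for r in rows) / len(rows) for name in rows[0]}


def share_missing_by_case(rows):
    """Fraction of missing cells (None) in each row."""
    return [sum(v is None for v in r.values()) / len(r) for r in rows]


def drop_sparse(rows, threshold=THRESHOLD):
    """Drop sparse variables first, then sparse cases (an assumed order)."""
    by_var = share_missing_by_variable(rows)
    keep = [name for name, share in by_var.items() if share < threshold]
    rows = [{name: r[name] for name in keep} for r in rows]
    return [r for r, share in zip(rows, share_missing_by_case(rows)) if share < threshold]


def impute(rows, name, how="median"):
    """Fill missing cells of one variable with its observed median or mean."""
    observed = [r[name] for r in rows if r[name] is not None]
    fill = statistics.median(observed) if how == "median" else statistics.mean(observed)
    return [{**r, name: fill if r[name] is None else r[name]} for r in rows]


# Made-up illustrative rows, not taken from the HBAT file.
names = ["V1", "V2", "V3", "V4", "V5"]
raw = [
    [3.1, None, 7.0, 1.0, 5.0],
    [None, None, None, 2.0, 4.0],
    [4.0, 5.2, 6.0, 1.5, 6.0],
    [None, None, 9.5, 2.5, 5.5],
    [3.6, 4.8, 6.5, 3.0, 4.5],
    [3.3, None, 6.2, 2.0, 5.0],
    [3.9, None, 6.8, 1.8, 5.2],
]
rows = [dict(zip(names, values)) for values in raw]

print(share_missing_by_variable(rows))
cleaned = impute(drop_sparse(rows), "V1", how="median")
print(cleaned)
```

The median is used here because it is less sensitive to extreme values than the mean. For a roughly symmetric variable the two choices give similar fills.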
Additional resources:
Introduction to data analysis in Excel: https://www.youtube.com/watch?v=Rs4082ewxgA
Introduction to Pivot tables: https://www.youtube.com/watch?v=9NUjHBNWe9M

Sample Solution

have instated a Communist regime, was widely spread and, as Folch-Serra argues, ‘systematically enforced through schools and textbooks, the pulpit, the Fascist institutions and the media’ (p. 228). There was heavy censorship of news that could have challenged this image, which Folch-Serra shows was ‘illustrated by the Spanish media’s disregard of the Nobel prizes awarded to Juan Ramón Jiménez for literature in 1956 and Severo Ochoa for science in 1959’ (p. 229). This leads on to the contradictory nature of Franco’s treatment of the Republicans: as well as spreading defamatory comments about their nature, there was also, as Folch-Serra explains, a ‘suppression of information about their fate and whereabouts’ (p. 229) which drew from a ‘deliberate policy of oblivion and silence’ (p. 229). By winning the Civil War, Franco also won the fortune of being able to rewrite history and, as Folch-Serra confirms, he was able to ‘concoct a uniform image of the defeated as one and the same’ (p. 227). Amongst other forms of propaganda, education allowed Franco to disseminate his version of events as truth. This can be seen in school textbooks which, as Xavier Laudo elaborates, ‘spoke of the desertion of Republican soldiers’ and presented Republican Spain as the ‘enemy within’ (p. 442) who were ‘responsible for the erosion of the nation’s Christian faith’ (p. 442). Assmann further shows how this ‘one sided version of history’ (p. 64) not only ‘protected’ (p. 65) and legitimised Franco, but also ‘prolonged the enemy stereotype of the murdered communists and democrats’ (p. 65). Thus, it can be seen that Franco manipulated the memory of the Civil War during his dictatorship and that his policies towards the Republicans after the war allowed him to promote his narrative as the truth and legitimise his position. The collective amnesia that Franco wanted to induce discredited his opponents and erased them from history.
However, Assmann adds that this ‘silence did not dissolve the memory of the traumatic past’ (p. 66) and did not fully discredit his opponents, as individual memories of the events were ‘materially preserved in the earth and in families’ (p. 66). Memory also featured heavily in Franco’s propaganda, with many references made to returning Spain to the greatness it had once experienced. Franco’s message regarding the Republicans was spread through education and, as Laudo explains, so was the image of the Civil War as a ‘crusade’ (p. 438) like those of the Middle Ages. Zheng Wang describes how school textbooks can be used as ‘instruments for glorifying the nation, consolidating its national identity and justifying particular forms of social and political systems’ and how the rewriting of school textbooks can be used to ‘legitimise the new regime’ (p. 45). This is evident on the front cover of El Libro de España, which features a boat sailing across the globe, against the backdrop of the Spanish flag. This reminds the viewer of the Spanish Empire, as Laudo confirms, ‘stressing the cross-Atlantic colonialist adventures in the Americas’ (p. 443), and the power and glory that this brought, ‘promoting a spirit of patriotism’ (p. 443). Through this, Laudo explains that Franco was able to propagate his ‘vision of Spain’s history, its Hispanic mission for imperial glory’ (p. 453). Religious references were frequently seen in Franco’s propaganda, and comparisons were made to the Catholic monarchs and the unity and greatness Spain experienced under them. Miriam
