Tidy data principles.


Step 1. Find a messy data set on the Internet. The data must violate at least one of the tidy data principles.

Any data format (csv, txt, excel, …) is okay, as long as you can read it into R.

However, you need to convert and save the data as a csv file and include it in your submission.
Each student must use a unique data set; no two students can use the same data. Once you find your data, double-check that another student is not already using it, then send me the data source link. I need the link before you start.

Wait for my confirmation before you proceed.

Step 2. Use R Markdown to achieve the following:

1. Specify author, date, and title in the YAML metadata of your document

2. Describe the data source, background, characteristics, variables, etc.

3. Load the data into R. Depending on the data format, you need an appropriate way to import the data.

4. Show and explain why the data is not tidy. Don’t use data that is already tidy.

5. Tidy up the data using dplyr and/or tidyr

6. Explain why the data is tidy now

7. Create two different and meaningful data visualizations from the tidy data using ggplot2. (It is not enough to just change one variable on an axis.)

8. Identify the patterns in each plot and explain why they are meaningful
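As a rough illustration of items 3 to 7, here is a minimal sketch in R. The data frame `messy` and its columns (`country`, `2019`, `2020`, `value`) are hypothetical toy values invented for this example, not a real data set. It assumes the untidy problem is that values of a variable (the year) are stored as column headers; your own data may violate a different principle and need a different fix.

```r
library(tidyr)
library(dplyr)
library(ggplot2)

# Hypothetical messy data: the years are column headers, not values
messy <- data.frame(
  country = c("A", "B"),
  `2019` = c(10, 20),
  `2020` = c(12, 18),
  check.names = FALSE
)

# Gather the year columns so that each row is one country-year observation
tidy <- messy %>%
  pivot_longer(
    cols = -country,
    names_to = "year",
    values_to = "value"
  ) %>%
  mutate(year = as.integer(year))

# One possible plot: the value over time, one line per country
p <- ggplot(tidy, aes(x = year, y = value, colour = country)) +
  geom_line() +
  geom_point()
```

After this step each variable (`country`, `year`, `value`) has its own column and each observation has its own row, which is the kind of argument item 6 asks you to make in prose.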

Resources to learn R Markdown:

https://r4ds.had.co.nz/r-markdown.html
https://rmarkdown.rstudio.com/lesson-1.html
http://www.rstudio.com/wp-content/uploads/2016/03/rmarkdown-cheatsheet-2.0.pdf
R Markdown: The Definitive Guide by Yihui Xie, J.J. Allaire, and Garrett Grolemund

Here are some additional notes about writing an R Markdown report. Violating these rules may lead to a lower grade.

1. Put the data in the same folder as your Rmd file. Whenever we run/knit an RMarkdown file, it uses the folder with the Rmd file as the working directory.
2. Read the data in your Rmd code chunk using a relative path. If you use an absolute path, I will not be able to knit the Rmd file to an HTML file on my end.
3. You will lose 5 points if, for any reason (input path, error in code, etc.), the Rmd file cannot be knitted to an HTML file.
4. Distinguish headings (## heading) and normal text. We should not put all the text in headings.
5. Do not print excessive data in your RMarkdown report. Use kable to format tables, if you prefer.
6. Do not put your discussions/explanations in code chunks. Write them as normal text.
7. Do not use include=FALSE or echo=FALSE in your code chunks. I need to read your code. You may use message=FALSE and warning=FALSE to suppress messages/warnings.
8. Do not write an excessively long line of code. Break it into multiple lines to improve readability.
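The notes on relative paths and limited output might look like the sketch below inside a code chunk. The file name `example-data.csv` and its columns are hypothetical; the chunk first writes a tiny file only so that the example runs on its own, which you would not do with real data. The `kable` call is guarded in case knitr is not installed.

```r
# Create a small hypothetical csv in the working directory
# (only so this sketch is runnable; your real csv already sits next to the Rmd)
write.csv(
  data.frame(id = 1:3, score = c(90, 85, 77)),
  "example-data.csv",
  row.names = FALSE
)

# Relative path: resolved against the folder that contains the Rmd file
dat <- read.csv("example-data.csv")

# Show only the first few rows instead of printing the whole data set
preview <- head(dat, 5)
if (requireNamespace("knitr", quietly = TRUE)) {
  print(knitr::kable(preview))
} else {
  print(preview)
}

# Remove the example file again
unlink("example-data.csv")
```

A path such as `C:/Users/...` would only work on the machine where it was written, which is why the relative form is required.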

Step 3. Knit the R Markdown file (.Rmd) to an HTML file

Step 4. The Rmd, HTML, and csv files must follow this naming rule:

Assignment1-YourLastName-Title With Six Words Or Less.FileExtension

For example:
Assignment1-Lin-Twitter Data Wrangling and Visualization.Rmd

Assignment1-Lin-Twitter Data Wrangling and Visualization.html

Assignment1-Lin-Twitter Data Wrangling and Visualization.csv

Step 5. If the csv file is larger than 5MB, remove some rows so that the file size is 5MB or less.
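One way to do this trimming in R is sketched below. The helper `trim_csv_to_size` is a hypothetical function written for this illustration, not part of any package, and it assumes that 5MB means 5 * 1024^2 bytes and that keeping the first rows is acceptable for your data. It rewrites the file with fewer rows until the size on disk is within the limit.

```r
# Hypothetical helper: write df to path, dropping trailing rows until the
# file is no larger than max_bytes (assumed here to be 5 * 1024^2 bytes)
trim_csv_to_size <- function(df, path, max_bytes = 5 * 1024^2) {
  write.csv(df, path, row.names = FALSE)
  while (file.size(path) > max_bytes && nrow(df) > 1) {
    # Estimate how many rows fit, with a small safety margin
    keep <- floor(nrow(df) * max_bytes / file.size(path) * 0.95)
    df <- df[seq_len(max(keep, 1)), , drop = FALSE]
    write.csv(df, path, row.names = FALSE)
  }
  invisible(df)
}

# Usage with toy data and a deliberately small limit
toy <- data.frame(id = 1:1000, label = rep("some text value", 1000))
out_path <- tempfile(fileext = ".csv")
trimmed <- trim_csv_to_size(toy, out_path, max_bytes = 2000)
```

If the row order carries meaning (for example, a time series), a random sample of rows with `dplyr::slice_sample()` may be a better choice than keeping the first rows.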

Step 6. Submit the three files (individually)

 

 

