QUESTION 1
Suppose that you are employed as a data mining consultant for an Internet search engine company. Describe how data mining can help the company by giving specific examples of how techniques such as clustering, classification, association rule mining, and anomaly detection can be applied.
Answer:
QUESTION 2
Identify at least two advantages and two disadvantages of using color to visually represent information.
Answer:
QUESTION 3
Consider a group of documents that has been selected from a much larger set of diverse documents so that the selected documents are as dissimilar from one another as possible. If we consider documents that are not highly related (connected, similar) to one another as being anomalous, then all of the documents that we have selected might be classified as anomalies. Is it possible for a data set to consist only of anomalous objects or is this an abuse of the terminology?
Answer:
QUESTION 4
(a) Is there a difference between the two sets of points? Please explain.
(b) If so, which set of points will typically have a smaller SSE for K=10 clusters?
(c) What will be the behavior of DBSCAN on the uniform data set?
Answer:
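One way to explore parts (a) and (b) empirically is sketched below. It assumes the two sets are a 10-by-10 evenly spaced grid and 100 points drawn uniformly at random from the unit square. Those sizes, the seeds, and the helper name kmeans_sse are illustrative choices, not part of the question. The sketch uses a plain Lloyd-style K-means built on the standard library only. A single run can settle in a local minimum, so comparing several seeds is more informative than trusting one pair of numbers.

```python
import random


def kmeans_sse(points, k, seed=0, max_iters=100):
    """Run basic Lloyd-style K-means on 2-D points and return the final SSE.

    Hypothetical helper for illustration; initial centroids are k distinct
    data points chosen with a seeded random generator.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)

    def sq_dist(p, c):
        return (p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2

    for _ in range(max_iters):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            clusters[nearest].append(p)

        # Update step: move each centroid to the mean of its members.
        # An empty cluster keeps its previous centroid.
        updated = []
        for j, members in enumerate(clusters):
            if members:
                updated.append((
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                ))
            else:
                updated.append(centroids[j])

        if updated == centroids:
            break
        centroids = updated

    # SSE: squared distance from each point to its closest centroid.
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)


# Assumed set 1: 100 evenly spaced points (cell centers of a 10 x 10 grid).
grid = [((i + 0.5) / 10, (j + 0.5) / 10) for i in range(10) for j in range(10)]

# Assumed set 2: 100 points sampled uniformly at random from the unit square.
rng = random.Random(42)
random_points = [(rng.random(), rng.random()) for _ in range(100)]

grid_sse = kmeans_sse(grid, k=10)
random_sse = kmeans_sse(random_points, k=10)
print(f"evenly spaced SSE: {grid_sse:.4f}")
print(f"random sample SSE: {random_sse:.4f}")
```

Rerunning with different seed values for both the sampling and the centroid initialization shows how much the comparison varies from run to run.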
QUESTION 5
Give an example of a data set consisting of three natural clusters for which K-means would almost always find the correct clusters, but bisecting K-means would not.
Answer:
loose from his gaping jaws. As the fish slid out of the heavy-duty net and onto the weather-stained deck, I counted seven spots, the most of the day. With the 10-horsepower trolling motor propelling the boat slowly, just above the years of built-up mud and oyster shells, we monitored the narrow channels, imprisoned by miles of alligator-infested reeds, for rosy, translucent tails trolling five feet off the shore.
Before heading back to Dothan after my family’s last trip to New Orleans this past Christmas, we stopped for brunch at Atchafalaya, the only restaurant in Nola with five A’s. Known as one of the top 10 brunch restaurants in the country, Atchafalaya is famous for its chicken and biscuits, so obviously I ordered them, along with a cup of turtle and alligator gumbo. These chicken and biscuits may seem simple, but they aren’t even in the same category as the generic Hardee’s chicken biscuit. Two homemade buttermilk biscuits, topped with two whole fried chicken breasts and doused in gravy, along with the gumbo that sounded like roadkill but tasted like Heaven, held up as the perfect last Nola meal before the five-hour trip back to Dothan.
The restaurant is named after the Atchafalaya Swamp, where the Atchafalaya River and Gulf of Mexico converge to form the largest swamp in the United States. This swamp is the only growing delta system left in Louisiana, with wetlands that are almost stable. Making up more than 35% of the Mississippi River Delta, it is larger than the Florida Everglades. With over 500 different species of wildlife, 22 million