Data Analysis

 

 

The provided dataset contains 569 instances. Each instance has 30 features
computed from a digitized image of a fine needle aspirate (FNA) of a breast mass; the features
describe the characteristics of the cell nuclei present in the image (UCI dataset). Each instance is
classified as malignant or benign.
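As a quick sanity check on the description above, the following minimal sketch loads the data and prints its shape and class labels. It assumes that scikit-learn's bundled `load_breast_cancer` copy of the UCI breast cancer data can stand in for the provided file; if the assignment supplies its own file, the loading step would differ.

```python
from sklearn.datasets import load_breast_cancer

# Assumption: scikit-learn's bundled copy stands in for the provided dataset.
data = load_breast_cancer()
X, y = data.data, data.target

print(X.shape)                              # (instances, features)
print([str(n) for n in data.target_names])  # class labels
```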
Name your Jupyter notebook Assignment2.
Split the dataset into 70% training and 30% test sets and carry out the following experiments. Report
the test-set accuracy as the performance measure for all of the following tasks.
1. Use “from sklearn.tree import DecisionTreeClassifier”.
a) Train a DT classifier with entropy (C1) and with Gini (C2) and compare their performance.
b) Visualize C1 and C2 using the “graphviz” library.
c) Prune C1 and C2 by limiting the depth and compare their performance with the
unpruned versions.
d) Use depths 1,…,20 and plot the performance for C1 and C2 separately.
e) Choose the best value for depth and visualize C1 and C2.
2. Use “from sklearn.ensemble import RandomForestClassifier”.
a) Train an RF classifier with 10 estimators and compare the performance for the test
set with C1.
b) Vary the number of estimators over 10, 50, 100, 500, and 1000, and plot the
performance.
c) Perform 5-fold cross-validation and report the performance for an RF classifier with 50
estimators.
d) Plot the feature importance for an RF with 200 estimators using the mean decrease in
impurity and also using feature permutation, and explain the plots.
3. Use “from sklearn.ensemble import AdaBoostClassifier”
a) Train a classifier with 10 estimators and compare the performance with C1 and RF in
2a.
b) Vary the number of estimators over 10, 50, 100, 500, and 1000, and plot the
performance.
c) Perform 5-fold cross-validation and report the performance for a classifier with 50
estimators.
4. Use “from sklearn.naive_bayes import GaussianNB”
a) Train a classifier and compare its test-set performance with C1, the RF in 2a, and the
AdaBoost classifier in 3a.
5. Use PCA and print the cumulative proportion of variance explained. Using the cumulative
proportion, keep only the components that together account for more than 95% (the ratio of
variance to keep) of the total variation associated with all the original variables.
a) Train an RF classifier with 100 estimators using the dataset with reduced features and
compare the performance with RF with 100 estimators using all the features.

Sample Solution

came to power in 1979 and represented, for many, laissez-faire economics and individual self-determination (Steele, 2018). She believed in the power of the market, using it to restore the stagnant British economy and moving away from state-provided services. In 1979, cuts reduced the standard rate of tax from 33% to 30% and the top rate from 83% to 60%, and public spending was cut by 3% (Bolick, 1995). She also reduced public spending from 50% to 43%. Thatcher felt high taxes discouraged the incentive to work; however, the tax cuts increased income inequality, as ‘the top 10%- did far better, with their incomes increasing from the equivalent of £472.98 in 1979 to £694.83 in 1990’. The uneven distribution of wealth saw the poorest families receive the least. Reductions in public expenditure affected health, education and social services, which created a knock-on effect: substantial losses of public sector jobs resulted in decreased spending on goods and services. Privatisation became Thatcher’s most important and long-lasting legacy. She revealed in her memoirs that it was crucial for ‘reversing the corrosive and corrupting effects of socialism’ (Parker). In the 1980s and 1990s, fiscal pressures, Thatcher’s conservative views on private ownership and public discontent with the existing regime led to the privatisation of publicly owned entities. Examples include the sale of just ‘over 50% of shares in BT and the sale of British Energy in 1996’ (Berrington, 1998). Other privatised industries included electricity, gas, British Steel, public bus transportation and other public services. As a result, workforces declined, as ‘employment in the electricity and gas industries was cut in half’ (Edwards, 2017), and problems arose in regulating private monopolies to prevent abuse of power; however, privatisation brought ‘economic growth and improved living standards as privatised businesses cut costs, increased service quality’ (Edwards, 2017).
Thatcher can be seen as the key instigator of the sweeping shift from traditional public management to ‘New Public Management’ (NPM) initiated by public service reforms. NPM involved the adoption of private sector management ideas to improve structures and processes in the public sector. Thatcher led the 1980s ‘New Right’ administrations that put a ‘shrinking government and reduced taxation on the agenda’ (Ferlie, 2017). She also wanted to remove ‘inefficiency in the state bureaucracy and the deprivilege of the civil service’, as she concluded that the public sector was ‘wasteful, overbureaucratic and underperforming’ (Ferlie et al., 1996). Thatcher wanted to identify areas of waste and inefficiency in the government and ‘improve service quality and customer-orientated service’ (Pollitt, 1996) whilst reducin
