Voted Perceptron

Problem 1. Voted Perceptron (50 points)

For this assignment, we publish a CodaLab competition. The Perceptron algorithm is one of the classic algorithms in machine learning and has been in use since the early 1960s. It is an online learning algorithm for learning a linear threshold function, which works as a classifier. In this assignment, we will apply the Voted Perceptron algorithm to a classification task.

Let $D = \{(x_i, t_i)\}_{i=1}^{m}$ be the training data, where $x_i$ are the feature vectors and $t_i \in \{-1, +1\}$ are the corresponding labels. The goal of the Perceptron algorithm is to find a vector $w$ that defines a linear function such that $\forall i,\ t_i (w \cdot x_i) > 0$. For the Voted Perceptron, you need a list of weighted vectors $\{(w_k, c_k)\}_{k=1}^{K}$, where $w_k$ is a vector and $c_k$ is its weight. (Please refer to https://cseweb.ucsd.edu/~yfreund/papers/LargeMarginsUsingPerceptron.pdf for a clear description of the voted perceptron.) You can then use all these vectors and their weights to do the classification: the predicted label $\hat{y}$ for any feature vector $x$ is

$$\hat{y} = \operatorname{sign}\left(\sum_{k=1}^{K} c_k \operatorname{sign}(w_k \cdot x)\right)$$

The training procedure is:

    Initialize k = 1, c_1 = 0, w_1 = 0, t = 0;
    while t <= T do
        for each training example (x_i, t_i) do
            if t_i (w_k · x_i) <= 0 then
                w_{k+1} = w_k + t_i x_i;
                c_{k+1} = 1;
                k = k + 1
            else
                c_k += 1;
            end
        end
        t = t + 1;
    end

Dataset

For this problem, you can download the training data. It contains two CSV files:

- Xtrain.csv: Each row is a feature vector. The value in the i-th column is the integer value of the i-th dimension.
- Ytrain.csv: This CSV file provides the binary labels for the corresponding feature vectors in Xtrain.csv.

Submission format

The final submission format should be

- submission.zip
  - run.py
  - other python scripts you wrote

Evaluation Criteria

The final score is a weighted combination of accuracy and F1 score:

    final_score = 50 × accuracy + 50 × F_score

The competition will cover 80% for this problem.
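As an illustration of the Voted Perceptron training loop and prediction rule from Problem 1, here is a minimal pure-Python sketch. It is not a reference solution: the function names, the `epochs` parameter (the number of passes over the data, standing in for the loop bound T), the convention sign(0) = +1, and the toy data are all assumptions made for this example.

```python
def sign(value):
    """Return +1 for non-negative input, -1 otherwise (sign(0) = +1 is an assumption)."""
    return 1 if value >= 0 else -1


def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def train_voted_perceptron(X, t, epochs):
    """Return the list of (w_k, c_k) pairs built over `epochs` passes through the data."""
    w = [0.0] * len(X[0])   # w_1 = 0
    c = 0                   # c_1 = 0
    vectors = []
    for _ in range(epochs):
        for x_i, t_i in zip(X, t):
            if t_i * dot(w, x_i) <= 0:
                # Mistake: store the current vector with its count, then update.
                vectors.append((w, c))
                w = [w_j + t_i * x_j for w_j, x_j in zip(w, x_i)]
                c = 1
            else:
                # Correct prediction: the current vector survives one more example.
                c += 1
    vectors.append((w, c))  # keep the last vector as well
    return vectors


def predict(vectors, x):
    """Weighted vote: sign(sum_k c_k * sign(w_k . x))."""
    return sign(sum(c * sign(dot(w, x)) for w, c in vectors))


# Hypothetical, linearly separable toy data (not from the assignment).
X_toy = [[1, 1], [2, 1], [-1, -1], [-2, -1]]
t_toy = [1, 1, -1, -1]
model = train_voted_perceptron(X_toy, t_toy, epochs=5)
predictions = [predict(model, x) for x in X_toy]
```

Because each processed example adds exactly one to some count, the counts c_k always sum to the number of examples seen, which is a handy sanity check when debugging.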
Note that to create a valid submission, please use the command

    zip -r submission.zip run.py [other python scripts]

starting from the directory. DO NOT zip the directory itself, just its content.

A sample run.py file is

    #!/usr/bin/env python
    # import the required packages here

    def run(Xtrain_file, Ytrain_file, test_data_file, pred_file):
        '''The function to run your ML algorithm on given datasets, generate the

        Parameters
        ----------
        Xtrain_file: string
            the path to Xtrain csv file
        Ytrain_file: string
            the path to Ytrain csv file
        test_data_file: string
            the path to test data csv file
        pred_file: string
            the prediction file to be saved by your code. You have to save your
        '''

        ## your implementation here
        # read data from Xtrain_file, Ytrain_file and test_data_file
        # your algorithm
        # save your predictions into the file pred_file

    # define other functions here

Report (20%)

In addition to the code submission on CodaLab Competitions, you are also expected to write a PDF report and submit it on Canvas. The report should address the following question:

First, use the last 10% of the training data as your test data. Compare Voted Perceptron on several fractions of your remaining training data. For this purpose, pick 1%, 2%, 5%, 10%, 20% and 100% of the first 90% of the training data to train, and compare the performance of Voted Perceptron on the test data. Plot the accuracy as a function of the size of the fraction you picked (the x-axis should be “percent of the remaining training data” and the y-axis should be “accuracy”).

Problem 2. KNN classifier (50 points)

For this assignment, we publish another CodaLab competition.

Dataset

For this problem, you can download the training data. The training data contains two CSV files:

- Xtrain.csv: Each row is a feature vector. The value in the i-th column is the float value of the i-th dimension.
- Ytrain.csv: This CSV file provides the multi-class labels for the corresponding feature vectors in Xtrain.csv.
Please note that the labels will be integer numbers between 0 and 10. Please note that there is neither a header nor an index in the CSV files.

A sample training data file Xtrain.csv is

    3.1665,2.9837,2.9480
    3.4507,3.1793,2.9028

A sample training label file Ytrain.csv is

    1
    2

The program should use a vote among the k nearest neighbors to determine the output label of a test point; in the case of a tie vote, choose the label of the closest neighbor among the tied exemplars. In the case of a distance tie (e.g., the two nearest neighbors are at the same distance but have two different labels), choose the lowest-numbered label (e.g., choose label 3 over label 7). To determine distance/nearness in this problem, use Euclidean distance. As with the other problems, the output values should be in the appropriate order corresponding to the order of the testing data points.

Submission format

The final submission format should be

- submission.zip
  - run.py
  - other python scripts you wrote

Note that to create a valid submission, please use the command

    zip -r submission.zip run.py [other python scripts]

starting from the directory. DO NOT zip the directory itself, just its content.

A sample run.py file is

    #!/usr/bin/env python
    # import the required packages here

    def run(Xtrain_file, Ytrain_file, test_data_file, pred_file):
        '''The function to run your ML algorithm on given datasets, generate the

        Parameters
        ----------
        Xtrain_file: string
            the path to Xtrain csv file
        Ytrain_file: string
            the path to Ytrain csv file
        test_data_file: string
            the path to test data csv file
        pred_file: string
            the prediction file name to be saved by your code. You have to save
        '''

        ## your implementation here
        # read data from Xtrain_file, Ytrain_file and test_data_file
        # your algorithm
        # save your predictions into the file pred_file

    # define other functions here

Evaluation Criteria

The final score is the accuracy of your results:

    final_score = accuracy

The competition will cover 80% for this problem.

Report (20%)

In addition to the code submission on CodaLab Competitions, you are also expected to write a PDF report and submit it on Canvas. The report should address the following questions:

How will the accuracy vary across different selections of k, i.e., k = 1, 2, 3, 4...
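The KNN voting and tie-breaking rules of Problem 2 can be sketched in pure Python as follows. This is only an illustrative sketch, not a reference solution: the function names are hypothetical, and sorting neighbors by (distance, label) is one assumed way to implement "closest neighbor among the tied labels, lowest-numbered label on a distance tie". The training rows reuse the sample Xtrain.csv and Ytrain.csv values quoted in the assignment; the second toy set is made up.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def knn_predict(X_train, y_train, x, k):
    """Predict the label of x by a vote among its k nearest training points."""
    # Sorting (distance, label) pairs puts closer points first and, at equal
    # distance, lower-numbered labels first.
    neighbors = sorted(
        (euclidean(x_i, x), y_i) for x_i, y_i in zip(X_train, y_train)
    )[:k]
    votes = Counter(label for _, label in neighbors)
    top_count = max(votes.values())
    tied_labels = {label for label, n in votes.items() if n == top_count}
    # The first neighbor whose label is among the top-voted labels is the
    # closest one; the sort order resolves distance ties toward lower labels.
    for _, label in neighbors:
        if label in tied_labels:
            return label


# Sample rows from the assignment text.
X_sample = [[3.1665, 2.9837, 2.9480], [3.4507, 3.1793, 2.9028]]
y_sample = [1, 2]
label_k1 = knn_predict(X_sample, y_sample, [3.2, 3.0, 2.9], k=1)
label_k2 = knn_predict(X_sample, y_sample, [3.2, 3.0, 2.9], k=2)

# Hypothetical distance tie: both training points are exactly 1.0 away.
label_tie = knn_predict([[0.0], [2.0]], [7, 3], [1.0], k=1)
```

For the report question, the same function can be called in a loop over k on a held-out portion of the training data to see how accuracy changes.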
ow” information, Mary is able to recognize and remember the color red. If the Ability Hypothesis is true, Mary gains the ability to remember the experience of seeing red. After experiencing red for the first time, you can remember the experience, and therefore imagine the recreation of seeing red. Lewis also argues that another important ability gained is the ability to recognize. If Mary sees the color red again, she will recognize it immediately. Lewis uses the example of Vegemite. If you taste Vegemite at a later time, you will remember (or recognize) that you have tasted it in the past. From this, you will be able to put a name to the taste experience. Lewis also argues that these abilities could originate from essentially anywhere, even magic. His main point is that experience, not lessons, is the best method of learning what a new experience is like. Overall, Lewis agrees that knowledge is gained from experiencing red, but believes the knowledge gained is “know-how” information, which is phenomenal, and therefore physicalism is valid. Lewis argues that information and ability are different kinds of physical knowledge; this is why physicalism can be true and consistent with the conclusion that Mary gains new knowledge. It is important to consider Lewis’ anti-qualia argument. Although the Ability Hypothesis may seem persuasive to David Lewis, there are several weaknesses. First, when we are shown an unfamiliar color, we actually do learn information about its relative properties compared to other colors (i.e. similarities and compatibilities). For example, we are able to evaluate how red is similar to orange and how it is different. We also learn its impact on our mental states. Physicalism overestimates human cognitive abilities. We have tens of billions of neurons in our brain, and we are nowhere near gaining a comprehensive view of human cognitive abilities. 
As any cognitive science major (such as me) knows, understanding what each and every neuron in our brain does is, at a minimum, a long way off. Yet, physicalism assumes we have the power to fully articulate all elements of the world around us and the complexity of our environment. This is not supportable and is a major flaw in his argument. Both Lewis and Jackson agree that some things cannot be learned in a black and white room. The weakness of Lewis’ argument is that he fails to acknowledge the cognitive differences between us and those who do not share similar obdurate mental states. Despite this weakness, there are some strengths for Lewis’ materialistic argument. Lewis removes the inability to assure the non-physical exists. Because he emphasizes the learning of abilities rather than new experiences, his theory relies on the physical and validates that physicalism could be correct. His opponents, dualists, believe that mind and body are separate entities, which is anti-physical. The largest problem with dualis
