Voted Perceptron

Problem 1. Voted Perceptron (50 points)
For this assignment, we have published a CodaLab competition.
The Perceptron algorithm is one of the classic algorithms in machine learning and has been in use
since the early 1960s. It is an online algorithm for learning a linear threshold function, which
acts as a classifier. In this assignment, we will apply the Voted Perceptron algorithm to a
classification task.
Let $D = \{(x_i, t_i)\}_{i=1}^{m}$ be the training data, where the $x_i$ are the feature vectors
and the $t_i \in \{-1, +1\}$ are the corresponding labels. The goal of the Perceptron algorithm is
to find a vector $w$ that defines a linear function such that $\forall i,\ t_i (w \cdot x_i) > 0$.
For the Voted Perceptron, you need a list of weighted vectors $\{(w_k, c_k)\}_{k=1}^{K}$, where
$w_k$ is a vector and $c_k$ is its weight. (Please refer to
https://cseweb.ucsd.edu/~yfreund/papers/LargeMarginsUsingPerceptron.pdf for a clear
description of the voted perceptron.)
You can then use all of these vectors and their weights for classification: the predicted label
$\hat{y}$ for any feature vector $x$ is
$$\hat{y} = \operatorname{sign}\left(\sum_{k=1}^{K} c_k \, \operatorname{sign}(w_k \cdot x)\right)$$
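As a rough illustration of how the weighted vectors and the voting rule fit together, here is a minimal sketch in plain Python. It follows the training loop given for this problem, but several details are assumptions rather than part of the assignment: the function names (`train_voted_perceptron`, `predict`), the `passes` parameter (the number of full passes over the data), and the convention that sign(0) is treated as −1.

```python
def sign(value):
    """Return +1 for positive values and -1 otherwise (sign(0) = -1 is an assumption)."""
    return 1 if value > 0 else -1


def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def train_voted_perceptron(X, t, passes):
    """Return the list of weighted vectors [(w_k, c_k), ...].

    X is a list of feature vectors, t a list of labels in {-1, +1}.
    """
    w = [0.0] * len(X[0])   # w_1 = 0
    c = 0                   # c_1 = 0
    vectors = []
    for _ in range(passes):
        for x_i, t_i in zip(X, t):
            if t_i * dot(w, x_i) <= 0:
                # Mistake: store the current vector with its weight,
                # then start a new vector w_{k+1} = w_k + t_i x_i.
                vectors.append((w, c))
                w = [w_j + t_i * x_j for w_j, x_j in zip(w, x_i)]
                c = 1
            else:
                # Correct prediction: the current vector survives one more example.
                c += 1
    vectors.append((w, c))  # keep the last vector as well
    return vectors


def predict(vectors, x):
    """Weighted vote: sign(sum_k c_k * sign(w_k . x))."""
    vote = sum(c_k * sign(dot(w_k, x)) for w_k, c_k in vectors)
    return sign(vote)
```

The initial zero vector is stored with weight 0, so it never influences the vote. Storing each `w_k` as a separate list keeps the sketch close to the formula; it is not tuned for large datasets.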
The weighted vectors are learned with the following training procedure, where T bounds the number
of passes over the training data:
    Initialize k = 1, c_1 = 0, w_1 = 0, t = 0;
    while t <= T do
        for each training example (x_i, t_i) do
            if t_i (w_k · x_i) <= 0 then
                w_{k+1} = w_k + t_i x_i;
                c_{k+1} = 1;
                k = k + 1
            else
                c_k += 1;
            end
        end
        t = t + 1;
    end
Dataset
For this problem, you can download the training data from
It contains two CSV files:
Xtrain.csv: Each row is a feature vector. The value in the i-th column is the integer value of
the i-th dimension.
Ytrain.csv: This CSV file provides the binary labels for the corresponding feature vectors in
the file Xtrain.csv.
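Reading the two files could look like the sketch below, which uses only the standard `csv` module. The helper names `load_features` and `load_labels` are hypothetical, and the sketch assumes that the files have no header row and that each label file holds one label per line. The encoding of the binary labels is not stated here, so a mapping to {−1, +1} may be needed before training.

```python
import csv


def load_features(path):
    """Read a header-less CSV file into a list of float feature vectors."""
    with open(path, newline="") as handle:
        return [[float(value) for value in row] for row in csv.reader(handle) if row]


def load_labels(path):
    """Read a header-less, single-column CSV file into a list of integer labels."""
    with open(path, newline="") as handle:
        return [int(float(row[0])) for row in csv.reader(handle) if row]
```

Empty lines are skipped so that a trailing newline at the end of a file does not produce an empty row.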
Submission format
The final submission format should be
– submission.zip
    – run.py
    – other python scripts you wrote
Evaluation Criteria
The final score is a weighted combination of accuracy and F1 score:
final_score = 50 × accuracy + 50 × F_score
The competition accounts for 80% of this problem.
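To check a submission locally before uploading, the score can be approximated with a few lines of Python. This is only a sketch: the function names are hypothetical, and it assumes that the F score is the standard F1 score with +1 as the positive class, which the text here does not spell out. The competition's own scorer may differ.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for a, b in zip(y_true, y_pred) if a == b)
    return correct / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Standard F1 score; the choice of positive class is an assumption."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for a, b in pairs if a == positive and b == positive)
    fp = sum(1 for a, b in pairs if a != positive and b == positive)
    fn = sum(1 for a, b in pairs if a == positive and b != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def final_score(y_true, y_pred):
    """final_score = 50 * accuracy + 50 * F_score, as stated in the assignment."""
    return 50 * accuracy(y_true, y_pred) + 50 * f1_score(y_true, y_pred)
```

With this weighting, a perfect prediction scores 100, and a classifier that never predicts the positive class can score at most 50.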
Note that to create a valid submission, please use the command
zip -r submission.zip run.py [other python scripts] starting from the
directory. DO NOT zip the directory itself, just its contents. A sample run.py file is:
#!/usr/bin/env python
# import the required packages here

def run(Xtrain_file, Ytrain_file, test_data_file, pred_file):
    '''The function to run your ML algorithm on given datasets, generate the

    Parameters
    ----------
    Xtrain_file: string
        the path to Xtrain csv file
    Ytrain_file: string
        the path to Ytrain csv file
    test_data_file: string
        the path to test data csv file
    pred_file: string
        the prediction file to be saved by your code. You have to save your
    '''
    ## your implementation here
    # read data from Xtrain_file, Ytrain_file and test_data_file
    # your algorithm
    # save your predictions into the file pred_file

# define other functions here
Report (20%)
In addition to the code submission on CodaLab Competitions, you must also write a
PDF report and submit it on Canvas. The report should address the following question:
First, set aside the last 10% of the training data as your test data. Then compare Voted
Perceptron models trained on several fractions of the remaining training data. For this purpose,
pick 1%, 2%, 5%, 10%, 20% and 100% of the first 90% of the training data, train on each subset,
and compare the performance of Voted Perceptron on the test data. Plot the accuracy as a function
of the size of the fraction you picked (the x-axis should be “percent of the remaining training
data” and the y-axis should be “accuracy”).
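The experiment described above can be organized as in the following sketch. The helper names are hypothetical, and two choices are assumptions: each fraction is taken as a prefix of the first 90% (the text does not say whether to take a prefix or a random sample), and `train_fn` and `predict_fn` stand for whatever training and prediction functions you have written. Plotting is left out; the returned pairs can be passed to any plotting tool.

```python
PERCENTS = [1, 2, 5, 10, 20, 100]


def holdout_split(X, y, test_percent=10):
    """Use the last `test_percent` percent of the data as the test set."""
    n_test = len(X) * test_percent // 100
    cut = len(X) - n_test
    return X[:cut], y[:cut], X[cut:], y[cut:]


def take_percent(X, y, percent):
    """Return the first `percent` percent of the given data (at least one row)."""
    n = max(1, len(X) * percent // 100)
    return X[:n], y[:n]


def learning_curve(X, y, train_fn, predict_fn, percents=PERCENTS):
    """Return [(percent, test accuracy), ...] for each training fraction."""
    X_train, y_train, X_test, y_test = holdout_split(X, y)
    curve = []
    for percent in percents:
        X_sub, y_sub = take_percent(X_train, y_train, percent)
        model = train_fn(X_sub, y_sub)
        hits = sum(1 for x, label in zip(X_test, y_test)
                   if predict_fn(model, x) == label)
        curve.append((percent, hits / len(y_test)))
    return curve
```

Because the test set is fixed before any subset is drawn, every fraction is evaluated on the same examples, which keeps the accuracies comparable.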
Problem 2. KNN classifier (50 points)
For this assignment, we have published another CodaLab competition.
Dataset
For this problem, you can download the training data from
The training data contains two CSV files:
Xtrain.csv: Each row is a feature vector. The value in the i-th column is the floating-point
value of the i-th dimension.
Ytrain.csv: This CSV file provides the multi-class labels for the corresponding feature
vectors in the file Xtrain.csv. Please note that the labels are integers between
0 and 10.
Please note that there is neither a header nor an index in the CSV files.
A sample training data file Xtrain.csv is
3.1665,2.9837,2.9480
3.4507,3.1793,2.9028
A sample training label file Ytrain.csv is
1
2
The program should use a vote among the k nearest neighbors to determine the output label of a
test point; in the case of a tie vote, choose the label of the closest neighbor among the tied
exemplars. In the case of a distance tie (e.g., the two nearest neighbors are at the same distance
but have two different labels), choose the lowest-numbered label (e.g., choose label 3 over
label 7). To determine distance/nearness in this problem, use Euclidean distance. As with the
other problems, the output values should be in the appropriate order corresponding to the order of
the testing data points.
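The voting and tie-breaking rules can be expressed compactly by sorting neighbors on the pair (distance, label). The sketch below is one possible reading of those rules, not a reference implementation; `knn_predict` and `squared_distance` are hypothetical names. It compares squared Euclidean distances, which gives the same ordering as the distances themselves, and it assumes that a distance tie at the boundary of the k nearest neighbors is also resolved in favor of the lower label.

```python
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance; it preserves the ordering of true distances."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def knn_predict(X_train, y_train, x, k):
    """Predict the label of x by a vote among its k nearest training points."""
    # Sorting on (distance, label) puts the lower label first on distance ties.
    neighbors = sorted(
        (squared_distance(row, x), label) for row, label in zip(X_train, y_train)
    )[:k]
    votes = Counter(label for _, label in neighbors)
    top = max(votes.values())
    tied = {label for label, count in votes.items() if count == top}
    # Among the labels tied for the most votes, return the one whose
    # exemplar is closest; neighbors are already in nearest-first order.
    for _, label in neighbors:
        if label in tied:
            return label
```

When a single label wins the vote outright, the `tied` set has one element and the loop simply returns it.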
Submission format
The final submission format should be
– submission.zip
    – run.py
    – other python scripts you wrote
Note that to create a valid submission, please use the command zip -r
submission.zip run.py [other python scripts] starting from the directory. DO
NOT zip the directory itself, just its contents. A sample run.py file is:
#!/usr/bin/env python
# import the required packages here

def run(Xtrain_file, Ytrain_file, test_data_file, pred_file):
    '''The function to run your ML algorithm on given datasets, generate the

    Parameters
    ----------
    Xtrain_file: string
        the path to Xtrain csv file
    Ytrain_file: string
        the path to Ytrain csv file
    test_data_file: string
        the path to test data csv file
    pred_file: string
        the prediction file name to be saved by your code. You have to save
    '''
    ## your implementation here
    # read data from Xtrain_file, Ytrain_file and test_data_file
    # your algorithm
    # save your predictions into the file pred_file

# define other functions here
Evaluation Criteria
The final score is the accuracy of your results:
final_score = accuracy
The competition accounts for 80% of this problem.
Report (20%)
In addition to the code submission on CodaLab Competitions, you must also write a
PDF report and submit it on Canvas. The report should answer the following question:
How does the accuracy vary across different choices of k, i.e., k = 1, 2, 3, 4, …?