
- September 13, 2020
- By menge

QUESTIONS

1) I- A pattern should exist

II- The pattern should not be expressed mathematically

III- Data should exist

Which of the statements above can be counted as a basis for deciding to solve a problem using machine learning?

a) Only II b) I and II c) I and III d) II and III e) All

2)

I- We try to learn a known function

II- Parameters of a problem are adjusted according to past data

III- A hypothesis is a function that is supposed to converge to a known function

Which of the statements given above is true?

a) Only I

b) Only II

c) I and III

d) II and III

e) All

3) I- Cannot be known unless the learning model is known

II- Consists of the parameters of the target function to be learned

III- Is finite for most learning models

IV- Consists of the probabilities of the training samples being chosen

V- Consists of candidate solutions for the problem

Which of the statements given above are true for a hypothesis set?

a) I and II

b) I and III

c) I and V

d)Only V

e) I, III and V

4) I- Target function

II- Training samples

III- Learning algorithm

IV- Hypothesis set

Which of the given concepts in the options above can be controlled by the designer

when applying machine learning for solving a problem?

a) I and II

b) II and III

c) III and IV

d) I and IV

e) Only III

5) I- Target function

II- Training samples

III-Hypothesis set

IV- Learning algorithm

If we know the learning model, which of the above options can be known?

a) I and II

b) II and III

c) III and IV

d) I and IV

e) Only III

6) I- Determining at what age it is suitable to undergo a medical operation

II- Classifying the given numbers as prime or not prime

III- Determining possible credit card fraud on credit card receipts

IV- Determining how long it takes for a falling object to hit the ground

V- Determining the best cycle of the traffic lights in a congested crossroad.

Which of the given problems are suitable for solving with machine learning?

a) I, II and V

b) II, III and IV

c) I, III and V

d) III and IV

e) III and V

7) Which is the updated weight vector W of a perceptron model having initial weight vector W

as [0.4, -0.2, 0.2] and bias as 0.1 when X pattern [0.5, 1, 0.5] is shown to it and desired output

Y is 0?

a) W= [ 1.3, -0.2, 1.6]

b) W=[0.9, 0.8, 0.7]

c) W=[-0.1, -1.2, -0.3]

d) W=[0.2, -0.5, 0.4]

e) W=[0.6, -1.4, 0.9]

8) Find the in-sample error Ein of a linear regression model that uses mean squared error as

error measurement in the case that model takes input matrix X including 3 training samples

and gives output vector y using weight vector w. (Put true option in a circle)

X = [[1, 3], [5, 7], [9, 8]],  y = [3.2, 8.4, 9.3]T,  w = [0.4, 0.8]T

a) 0.56 b) 1.18 c) 1.29 d) 0.43 e) 2.13

9) When a linear regression model takes input matrix X including 2 training samples it gives

output vector y. In this case find the optimal weight vector w of this linear regression model

that will give minimum in sample error Ein. (Put true option in a circle)

X = [[2, 4], [6, 8]],  y = [9, 10]T

a) w = [2.5, −4.5]T

b) w = [−0.25, 3.75]T

c) w = [−4, 4.25]T

d) w = [0.7, −0.75]T

e) w = [0.25, −2.75]T

10) What is the maximum number of dichotomies for a hypothesis set that classifies the 8 points

of X space as +1 or -1. (Put true option in a circle)

a) 64

b) 65

c)127

d)128

e) 256

11) A one-dimensional hypothesis in a hypothesis set H classifies as +1 all the points that are in

the range specified by two points and -1 for the points out of this range. What is the

maximum number of dichotomies of this hypothesis for 20 points?

a) 211 b) 266 c) 331 d) 388 e) 412

12) Which of the following procedures is necessary, sufficient, and most efficient

for proving that the VC dimension of a learner is N?

a) Show that the classifier can shatter all possible dichotomies with N points.

b) Show that the classifier can shatter a subset of all possible dichotomies with N points.

c) Show that the classifier can shatter all possible dichotomies with N points and that it

cannot shatter any of the dichotomies with N+1 points.

d) Show that the classifier can shatter all possible dichotomies with N points and that it

cannot shatter one of the dichotomies with N+1 points.

e) Show that the classifier can shatter a subset of all possible dichotomies with N points

and that it cannot shatter one of the dichotomies with N+1 points.

13) What is the maximum number of dichotomies for a machine learning model on 6 points that

has VC dimension as 4?

a) 24 b) 36 c) 57 d) 86 e) 112

14) A neuron with 4 inputs has the weight vector w = [1, 2, 3, 4]T and a bias = 0 (zero). The

activation function is linear, where the constant of proportionality equals 2, that is, the

activation function is given by f(net) = 2 × net. If the input vector is x = [4, 8, 5, 6]T then the

output of the neuron will be

a) 1 b) 56 c) 59 d) 112 e) 118

15) A perceptron with sign activation function has two inputs with weights w1 = 0.5 and w2 = −0.2, and a

bias = 0.3. For a given training example x = [0, 1]T , the desired output is 1. Does the

perceptron give the correct answer?

a) Yes b) No

16) The VC (Vapnik-Chervonenkis) bound builds a bridge between what we learn in the training set and how the model performs in the test set. The simplified VC bound (in big-O notation) is

Eout ≤ Ein + O(√(dVC · ln N / N))

According to the VC bound, mark each of the following statements as TRUE or FALSE by putting one option in a circle.

( TRUE / FALSE ) I- If the complexity of the model increases, in sample error increases.

( TRUE / FALSE ) II- If the complexity of the model increases, upper bound of the

generalization gap between in sample error and out of sample error

increases.

( TRUE / FALSE ) III- If number of samples in training set increases, upper bound of the

generalization gap between in sample error and out of sample error

increases.

( TRUE / FALSE ) IV- If the dimensionality of the feature vectors increases, the complexity of the model decreases, and this leads to an increase of the in sample error.

( TRUE / FALSE ) V- If we get a high in sample error, we should increase the complexity of the learning model.

( TRUE / FALSE ) VI- If we get a small in sample error but a high upper bound of the generalization gap between in sample error and out of sample error, then we should either increase the complexity of the learning model or increase the number of samples in the training set.

( TRUE / FALSE ) VII- When the learning model and feature dimensionality are fixed,

increasing the size of the training set generally improves out of sample

error.

( TRUE / FALSE ) VIII- When the learning model is fixed, in order to keep the same upper bound of the generalization gap between in sample error and out of sample error, we should increase the number of samples in the training set.
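The TRUE/FALSE statements above all follow from how the bound behaves as the training-set size N and the VC dimension grow. A minimal sketch, assuming the common O(√(dVC · ln N / N)) form of the generalization-gap term (the function name and the sample values are illustrative, not from the question):

```python
import math

# One common simplified form of the VC generalization bound:
#   E_out <= E_in + O( sqrt( d_vc * ln(N) / N ) )
# The function below computes only the gap term on the right.
def vc_bound_term(d_vc, n):
    """Simplified VC generalization-gap term, sqrt(d_vc * ln(n) / n)."""
    return math.sqrt(d_vc * math.log(n) / n)

# More training samples -> smaller gap (statements III, VII, VIII):
assert vc_bound_term(3, 10_000) < vc_bound_term(3, 100)

# Higher model complexity (larger VC dimension) -> larger gap (statement II):
assert vc_bound_term(30, 1_000) > vc_bound_term(3, 1_000)
```

Note that the gap term says nothing directly about the in-sample error itself, which is why several statements mixing the two are the ones to scrutinize.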

17) In the backpropagation algorithm, how is the error function usually defined?

a)-e) [The five candidate formulas could not be recovered from the source; each option has the form ½ Σ(·).]

Answer the questions numbered between 18 and 23 according to the feed forward network

given below.

18) A training pattern, consisting of an input vector x = [x1, x2, x3]T and desired outputs

t=[t1,t2,t3]T , is presented to the following neural network. What is the usual sequence of

events for training the network using the backpropagation algorithm?

a)(1) calculate yj = f(Hj ), (2) calculate zk = f(Ik), (3) update vji, (4) update wkj .

b)(1) calculate yj = f(Hj ), (2) calculate zk = f(Ik), (3) update wkj , (4) update vji.

c) (1) calculate yj = f(Hj ), (2) update vji, (3) calculate zk = f(Ik), (4) update wkj .

d)(1) calculate zk = f(Ik), (2) update wkj , (3) calculate yj = f(Hj ), (4) update vji.

e) (1) calculate zk = f(Ik), (2) calculate yj = f(Hj), (3) update wkj, (4) update vji.

19) For the same neural network, the input vector to the network is x = [x1, x2, x3]T, the vector of

hidden layer outputs is y = [y1, y2]T , the vector of actual outputs is z=[z1,z2, z3]T , and the

vector of desired outputs is t = [t1, t2, t3]T. The network has the following weight vectors: [weight values shown in the original figure, not recoverable here]. Assume that all units have sigmoid activation functions given by f(net) = 1/(1 + e^(−net)) and that each unit has a bias b = 0 (zero). If the network is tested with an input vector

x=[1.0,2.0, 3.0]T then the output y1 of the first hidden neuron will be

a) -2.300 b) 0.091 c) 0.644 d) 0.993 e) 4.900

20) Assuming exactly the same neural network and the same input vector as in the previous

question, what is the activation I2 of the second output neuron?

a) 0.353 b) 0.387 c) 0.596 d) 0.662 e) 0.674

21) For the hidden units of the network, the generalized Delta rule can be written as Δvji = η δj xi, where Δvji is the change to the weights from unit i to unit j, η is the learning rate, δj is the error term for unit j, and xi is the ith input to unit j. In the backpropagation algorithm, what is the error term δj?

a)-d) [The candidate formulas could not be recovered from the source], where f′(net) denotes the derivative of the activation function f(net).
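For reference when working through this question: the standard backpropagation error term for a hidden unit is δj = f′(Hj) · Σk δk wkj, i.e. the derivative of the unit's activation times the weighted sum of the downstream error terms. Whether this matches one of the lost options above cannot be confirmed from the source. A minimal sketch with made-up numbers (H_j, delta_k, and w_kj are illustrative, not from the question):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Standard backprop hidden-unit error term:
#   delta_j = f'(H_j) * sum_k( delta_k * w_kj )
H_j = 0.5                      # net input of hidden unit j (illustrative)
delta_k = [0.1, -0.2, 0.05]    # error terms of the three output units (illustrative)
w_kj = [0.4, 0.3, -0.1]        # weights from hidden unit j to each output unit k (illustrative)

f_prime = sigmoid(H_j) * (1.0 - sigmoid(H_j))   # sigmoid derivative at H_j
delta_j = f_prime * sum(d * w for d, w in zip(delta_k, w_kj))
```

The key design point is that a hidden unit has no target value of its own, so its error is assembled from the errors already computed at the layer above.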

22) For the output units of the network, the generalized Delta rule can be written as ΔWkj = η δk yj, where ΔWkj is the change to the weights from unit j to unit k, η is the learning rate, δk is the error term for unit k, and yj is the jth input to unit k. In the backpropagation algorithm, what is the error term δk?

a)-d) [The candidate formulas could not be recovered from the source], where f′(net) denotes the derivative of the activation function f(net).

23) The following figure shows part of the neural network. A new input pattern is presented to

the network and training proceeds as follows. The actual outputs of the network are given by

z=[0.32, 0.05, 0.67]T and the corresponding target outputs are given by t = [1.00, 1.00, 1.00]T

The weights w12, w22 and w32 are also shown below. For the output units, the derivative of the sigmoid function can be rewritten as f′(Ik) = zk (1 − zk). What is the error for each of the output units?

a) output 1 = −0.2304, output 2 = 0.3402, and output 3 = −0.8476

b) output 1 = 0.1084, output 2 = 0.1475, and output 3 = 0.1054

c) output 1 = 0.1480, output 2 = 0.0451, and output 3 = 0.0730

d) output 1 = 0.4225, output 2 = −0.1056, and output 3 = 0.1849
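Using only the quantities stated in the question, the output-unit errors can be checked with the standard backprop rule δk = f′(Ik)(tk − zk) together with the rewritten derivative f′(Ik) = zk(1 − zk):

```python
# Output-unit error terms, using only quantities stated in the question:
#   delta_k = f'(I_k) * (t_k - z_k), with f'(I_k) rewritten as z_k * (1 - z_k)
z = [0.32, 0.05, 0.67]   # actual outputs from the question
t = [1.00, 1.00, 1.00]   # target outputs from the question

deltas = [zk * (1 - zk) * (tk - zk) for zk, tk in zip(z, t)]
# deltas ≈ [0.1480, 0.0451, 0.0730]
```

This reproduces the value set 0.1480, 0.0451, 0.0730 to four decimal places.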