
Question:


MATLAB question
Task 2: It turns out that some values in lab4_plot_data.txt vary too much from the sin(x) curve and should be removed, as they are considered noisy (invalid). You are tasked with the following:

a) Remove the noisy data using the following rules. Do not set the invalid data points to 0, [], or NaN! Use logical statements, in addition to true (1) and false (0) values, to complete the task.

a. An absolute difference of 0.4 or greater between the data point and the sin(x) value is deemed noisy.

b. The 11th, 59th and 88th data points are also deemed noisy.

b) Plot the valid (noiseless) data points as blue circles on the same figure produced in Task 1. Therefore, the red circles now represent the noisy data while the blue circles represent the valid data. Remember to include a legend.
lab4_plot_data.txt (listing abridged): 100 x-values evenly spaced from -5 to 5, followed by the 100 corresponding noisy y-values that roughly follow sin(x).

Answers

Here is the answer; please do upvote, thank you.
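For the MATLAB task itself, here is a minimal sketch. It assumes the file loads as a matrix with the x-values in the first row and the y-values in the second (matching the listing above), and that the Task 1 figure with every point drawn as red circles is still the current figure; adjust the indexing if your file is laid out differently.

```matlab
% Task 2 sketch: keep only the points that stay close to sin(x).
data = load('lab4_plot_data.txt');
x = data(1, :);                    % 100 x-values from -5 to 5
y = data(2, :);                    % 100 noisy measurements of sin(x)

% Rule a: a point is noisy if |y - sin(x)| >= 0.4,
% so it is valid when the absolute difference is below 0.4.
valid = abs(y - sin(x)) < 0.4;     % logical vector of true (1) / false (0)

% Rule b: the 11th, 59th and 88th data points are noisy regardless.
valid([11 59 88]) = false;

% Plot only the valid points as blue circles on the same figure,
% leaving the invalid ones untouched (not set to 0, [], or NaN).
hold on;
plot(x(valid), y(valid), 'bo');
legend('Noisy data', 'Valid data');
hold off;
```

The point of the exercise is the logical vector: indexing y with valid selects the good points without modifying or deleting the noisy ones.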

Back-propagation is the essence of neural net training. It is the practice of fine-tuning the weights of a neural net based on the error rate (i.e. loss) obtained in the previous epoch (i.e. iteration). Proper tuning of the weights ensures lower error rates, making the model reliable by increasing its generalization.

So how does this process work, with all the simultaneous mini-computations involved? Let's learn by example!

To keep this example as simple as possible, we're only going to touch on related concepts (e.g. loss functions, optimization functions, etc.) without explaining them, as these topics deserve their own series.

First off, let's set up the model components.

Imagine that we have a deep neural network that we'd like to train. The aim of training is to build a model that performs the XOR (exclusive OR) function with two inputs and three hidden units, such that the training set (truth table) looks like the following:

X1 | X2 | Y

0 | 0 | 0

0 | 1 | 1

1 | 0 | 1

1 | 1 | 0
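As a quick sketch, the same truth table written out as MATLAB arrays (the variable names X and Y are my own):

```matlab
% XOR training set: each row of X is one sample (X1, X2); Y holds the targets.
X = [0 0; 0 1; 1 0; 1 1];
Y = [0; 1; 1; 0];
```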

Moreover, we'd like an activation function that determines the activation value at every node within the neural net. For simplicity, let’s choose an identity activation function:

f(a) = a

We also need a hypothesis function that determines the input to the activation function. This function is going to be the standard, ever-famous:

h(X) = W0.X0 + W1.X1 + W2.X2

or

h(X) = sum(Wi . Xi) over all weight-input pairs (Wi, Xi)
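As a tiny MATLAB sketch (the handles h and f are my own, with the bias input X0 = 1 folded into the input vector):

```matlab
% Hypothesis: the weighted sum of inputs. Activation: the identity function.
h = @(W, X) dot(W, X);   % h(X) = W0*X0 + W1*X1 + W2*X2
f = @(a) a;              % f(a) = a
```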

Let's also choose the loss function to be the standard cost function of logistic regression, which looks a bit complicated but is actually fairly simple:

J(W) = -(1/m) . sum( y . log(h(x)) + (1 - y) . log(1 - h(x)) )

Furthermore, we're going to use the Batch Gradient Descent optimization function to determine in what direction we should adjust the weights to get a lower loss than the one we currently have. Finally, the learning rate will be 0.1 and all the weights will be initialized to 1.

Our Neural Network

Let’s finally draw a diagram of our long-awaited neural net.

It should look something like this: [diagram: input units X0 (bias), X1, X2; hidden units Z1, Z2, Z3 plus bias Z0; a single output unit D0]

The leftmost layer is the input layer, which takes X0 as the bias term with a value of 1, and X1 and X2 as the input features. The layer in the middle is the first hidden layer, which also takes a bias term Z0 with a value of 1. Finally, the output layer has only one output unit, D0, whose activation value is the actual output of the model (i.e. h(x)).

Now we forward-propagate

It is now time to feed forward the information from one layer to the next. This goes through two steps that happen at every node/unit in the network:

1- Getting the weighted sum of inputs of a specific unit using the h(x) function we defined earlier.

2- Plugging the value we get from step 1 into the activation function we have (f(a) = a in this example) and using the activation value we get (i.e. the output of the activation function) as the input feature for the connected nodes in the next layer.

Note that units X0, X1, X2 and Z0 don't have any units connected to them that provide inputs, so the steps mentioned above don't occur in those nodes. However, for the rest of the nodes/units, this is how it all happens throughout the neural net for the first input sample in the training set:

Unit Z1:

h(x) = W0.X0 + W1.X1 + W2.X2

= 1 . 1 + 1 . 0 + 1 . 0

= 1 = a

z = f(a) = a => z = f(1) = 1

and the same goes for the rest of the units:

Unit Z2:

h(x) = W0.X0 + W1.X1 + W2.X2

= 1 . 1 + 1 . 0 + 1 . 0

= 1 = a

z = f(a) = a => z = f(1) = 1

Unit Z3:

h(x) = W0.X0 + W1.X1 + W2.X2

= 1 . 1 + 1 . 0 + 1 . 0

= 1 = a

z = f(a) = a => z = f(1) = 1

Unit D0:

h(x) = W0.Z0 + W1.Z1 + W2.Z2 + W3.Z3

= 1 . 1 + 1 . 1 + 1 . 1 + 1 . 1

= 4 = a

z = f(a) = a => z = f(4) = 4

As we mentioned earlier, the activation value (z) of the final unit (D0) is the output of the entire model. Therefore, our model predicted an output of 4 for the set of inputs {0, 0}. Calculating the loss/cost of the current iteration goes as follows:

Loss = actual_y - predicted_y

= 0 - 4

= -4

The actual_y value comes from the training set, while the predicted_y value is what our model yielded. So the cost at this iteration is equal to -4.
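Put together as a short MATLAB sketch (the matrix shapes and variable names are my own, not part of the original walkthrough), the forward pass for the sample {0, 0} reproduces these numbers:

```matlab
% Forward pass for the first sample {0, 0}, all weights initialized to 1.
x = [1; 0; 0];            % [X0 (bias); X1; X2]
W = ones(3, 3);           % row k holds the weights feeding hidden unit Zk
V = ones(1, 4);           % weights feeding the output unit D0
f = @(a) a;               % identity activation

a_hidden = W * x;         % weighted sums at Z1, Z2, Z3 -> [1; 1; 1]
z = [1; f(a_hidden)];     % prepend the bias unit Z0 = 1

predicted_y = f(V * z);   % activation of D0 -> 4
loss = 0 - predicted_y;   % actual_y - predicted_y -> -4
```

Running it yields predicted_y = 4 and loss = -4, exactly as computed by hand above.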

So where is Back-propagation?

According to our example, we now have a model that doesn't give accurate predictions (it gave us the value 4 rather than 0), which is attributed to the fact that its weights haven't been tuned yet (they are all equal to 1). We also have the loss, which is equal to -4.

Back-propagation is all about feeding this loss backwards in such a way that we can fine-tune the weights based on it. The optimization function (Gradient Descent in our example) will help us find the weights that will, hopefully, yield a smaller loss in the next iteration. So let's get to it!

If feeding forward happened using the following functions:

f(a) = a

then feeding backward will happen through the partial derivatives of those functions. There's no need to go through the derivation of these derivatives here. All we need to know is that the functions above follow:

f'(a) = 1

J'(w) = Z . delta

where Z is simply the z value we obtained from the activation function calculations in the feed-forward step, while delta is the loss of the unit in the layer.

I know it's a lot of information to absorb in one sitting, but I suggest you take your time and really understand what's happening at every step before going further.

Calculating the deltas

Now we need to find the loss at every unit/node in the neural net. Why is that? Well, think about it this way: every loss the deep learning model arrives at is really the mess caused by all the nodes, accumulated into one number. Therefore, we need to find out which node is responsible for most of the loss in every layer, so that we can penalize it, in a sense, by giving it a smaller weight value and thus lessening the total loss of the model.

Calculating the delta of each unit can be problematic. However, thanks to Mr. Andrew Ng, we have a shortcut formula for the whole thing:

delta_0 = w . delta_1 . f'(z)

where the values delta_0, w and f'(z) are those of the same unit, while delta_1 is the loss of the unit on the other side of the weighted link.

You can think of it this way: in order to get the loss of a node (e.g. Z0), we multiply the value of its corresponding f'(z) by the loss of the node it is connected to in the next layer (delta_1), and by the weight of the link connecting both nodes.

This is exactly how back-propagation works. We do the delta calculation step at every unit, back-propagating the loss into the neural net and finding out what loss every node/unit is responsible for.

Let's calculate those deltas and get it over with!

delta_D0 = total_loss = -4

delta_Z0 = W . delta_D0 . f'(Z0) = 1 . (-4) . 1 = -4

delta_Z1 = W . delta_D0 . f'(Z1) = 1 . (-4) . 1 = -4

delta_Z2 = W . delta_D0 . f'(Z2) = 1 . (-4) . 1 = -4

delta_Z3 = W . delta_D0 . f'(Z3) = 1 . (-4) . 1 = -4
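Continuing the forward-pass sketch from earlier, the same delta calculations can be written in vectorized MATLAB (again, the names are my own):

```matlab
% Back-propagate the loss into the hidden layer: one delta per unit.
fprime   = @(z) ones(size(z));          % f(a) = a, so f'(a) = 1 everywhere
delta_D0 = loss;                        % -4: the output unit's delta
delta_Z  = V' .* delta_D0 .* fprime(z); % deltas for Z0..Z3 -> all -4
```

Every entry of delta_Z comes out to -4, matching the hand calculations above.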

There are a few things to note here:

The loss of the final unit (i.e. D0) is equal to the loss of the whole model. This is because it is the output unit, and its loss is the accumulated loss of all the units together, like we said earlier.

The function f'(z) will always give the value 1, no matter what the input (i.e. z) is. This is because its partial derivative, as we said earlier, follows: f'(a) = 1

The input nodes/units (X0, X1 and X2) don't have delta values, as there's nothing those nodes control in the neural net.

They're only there as a link between the data set and the neural net. This is merely why the whole input layer is usually not included in the layer count.

Updating the weights

All that's left now is to update all the weights we have in the neural net. This follows the Batch Gradient Descent formula:

W := W - alpha . J'(W)

where W is the weight at hand, alpha is the learning rate (i.e. 0.1 in our example) and J'(W) is the partial derivative of the cost function J(W) with respect to W.

Again, there's no need for us to get into the math. Therefore, let's use Mr. Andrew Ng's partial derivative of the function:

J'(W) = Z . delta

where Z is the Z value obtained through forward-propagation, and delta is the loss at the unit on the other end of the weighted link.

Now we use the Batch Gradient Descent weight update on all the weights, utilizing our partial derivative values that we obtain at every step. It is worth emphasizing that the Z values of the input nodes (X0, X1, and X2) are equal to 1, 0, and 0, respectively.

The 1 is the value of the bias unit, while the zeroes are actually the feature input values coming from the data set. One last note is that there's no particular order to updating the weights. You can update them in any order you want, as long as you don't make the mistake of updating any weight twice in the same iteration.

In order to calculate the new weights, let's give the links in our neural net names: Wkj is the weight of the link from input Xj to hidden unit Zk, and V0k is the weight of the link from hidden unit Zk to the output unit D0.

New weight calculations will happen as follows:

W10 := W10 - alpha . Z_X0 . delta_Z1

= 1 - 0.1 . 1 . (-4) = 1.4

W20 := W20 - alpha . Z_X0 . delta_Z2

= 1 - 0.1 . 1 . (-4) = 1.4

W30 := W30 - alpha . Z_X0 . delta_Z3 = 1 - 0.1 . 1 . (-4) = 1.4

However, since the Z values of the inputs X1 and X2 are 0 for this sample, the gradients Z_Xj . delta_Zk vanish for those links, and their weights are unchanged this iteration:

W11 := 1 - 0.1 . 0 . (-4) = 1

W21 := 1

W31 := 1

W12 := 1

W22 := 1

W32 := 1

V00 := V00 - alpha . Z_Z0 . delta_D0

= 1 - 0.1 . 1 . (-4) = 1.4

V01 := 1.4

V02 := 1.4

V03 := 1.4
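Continuing the same sketch one last time, here is the whole update step in MATLAB; note that the links fed by X1 and X2 keep their old values, since those inputs are 0 for this sample:

```matlab
% Batch Gradient Descent update, W := W - alpha . Z . delta, for every link.
alpha = 0.1;

% Hidden-layer weights: the gradient of Wkj is Z_Xj * delta_Zk.
% delta_Z(2:4) skips Z0, the bias unit, which has no incoming links.
W = W - alpha * (delta_Z(2:4) * x');  % column fed by X0 -> 1.4, others stay 1

% Output-layer weights: the gradient of V0k is Z_Zk * delta_D0.
V = V - alpha * (delta_D0 * z');      % V00..V03 -> all 1.4
```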

It is important to note here that the model isn't trained properly yet, as we have only back-propagated through one sample from the training set. Doing everything we did all over again for all the samples will yield a model with better accuracy as we go, getting closer to the minimum loss/cost at every step.
