## ETC3250/5250

Introduction to Machine Learning

### Neural network I

Lecturer: Emi Tanaka

Department of Econometrics and Business Statistics

## Supervised learning

• So far you have learnt about:
• linear, non-linear & logistic regression
• linear & quadratic discriminant analysis
• decision trees
• tree-ensemble methods (bagging, boosting, random forest)
• k-nearest neighbours
• support vector machine methods
• But these methods still don’t perform well for some tasks, e.g. image recognition.

## Human brain

• Our brains can dissect and process features of images, e.g. the shape, object, lighting, etc.
• The human brain is made of billions of neurons that communicate via electrochemical signals.
• So how do we mimic this in a program?

# Artificial neuron

## Biological neuron model

• An artificial neural network, often referred to simply as a neural network, was inspired by biological neural networks.
• In a biological neural network, a collection of neurons interconnected by synapses carries out a specific function when activated.
• The dendrites receive synaptic inputs and propagate electrochemical stimulation to the cell body; if stimulated enough, the neuron fires an action potential, which in turn provides synaptic input to other neurons.

## Artificial neuron: dendrites

• The artificial neuron is the elementary unit of an artificial neural network.
• An artificial neuron receives predictors \boldsymbol{x}_i = (1, x_{i1}, \dots, x_{ip})^\top, which are typically combined as a weighted sum:

z_i = \beta_0 + \sum_{j=1}^p\beta_jx_{ij} = \boldsymbol{\beta}^\top\boldsymbol{x}_i, \quad\text{where }\boldsymbol{\beta} = (\beta_0, \beta_1, \dots, \beta_p)^\top.
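To make the weighted sum concrete, here is a minimal base-R sketch using the coefficients \boldsymbol{\beta} = (1, 0.5, -3)^\top from the worked example later in this deck:

```r
# Weighted sum z_i = beta^T x_i for a single artificial neuron.
beta <- c(1, 0.5, -3)   # (beta_0, beta_1, beta_2)
x    <- c(1, 1, 3)      # (1, x_1, x_2); the leading 1 matches beta_0
z    <- sum(beta * x)   # beta^T x
z
#> [1] -7.5
```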

## Artificial neuron: action potential

• The weighted sum z_i then gets passed into an activation function, h(z_i).
• In the forms below, a is a location parameter and w is a scale parameter.
• Some common choices include:
• Heaviside step: h(z_i) = \mathbb{I}(z_i > 0),
• Linear: h(z_i) = z_i,
• Sigmoid: h(z_i|a,w) = a + w(1+e^{-z_i})^{-1},
• Tanh: h(z_i|a,w) = a + w \left(\frac{2}{1+e^{-2z_i}} - 1 \right), and
• ReLU: h(z_i|a,w) = a + w \times \max(0, z_i).
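For reference, a minimal sketch of these activation functions in base R (the function names are illustrative):

```r
heaviside <- function(z) as.numeric(z > 0)
linear    <- function(z) z
sigmoid   <- function(z, a = 0, w = 1) a + w / (1 + exp(-z))
tanh_act  <- function(z, a = 0, w = 1) a + w * (2 / (1 + exp(-2 * z)) - 1)
relu      <- function(z, a = 0, w = 1) a + w * pmax(0, z)

sigmoid(0)            # 0.5: the sigmoid is centred at z = 0
heaviside(c(-1, 2))   # 0 1
```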

## Visualising an artificial neuron

• When x_1 = 1 and x_2 = 3, then \begin{align*}z &= \beta_0 + \beta_1x_1 + \beta_2 x_2\\ &= 1 + 0.5 \times 1 - 3\times 3 = -7.5.\end{align*}

• Using ReLU with a = 1, w = 2, the prediction is 1 + 2 \times \max(0, z) = 1.
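We can verify this computation with a short base-R sketch:

```r
# z = beta^T x with beta = (1, 0.5, -3) and x = (1, 1, 3)
z <- sum(c(1, 0.5, -3) * c(1, 1, 3))   # -7.5
# ReLU with location a = 1 and scale w = 2
1 + 2 * max(0, z)
#> [1] 1
```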

## Predicting from an artificial neuron

• What is the prediction when x_1 = 10? Answer: 3.

• What is the prediction when x_1 = -5? Answer: -11.

# Activation function

## Heaviside step

h(z_i) = \mathbb{I}(z_i > 0)

• This is also known as the perceptron and is used for classification.
• For example: h(1 + 2x_i) and h(-12 + 4x_i).

## Linear

h(z_i) = z_i

• This is a regression model!
• For example: h(1 + 2x_i) and h(-12 + 4x_i).

## Sigmoid (Logistic)

h(z_i|a,w) = a + w(1+e^{-z_i})^{-1}

• When a = 0 and w = 1, then 0 < h(z_i) < 1 for finite z_i.
• E.g. h(1 + 2x_i|a = 0, w = 1) and h(-12 + 4x_i|a = -1, w = 2).

## Hyperbolic tangent (Tanh)

h(z_i|a,w) = a + w \left(\frac{2}{1+e^{-2z_i}} - 1 \right)

• Similar in shape to the Sigmoid, but when a = 0 and w = 1 the output lies in (-1, 1) and is centred at zero.
• E.g. h(1 + 2x_i|a = 0, w = 1) and h(-12 + 4x_i|a = -1, w = 2).

## Rectified linear unit (ReLU)

h(z_i|a,w) = a + w \times \max(0, z_i)

• In modern practice, the Sigmoid and Tanh activations have largely been replaced by the rectified linear unit (ReLU).
• E.g. h(1 + 2x_i|a = 0, w = 1) and h(-12 + 4x_i|a = -1, w = 2).
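To see the shapes side by side, a quick base-R plotting sketch (with a = 0 and w = 1 throughout):

```r
z <- seq(-4, 4, by = 0.01)
plot(z, pmax(0, z), type = "l", ylab = "h(z)", ylim = c(-1, 2))  # ReLU
lines(z, 1 / (1 + exp(-z)), lty = 2)                             # Sigmoid
lines(z, tanh(z), lty = 3)                                       # Tanh
legend("topleft", legend = c("ReLU", "Sigmoid", "Tanh"), lty = 1:3)
```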

# Neural network

## Limitations of a single artificial neuron

• Biological neurons are interconnected in complex networks that allow the brain to perform a wide range of functions.
• A single artificial neuron has a limited ability to model complex relationships, so, much like its biological counterpart, we can connect artificial neurons together to model complex relationships.
• An artificial neural network, an interconnection of artificial neurons, began by mimicking the architecture of the brain.
• But neural networks are no longer true to their biological counterpart; their development is now motivated by empirical results.

## Multiple artificial neurons

• When combining artificial neurons, we always set a = 0 and w = 1 for Sigmoid, Tanh, and ReLU.
(Diagram: two artificial neurons whose outputs are combined into a single prediction.)

## Combining artificial neurons for regression

• In general, we combine K neurons as f(\boldsymbol{x}_i) = b + \sum_{k=1}^K w_k h(\boldsymbol{\beta}_k^\top\boldsymbol{x}_i), where
• \boldsymbol{\beta}_k is the coefficient vector of the predictors in the k-th artificial neuron,
• b is called the bias, and
• w_k is the weight corresponding to the k-th neuron.
• The same activation function h is used for every neuron (see the sketch below).
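As a sketch of this formula (assuming ReLU activation; `B`, `w`, and `b` are illustrative names):

```r
relu <- function(z) pmax(0, z)

# f(x) = b + sum_k w_k * h(beta_k^T x), with beta_k stored as columns of B
f <- function(x, B, w, b) {
  z <- drop(crossprod(B, x))   # z_k = beta_k^T x for k = 1, ..., K
  b + sum(w * relu(z))
}
```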

## Example of combining artificial neurons

f(\boldsymbol{x}_i) = b + w_1 h(\boldsymbol{\beta}_1^\top\boldsymbol{x}_i) + w_2 h(\boldsymbol{\beta}_2^\top\boldsymbol{x}_i)

This represents a neural network with

• 2 nodes in the input layer,
• 2 nodes in the middle layer, and
• 1 node in the output layer with parameters:
• b = 0,
• w_1 = 0.5,
• w_2 = 0.9,
• \boldsymbol{\beta}_1 = (3, -5)^\top,
• \boldsymbol{\beta}_2 = (1, 0.5)^\top, and
• h is the ReLU activation function with a = 0 and w = 1.
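Evaluating this network at an illustrative input, say x = (1, 2), using the sketch from the previous slide:

```r
relu <- function(z) pmax(0, z)
B <- cbind(c(3, -5), c(1, 0.5))   # beta_1 and beta_2 as columns
w <- c(0.5, 0.9); b <- 0
x <- c(1, 2)                      # illustrative input, not from the slides
b + sum(w * relu(drop(crossprod(B, x))))
#> [1] 1.8   (neuron 1 is inactive since 3 - 5 * 2 < 0; neuron 2 outputs 2)
```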

## Combining artificial neurons for classification

• The output of the previous example is only applicable to regression problems.
• We can easily modify it for classification by changing the output layer to, say, the Sigmoid function, which gives a numerical value between 0 and 1 and can be thought of as a propensity score:

P(y_i = 1 | \boldsymbol{x}_i) = \frac{1}{1 + \exp\left(-\left(b + \sum_{k=1}^K w_k h(\boldsymbol{\beta}_k^\top\boldsymbol{x}_i)\right)\right)}.
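A minimal sketch of this classification output, reusing the regression sketch from earlier:

```r
relu <- function(z) pmax(0, z)

# P(y = 1 | x): pass the combined neuron output through the sigmoid
p_class1 <- function(x, B, w, b) {
  f <- b + sum(w * relu(drop(crossprod(B, x))))
  1 / (1 + exp(-f))
}
```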

## Regression vs classification

• In a neural network, initial layers can be identical for regression or classification.
• It is the output layer that determines if it can be used for regression or classification!

# Multi-class classification

## Multi-class classification

• The Sigmoid function allows for the computation of propensity scores for binary outcomes.
• If you have more than two classes in your response, then you need to convert it to dummy variables, otherwise referred to as one-hot encoding.
• So for a categorical variable with m levels, y_i \in \{\text{Class } 1, \dots, \text{Class } m\}, we convert it as: y_{ik} = \begin{cases}1 & \text{if } y_i = \text{Class } k\\0 & \text{if } y_i \neq \text{Class } k\end{cases}

## From categorical variable to dummy variables

```r
library(tibble)   # for tibble()
dat <- tibble(pet = c("cat", "dog", "cat", "fish"))
model.matrix(~ pet - 1, data = dat)
```

```
  petcat petdog petfish
1      1      0       0
2      0      1       0
3      1      0       0
4      0      0       1
attr(,"assign")
[1] 1 1 1
attr(,"contrasts")
attr(,"contrasts")$pet
[1] "contr.treatment"
```

Alternatively:

```r
class_levels <- as.numeric(factor(dat$pet)) - 1  # cat = 0, dog = 1, fish = 2
keras::to_categorical(class_levels, num_classes = 3)
```

```
     [,1] [,2] [,3]
[1,]    1    0    0
[2,]    0    1    0
[3,]    1    0    0
[4,]    0    0    1
```

## Softmax activation function

• The Sigmoid function only works for m = 2.

• For m > 2, we can use the Softmax activation function instead: P(y_{ij} = 1 | \boldsymbol{x}_i) = \frac{\exp(\boldsymbol{\beta}_j^\top\boldsymbol{x}_i)}{\sum_{k=1}^m\exp\left(\boldsymbol{\beta}_k^\top\boldsymbol{x}_i\right)}.

• The number of neurons for the Softmax layer must be m.

• Note that \sum_{j=1}^m P(y_{ij} = 1 | \boldsymbol{x}_i) = 1.
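A sketch of the Softmax in base R; subtracting max(z) before exponentiating leaves the result unchanged but avoids numerical overflow:

```r
softmax <- function(z) {
  ez <- exp(z - max(z))
  ez / sum(ez)
}
# The hidden-layer outputs from the worked example on the next slide:
softmax(c(32.97, 19.07, 14.03))
#> approx: 1.0e+00 9.2e-07 5.9e-09
```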

## An illustration of a Softmax layer

• Suppose we have the income of a customer in thousands of dollars.
• We want to predict whether the customer will buy a “cheap”, “average” or “expensive” brand of clothing.
• What is the probability that a customer with an income of $45K will buy a cheap brand based on this trained neural network?

## Solution

Layer 2

• ReLU: \max(0, 49.62 - 0.37 \times 45) = 32.97
• ReLU: \max(0, 27.62 - 0.19 \times 45) = 19.07
• ReLU: \max(0, -1.72 + 0.35 \times 45) = 14.03

Output layer

• Cheap: \frac{\exp(32.97)}{\exp(32.97) + \exp(19.07) + \exp(14.03)} = 0.9999991
• Average: \frac{\exp(19.07)}{\exp(32.97) + \exp(19.07) + \exp(14.03)} = 0.00000092
• Expensive: \frac{\exp(14.03)}{\exp(32.97) + \exp(19.07) + \exp(14.03)} = 0.0000000059

Prediction: The customer will buy the cheap brand.

# Building a neural network structure with R

## Installing keras

• Keras is an open-source software library that uses the TensorFlow library to fit artificial neural networks.
• We can use the Keras library through the keras package in R.
• To install keras, run the following commands:

```r
install.packages("keras")
library(keras)
install_keras(method = c("conda"), conda = "auto",
              tensorflow = "default",
              extra_packages = "tensorflow-hub")
```

• Be warned that the installation often poses issues, typically due to keras looking in the wrong location for the Keras library.

## Building a neural network structure in R

```r
library(keras)
model <- keras_model_sequential() %>%
  layer_dense(units = 3, input_shape = 2, activation = "relu") %>%
  layer_dense(units = 3, activation = "softmax")
```

• keras_model_sequential() must be used first to initialise the architecture.
• layer_dense() indicates a new layer with:
• units indicating the number of neurons in that layer,
• input_shape indicating the number of predictors (only needed for the first layer_dense()),
• activation specifying the activation function for that layer.

## Examining the weights and biases

• You can extract the weights and biases using get_weights():

```r
get_weights(model)
```

```
[[1]]
           [,1]       [,2]       [,3]
[1,]  0.4419321 -0.3501425 -0.0361855
[2,] -0.9320850 -0.9673537  0.7750421

[[2]]
[1] 0 0 0

[[3]]
          [,1]       [,2]      [,3]
[1,] 0.8876681 -0.5565016 0.4099801
[2,] 0.8715661  0.2542915 0.3897784
[3,] 0.9583607  0.3416290 0.6392198

[[4]]
[1] 0 0 0
```

• Every even entry is the bias of the nodes (here all 0s).
• The weights are given in every odd entry in the order of the layers.

## Manually setting the weights

• Let’s manually set the weights to those from the previous example:

```r
w <- get_weights(model)
w[[1]] <- matrix(c(49.62, -0.37, 27.62, -0.19, -1.72, 0.35), nrow = 2)
w[[3]] <- diag(3)
set_weights(model, w)
get_weights(model)
```

```
[[1]]
      [,1]  [,2]  [,3]
[1,] 49.62 27.62 -1.72
[2,] -0.37 -0.19  0.35

[[2]]
[1] 0 0 0

[[3]]
     [,1] [,2] [,3]
[1,]    1    0    0
[2,]    0    1    0
[3,]    0    0    1

[[4]]
[1] 0 0 0
```

## Prediction from neural network model

• Normally we need to train the model, but this will be covered next week.
• Suppose the manually set weights are the result of a trained model.
• You can predict the probability that a customer with an income of $45K will buy a cheap, average or expensive brand with:
```r
predict(model, cbind(1, 45))
```

```
         [,1]        [,2]         [,3]
[1,] 0.999999 9.18979e-07 5.949233e-09
```
• Note that you need to have the new data in a matrix format with the intercept!
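For intuition, the keras prediction can be reproduced by hand with the manually set weights (a base-R sketch; the output weights are the identity matrix, so the Softmax is applied directly to the hidden-layer outputs):

```r
x <- c(1, 45)   # intercept column and income (in $K)
B <- matrix(c(49.62, -0.37, 27.62, -0.19, -1.72, 0.35), nrow = 2)
z <- pmax(0, drop(x %*% B))   # hidden layer: ReLU(beta_k^T x)
ez <- exp(z - max(z))         # Softmax output layer (stable form)
ez / sum(ez)
#> approx: 1.0e+00 9.2e-07 5.9e-09
```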

# Takeaways

• Neural networks are flexible models that can be used for both regression and classification problems.
• The activation function in the output layer determines whether the neural network is used for regression or classification.
• More on neural networks next week!