Introduction to Machine Learning
Lecturer: Emi Tanaka
Department of Econometrics and Business Statistics
z_i = \beta_0 + \sum_{j=1}^p\beta_jx_{ij} = \boldsymbol{\beta}^\top\boldsymbol{x}_i, \quad\text{where }\boldsymbol{\beta} = (\beta_0, \beta_1, \dots, \beta_p)^\top.
When x_1 = 1 and x_2 = 3, then \begin{align*}z &= \beta_0 + \beta_1x_1 + \beta_2 x_2\\ &= 1 + 0.5 \times 1 - 3\times 3 = -7.5.\end{align*}
Using ReLU with a = 1, w = 2, the prediction is 1 + 2 \times \max(0, z) = 1.
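As a check, the worked example above can be reproduced in base R (the coefficients are the ones given in the example):

```r
# Coefficients from the worked example
beta0 <- 1; beta1 <- 0.5; beta2 <- -3
x1 <- 1; x2 <- 3

# Linear combination (pre-activation value)
z <- beta0 + beta1 * x1 + beta2 * x2   # -7.5

# ReLU activation with a = 1, w = 2
pred <- 1 + 2 * max(0, z)              # 1
```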
Your turn!
(Answers: 3 and -11.)
Common choices of activation function:

- Heaviside (step): h(z_i) = \mathbb{I}(z_i > 0)
- Identity (linear): h(z_i) = z_i
- Sigmoid: h(z_i|a,w) = a + w(1+e^{-z_i})^{-1}
- Tanh: h(z_i) = a + w \left(\frac{2}{1+e^{-2z_i}} - 1 \right)
- ReLU: h(z_i) = a + w \times \max(0, z_i)
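A minimal sketch of these activation functions as plain R functions (the names are mine; a and w default to 0 and 1 so each reduces to its standard form):

```r
heaviside    <- function(z) as.numeric(z > 0)            # indicator I(z > 0)
identity_act <- function(z) z                            # linear / identity
sigmoid      <- function(z, a = 0, w = 1) a + w / (1 + exp(-z))
tanh_act     <- function(z, a = 0, w = 1) a + w * (2 / (1 + exp(-2 * z)) - 1)
relu         <- function(z, a = 0, w = 1) a + w * max(0, z)
```

Note that tanh_act with a = 0, w = 1 agrees with base R's tanh(), since 2/(1+e^{-2z}) - 1 = tanh(z).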
f(\boldsymbol{x}_i) = b + w_1{\color{#027EB6}{h(\boldsymbol{\beta}_1^\top\boldsymbol{x}_i)}} + w_2\color{#EE0220}{h(\boldsymbol{\beta}_2^\top\boldsymbol{x}_i)}
This represents a neural network with one hidden layer containing two neurons.
petcat petdog petfish
1 1 0 0
2 0 1 0
3 1 0 0
4 0 0 1
attr(,"assign")
[1] 1 1 1
attr(,"contrasts")
attr(,"contrasts")$pet
[1] "contr.treatment"
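The dummy (one-hot) encoding shown above is consistent with calling model.matrix() on a factor with levels cat, dog and fish; a sketch that reproduces it (the data values here are inferred from the output, not given in the source):

```r
dat <- data.frame(pet = factor(c("cat", "dog", "cat", "fish")))

# "~ pet - 1" drops the intercept so every level gets its own column
model.matrix(~ pet - 1, data = dat)
```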
The Sigmoid function only works for m = 2.
For m > 2, we can use the Softmax activation function instead: P(y_{ij} = 1 | \boldsymbol{x}_i) = \frac{\exp(\boldsymbol{\beta}_j^\top\boldsymbol{x}_i)}{\sum_{k=1}^m\exp\left(\boldsymbol{\beta}_k^\top\boldsymbol{x}_i\right)}.
The number of neurons for the Softmax layer must be m.
Note that \sum_{j=1}^m P(y_{ij} = 1 | \boldsymbol{x}_i) = 1.
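A minimal sketch of the Softmax computation in R (the scores z_j = \boldsymbol{\beta}_j^\top\boldsymbol{x}_i here are made-up numbers for illustration):

```r
softmax <- function(z) {
  z <- z - max(z)          # subtract the max for numerical stability
  exp(z) / sum(exp(z))
}

p <- softmax(c(2.0, 0.5, -1.0))  # m = 3 class scores
sum(p)                            # the probabilities sum to 1
```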
[Diagram: Layer 2 and output layer of the network]
Prediction: The customer will buy the cheap brand.
We will use the keras package in R. To install keras, run the following commands:

If keras is not working, it may be because R is looking at the wrong location for the Keras library.

keras_model_sequential() must be used first to initialise the architecture. layer_dense() indicates a new layer with:

- units indicating the number of neurons in that layer,
- input_shape indicating the number of predictors (only needed for the first layer_dense()),
- activation specifying the activation function for that layer.

The weights can be extracted with get_weights() and overwritten with set_weights():

w <- get_weights(model)
w[[1]] <- matrix(c(49.62, -0.37, 27.62, -0.19, -1.72, 0.35), nrow = 2)
w[[3]] <- diag(3)
set_weights(model, w)
get_weights(model)
[[1]]
[,1] [,2] [,3]
[1,] 49.62 27.62 -1.72
[2,] -0.37 -0.19 0.35
[[2]]
[1] 0 0 0
[[3]]
[,1] [,2] [,3]
[1,] 1 0 0
[2,] 0 1 0
[3,] 0 0 1
[[4]]
[1] 0 0 0
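For context, a network whose get_weights() list has the shapes shown above (a 2x3 weight matrix, a length-3 bias, a 3x3 weight matrix, a length-3 bias) could be specified along these lines. Only the layer sizes are implied by the output; the activation choices here are assumptions for illustration:

```r
library(keras)

# Two predictors -> 3 hidden neurons -> 3 output classes
model <- keras_model_sequential()
model %>%
  layer_dense(units = 3, input_shape = 2, activation = "relu") %>%  # assumed activation
  layer_dense(units = 3, activation = "softmax")                    # Softmax output layer
```

Running this requires a working keras/TensorFlow installation.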
ETC3250/5250 Week 11