Introduction to Machine Learning
Lecturer: Emi Tanaka
Department of Econometrics and Business Statistics
Predict the Toyota car price from this used car listing data.
Predict the insurance charges given customer characteristics from this data.
Diagnose (diagnosis) a breast mass sample as malignant (M) or benign (B) from the features of its image using the Wisconsin breast cancer data set.
Predict whether a Titanic passenger survived from their class, sex and age.
Predict the digit 0-9 (label) from a 28 × 28 (784-pixel) image.
Predict whether a client will subscribe to a term deposit (y) based on direct marketing campaigns of a Portuguese banking institution.
Predict whether the customer will Purchase a caravan insurance policy based on data here.
Learn about customer personalities from customer survey data.
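library(tidyverse)  # read_tsv(), mutate(), fct_collapse(), case_when()
library(lubridate)  # dmy(), year()

# read_tsv() parses the file as tab-separated, even though its extension is .csv
marketing <- read_tsv("https://emitanaka.org/iml/data/marketing_campaign.csv")

marketing_clean <- marketing %>%
  mutate(
    # Collapse marital status into two groups, then convert the labels to numbers
    Marital_Status = fct_collapse(
      Marital_Status,
      "1" = c("Absurd", "Alone", "Divorced", "Single", "Widow", "YOLO"),
      "2" = c("Married", "Together")
    ),
    Marital_Status = as.numeric(as.character(Marital_Status)),
    # Recode education as 1 = Basic, 2 = Graduation, 3 = 2n Cycle, Master or PhD
    Education = fct_collapse(
      Education,
      "3" = c("2n Cycle", "Master", "PhD"),
      "1" = "Basic",
      "2" = "Graduation"
    ),
    Education = as.numeric(as.character(Education)),
    # Set missing incomes and the value 666666 to 0
    Income = case_when(
      is.na(Income) ~ 0,
      Income == 666666 ~ 0,
      TRUE ~ Income
    ),
    # Replace the enrolment date with its year minus 2011
    Dt_Customer = year(dmy(Dt_Customer)) - 2011
  ) %>%
  # Keep customers born in 1940 or later, and compute age as of 2015
  filter(Year_Birth >= 1940) %>%
  mutate(Age = 2015 - Year_Birth) %>%
  select(-Year_Birth)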
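To see what the recoding steps above do, here is a minimal sketch on a small made-up table. The `toy` data and its values are hypothetical and are not taken from the marketing data; only the recoding pattern (collapse levels with `fct_collapse()`, convert to numeric, replace missing or sentinel incomes with 0) mirrors the code above.

```r
library(dplyr)
library(forcats)

# Hypothetical toy data, invented for illustration only
toy <- tibble(
  Marital_Status = c("Single", "Married", "YOLO", "Together"),
  Income = c(52000, NA, 666666, 48000)
)

toy_clean <- toy %>%
  mutate(
    # Collapse the four labels into two groups named "1" and "2"
    Marital_Status = fct_collapse(
      Marital_Status,
      "1" = c("Single", "YOLO"),
      "2" = c("Married", "Together")
    ),
    # Go via character so we get the labels 1 and 2, not the internal factor codes
    Marital_Status = as.numeric(as.character(Marital_Status)),
    # Missing incomes and the value 666666 both become 0
    Income = case_when(
      is.na(Income) ~ 0,
      Income == 666666 ~ 0,
      TRUE ~ Income
    )
  )

toy_clean
```

Converting with `as.numeric(as.character(...))` rather than `as.numeric(...)` alone matters: applied directly to a factor, `as.numeric()` returns the internal level positions, which need not match the labels.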
Profile the faces from the Yale Face Database (Belhumeur et al. 1997).
Investigate wine quality from several features of wines using this data.
Predict the clothing type (labelled 0-9) from a 28 × 28 (784-pixel) image using the Fashion MNIST data by Zalando SE.
We cover the following methods:
If you liked this unit (and did well), you may like to consider enrolling in the following unit next semester:
ETC3555/ETC5555 - Statistical Machine Learning
This unit covers the methods and practice of statistical machine learning for modern data analysis problems. Topics covered will include recommender systems, social networks, text mining, matrix decomposition and completion, and sparse multivariate methods. All computing will be conducted using the R programming language.
Prerequisites: ETC3250/ETC5250 or FIT3154
Good luck!
ETC3250/5250 Wrap-up