A1. Explain what pruning is used for in neural networks.
A2. Explain what the parameter Vmax is used for in Particle Swarm Optimisation.
The parameter Vmax is usually set to limit how much the velocity can change. Vmax is the maximal (Euclidean) length of the velocity vector used when updating the position of each particle. It plays a role similar to the learning rate η in machine learning.
If Vmax is too low, the search is too slow and takes a long time to find the optimal solutions.
If Vmax is too high, the search is too unstable and can easily miss the optimal solutions.
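As a rough illustration of the limit described above, the sketch below rescales a velocity vector whenever its Euclidean length exceeds Vmax. The helper name clamp_velocity and the numbers are made up for illustration; actual PSO implementations (including the pso package) may apply the limit differently, for example per dimension.

```r
# Hypothetical helper: rescale a velocity vector so its Euclidean length
# never exceeds v_max (the direction is preserved).
clamp_velocity <- function(v, v_max) {
  len <- sqrt(sum(v^2))            # Euclidean length of the velocity
  if (len > v_max) v * (v_max / len) else v
}

v_fast <- c(3, 4)                  # length 5, above the limit
v_slow <- c(0.3, 0.4)              # length 0.5, below the limit
clamped_fast <- clamp_velocity(v_fast, v_max = 2)
clamped_slow <- clamp_velocity(v_slow, v_max = 2)

print(clamped_fast)                # rescaled to length 2
print(clamped_slow)                # unchanged
```

A smaller v_max shortens every step the particle can take, which is why a very low value slows the search down.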
A3. Explain the term stationary with regards to time series data. (Stationarity test)
Stationary means that the statistical properties of the data (such as its mean and variance) do not change over time, so the data has no trend or seasonality.
• For most business applications data is not stationary!
• You can convert data to stationary data using differencing.
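A minimal sketch of differencing, using a made-up series with a linear trend (not real business data). The base R function diff() replaces each value with its change from the previous value, which removes the trend.

```r
# Synthetic series (made-up values): linear trend 2*t plus a small wiggle
x <- ts(2 * (1:12) + rep(c(1, -1), 6))

# First-order differencing: d[t] = x[t] - x[t-1]
d <- diff(x, differences = 1)

print(x)   # keeps rising, so not stationary
print(d)   # fluctuates around a constant level, no trend left
```

Note that the differenced series is one observation shorter than the original.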
A4. Explain how the moving average smoothing technique works, including a description of the common types of moving average smoothing methods in your answer.
Replace each data value with the average of that value and its neighbouring n-1 data values. For example, if n=3, a data value is replaced by the average of itself and the two values adjacent to it. This is the simple moving average, in which every value gets the same weight.
• Another common variation of this method is the Weighted Moving Average, which lets you give more importance to what happened recently without losing the impact of the past.
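Both variants can be sketched with the base R function stats::filter(). The data values and the weights 0.5, 0.3, 0.2 below are made up for illustration; in practice the weights are a modelling choice.

```r
x <- c(10, 12, 14, 13, 15, 17, 16)   # made-up data values

# Simple centred moving average, n = 3: equal weights of 1/3
sma <- stats::filter(x, rep(1 / 3, 3), sides = 2)

# Weighted trailing moving average, n = 3: weights 0.5, 0.3, 0.2
# applied to the current, previous and second-previous values
wma <- stats::filter(x, c(0.5, 0.3, 0.2), sides = 1)

print(sma)   # first and last entries are NA (no full window)
print(wma)   # first two entries are NA (no full window)
```

With sides = 2 the window is centred on each value; with sides = 1 it only looks backwards, which is the usual choice when the most recent value should carry the largest weight.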
Section B
B1. Your manager has created a function to model a business process and wants to use multi-objective optimisation to identify the minimum solution.
a. Describe swarm intelligence and the three most common types of swarm based optimisation algorithms to your manager. Explain how each of the swarm based optimisation algorithms work in your answer.
b. Solve the function below with Particle Swarm Optimisation using the psoptim() function in the pso package in R.
Use a maximum of 1,500 iterations and a swarm size of 100 particles. The function has two input parameters, set these both as NA.
Include the code used, the output of the function and explain what the output means.
library("pso")
# define the function
B1function <- function(x) {
  # treat the input as rows of (x1, x2) pairs
  x <- matrix(x,ncol=2)
  a <- 1+(x[,1]+x[,2]+1)^2*(19-14*x[,1]+3*x[,1]^2-14*x[,2]+6*x[,1]*x[,2]+3*x[,2]^2)
  b <- 30+(2*x[,1]-3*x[,2])^2*(18-32*x[,1]+12*x[,1]^2+48*x[,2]-36*x[,1]*x[,2]+27*x[,2]^2)
  f.x <- a*b
  return(f.x)
}
# Particle Swarm Optimisation: at most 1,500 iterations, swarm of 100 particles
PSO <- psoptim(fn = B1function, par = c(NA,NA), control = list(maxit = 1500, s = 100))
PSO
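Output:
PSO finds a minimum function value of approximately 3 (the value element of the returned list; the par element holds the input values that produce it). The search stops because it reaches the maximum of 1,500 iterations.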
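As a sanity check on the value 3, the objective in B1.b has the same form as the Goldstein-Price test function, whose known global minimum is 3 at the point (0, -1). The sketch below re-implements it for a single point in base R (the name goldstein_price is chosen here for illustration) and evaluates it directly, without running PSO.

```r
# Same objective as B1function, written for a single point (x1, x2)
goldstein_price <- function(x) {
  x1 <- x[1]
  x2 <- x[2]
  a <- 1 + (x1 + x2 + 1)^2 *
    (19 - 14 * x1 + 3 * x1^2 - 14 * x2 + 6 * x1 * x2 + 3 * x2^2)
  b <- 30 + (2 * x1 - 3 * x2)^2 *
    (18 - 32 * x1 + 12 * x1^2 + 48 * x2 - 36 * x1 * x2 + 27 * x2^2)
  a * b
}

f_min <- goldstein_price(c(0, -1))    # known minimiser of this test function
f_other <- goldstein_price(c(0, 0))   # an arbitrary other point for comparison
print(f_min)
print(f_other)
```

If the PSO run reports a value close to f_min, the swarm has found the global minimum rather than a local one.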
B2. Audi has approached your consultancy company asking you to help them forecast the number of A4 cars sold in the UK. They have provided you with the quarterly time series sales data from Q3 2009 to Q3 2015 (B2.csv) for the Audi A4 car.
a. Using the read.csv(), ts() and plot() functions in R, import the data, create a time series object then plot the time series object.
From looking at this plot, would it be better to use additive or multiplicative decomposition? Include the plot in your answer.
CODE:
# import the data and inspect it
audi <- read.csv("2016B2.csv")
View(audi)
# quarterly series from Q3 2009 to Q3 2015
auditime <- ts(data = audi$Sales, start = c(2009,3), end = c(2015,3), frequency = 4)
plot(auditime)
We choose the additive method because the size of the seasonal fluctuation stays roughly constant over time; it does not grow or shrink with the level of the series.
b. Using the decompose() function in R, decompose the data with the method you identified in answer B2.a and explain what is shown in the plot.
plot(decompose(auditime, type = "additive"))
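To show what decompose() returns, here is a sketch on a synthetic quarterly series (made-up numbers, not the Audi data). An additive decomposition splits the series into trend, seasonal and random components that add back up to the observed values; these are the panels shown in the decomposition plot.

```r
# Synthetic quarterly series (made-up numbers): linear trend plus a
# seasonal pattern of constant size, i.e. an additive structure
trend_part <- seq(100, by = 2, length.out = 24)
season_part <- rep(c(10, -5, -10, 5), times = 6)
y <- ts(trend_part + season_part, start = c(2010, 1), frequency = 4)

dec <- decompose(y, type = "additive")

# For an additive model the components add back up to the data
# (the trend is NA at both ends, where the moving average has no full window)
rebuilt <- dec$trend + dec$seasonal + dec$random
print(round(dec$figure, 2))   # estimated seasonal effect for each quarter
```

For a multiplicative decomposition the components would be multiplied together instead of added.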
c. Using the ets() function in the forecast package in R, predict future sales using exponential smoothing for the next year. Include an image of the forecast in your answer. Also state which variation of exponential smoothing you used and why.
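library("forecast")
# fit on the ts object so the quarterly frequency is kept;
# model = "ZZZ" lets ets() select the error, trend and seasonal types automatically
fit <- ets(auditime, model = "ZZZ")
# forecast the next year (4 quarters)
predicting <- forecast(fit, h = 4)
plot(predicting)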
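The simplest variation in the exponential smoothing family is simple exponential smoothing (no trend, no seasonality), where each smoothed value is a weighted average of the newest observation and the previous smoothed value. The sketch below writes that recursion by hand; the helper name ses, the fixed alpha and the data are assumptions for illustration, whereas ets() estimates the smoothing parameters from the data.

```r
# Hypothetical helper: simple exponential smoothing with smoothing weight alpha
ses <- function(x, alpha) {
  s <- numeric(length(x))
  s[1] <- x[1]                      # initialise the level with the first value
  for (t in 2:length(x)) {
    s[t] <- alpha * x[t] + (1 - alpha) * s[t - 1]
  }
  s
}

sales <- c(20, 24, 22, 26)          # made-up values, not the Audi data
level <- ses(sales, alpha = 0.5)
print(level)                        # the last level serves as the flat forecast
```

A larger alpha makes the forecast react faster to recent values; the variations with trend and seasonal terms add further smoothing equations on top of this one.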
B3. A wine company has approached your consultancy company asking you to help them with understanding the differences between 100 different batches of wine they have produced. They have provided you with a dataset (B3.csv) with 14 columns, the first column contains the batch numbers and the remaining columns contain numerical measurements for each of the wine batches.
a. Using read.csv() and the corrgram() function from the corrgram package, import the data and create a correlogram plot of the 13 numerical measurements of the wine batches.
Describe the output of this plot and include an image of the plot in your answer.
Code:
# use the Batch column as row names, so only the 13 numerical columns remain
data <- read.csv("2016B3.csv", row.names = "Batch")
View(data)
# see the correlation plot
library("corrgram")
corrgram(data)
This plot is a graphical display of a correlation matrix. The darker the colour, the stronger the correlation between two variables. Blue means the variables are positively correlated, while red means they are negatively correlated. For example, total phenols and flavanoids are strongly positively correlated.
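The numbers behind a correlogram can be computed with the base R function cor(). The sketch below uses a tiny made-up data frame (not the wine data) with one perfectly positive and one perfectly negative relationship.

```r
# Made-up measurements for five batches (not the wine data)
m <- data.frame(
  a = c(1, 2, 3, 4, 5),
  b = c(2, 4, 6, 8, 10),   # rises with a, correlation +1
  c = c(10, 8, 6, 4, 2)    # falls as a rises, correlation -1
)

r <- cor(m)                # Pearson correlation matrix
print(r)
```

In the correlogram, the cell for a and b would be shaded in the darkest blue and the cell for a and c in the darkest red.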
b. Using the plot() and prcomp() functions in R, plot a scree plot and describe how you can use this plot to identify the number of components to use in principal components analysis. Include the scree plot in your answer.
pca <- prcomp(data, scale. = TRUE)
plot(pca, type = "line")
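A scree plot displays how much variation each principal component captures from the data. We can use the following rules to choose the number of components:
• Kaiser's rule: keep the components whose variance (eigenvalue) is greater than 1.
• The "elbow rule": keep the components before the point where the curve bends and flattens out.
So, we use 2 components here.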
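Kaiser's rule can also be checked numerically rather than read off the plot. The sketch below uses the USArrests dataset that ships with R as a stand-in for the wine data, so the number of components it reports does not apply to B3.csv.

```r
# USArrests ships with R; it stands in for the wine data here
pca_demo <- prcomp(USArrests, scale. = TRUE)

eigenvalues <- pca_demo$sdev^2               # variance captured by each component
prop_var <- eigenvalues / sum(eigenvalues)   # share of the total variance
n_kaiser <- sum(eigenvalues > 1)             # components kept under Kaiser's rule

print(round(eigenvalues, 3))
print(round(cumsum(prop_var), 3))            # cumulative share of variance
print(n_kaiser)
```

Because the variables are scaled, the eigenvalues sum to the number of variables, so an eigenvalue above 1 means the component explains more than a single original variable would.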
c. Using the biplot() and prcomp() functions in R, create a biplot of the results of the principal components analysis. Explain the plot and describe how the results of the principal components analysis could possibly be made easier to interpret. Include the biplot in your answer.
biplot(pca, xlabs = rep(".",nrow(data)), cex = 0.8)
The plot shows the original variables and how they are placed on the 2D principal component space.
You can see that Proanthocyanines and Color.intensity load highly on the first component and hardly at all on the second component (this can also be seen in the PC1 and PC2 values shown on the previous page). You can also see that Ash loads highly on the second component and not much on the first component.
To make the components easier to interpret we can use rotation, for example varimax rotation. The psych package contains functions for this.
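CODE:
library(psych)
library(GPArotation)
# principal components with a varimax rotation, keeping 2 components
pca1 <- principal(data, nfactors = 2, rotate = "varimax")
# biplot of the rotated solution, labelling the points by batch
biplot(pca1, labels = row.names(data))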
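The effect of a varimax rotation can also be sketched with base R only, using stats::varimax() on the loadings from prcomp(). The example below uses the built-in USArrests dataset as a stand-in for the wine data, and it is a simplified illustration rather than a reproduction of what principal() does internally.

```r
# USArrests ships with R; it stands in for the wine data here
pca_demo <- prcomp(USArrests, scale. = TRUE)

# Loadings of the first two components (eigenvectors scaled by their sdev)
raw_load <- pca_demo$rotation[, 1:2] %*% diag(pca_demo$sdev[1:2])

rot <- varimax(raw_load)             # stats::varimax, no extra packages needed
rot_load <- unclass(rot$loadings)    # rotated loadings as a plain matrix

print(round(raw_load, 2))
print(round(rot_load, 2))
```

The rotation does not change how much of each variable the two components explain together; it only redistributes the loadings so that each variable tends to load mainly on one component, which is what makes the result easier to interpret.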