STAT 603: Homework 7

Due: Thursday, May 2nd.

Directions:

0. You may work in groups to discuss ideas, but the programming and writing must be your own work. Copying others’ work/code or allowing others to copy your work/code are both considered cheating and plagiarism, and will result in zero points for the whole homework and an F grade for STAT 603. Cheating in any coursework is considered a serious offense against academic integrity and University rules.

1. Submit a PDF copy of your homework, your R source code, and your label predictions on Canvas. Name the PDF file “myhomework.pdf”, the R source code “mycode.R”, and your predictions for the testing data “myprediction.txt” (see Q8). Only file types “pdf”, “R”, and “txt” will be accepted on Canvas. If any of these three files is missing online, we won’t grade your homework.

2. Submit a hardcopy of the PDF file “myhomework.pdf” in class. We won’t grade your homework without a hardcopy.

3. Show all your work! Both source code and key outputs from running your code are required. Simply giving a final answer or source code without appropriate explanation/key outputs will not receive any points.

4. Typing answers in RMarkdown or LaTeX is strongly recommended.

In this homework, we continue to work on the MNIST data sets. Recall from Q10 in HW6 that, using the training count data set, for a given digit $k$ ($k = 0, 1, \dots, 9$) we can get the sample points $x_1, x_2, \dots, x_n \in \mathbb{R}^d$ with true digit label $k$, where $x_i = (x_{i1}, \dots, x_{id})$. Then, for digit $k$, its MLE $\hat{p}_k$ with $d = 49$ can be obtained by the MLE formula given there.

Using the training count data set “mnist_train_counts.csv”, perform the following exercises Q1-Q3.

Q1
For digit $k = 5$, extract the sub-sample of the training data set that corresponds to the true digit label “5”. Print out the sample size of this sub-sample.

Q2
For digit $k = 5$, apply the MLE formula to the sub-sample extracted in Q1 to find the MLE $\hat{p}_k = \hat{p}_5$.
Print out your answer.

Q3
Repeat Q2 for each digit $k = 0, 1, \dots, 9$. For the grader to verify your answer, print out a $d \times 10$ matrix that contains all $\hat{p}_k$ for $k = 0, 1, \dots, 9$; that is, the $j$th column of this matrix is $\hat{p}_{j-1}$.

Next, we will use this “naive” probabilistic model to make predictions for the testing data set “mnist_test_counts.csv”.

Q4
To warm up, suppose we want to make a prediction for the 100th data point in the testing data set. Extract this data point’s count vector $x$. For the grader to check your results, print out $x$. In addition, find the sample proportions $\hat{\pi}_k$ ($k = 0, 1, \dots, 9$), which are from Q6 in HW6.

To make a prediction for the 100th data point with count vector $x$, we can use the Bayes rule:

$$\hat{y} = \arg\max_{k = 0, 1, \dots, 9} \hat{\pi}_k f_k(x \mid \hat{p}_k) = \arg\max_{k = 0, 1, \dots, 9} \{\log \hat{\pi}_k + g(x, \hat{p}_k)\},$$

where the function $g(x, p)$, given $x = (x_1, \dots, x_d)$ and $p = (p_1, \dots, p_d)$, is

$$g(x, p) = \log\Big(\prod_{j=1}^{d} p_j^{x_j}\Big) = \sum_{j=1}^{d} x_j \log p_j.$$

Q5
Write an R function named gfun(x, p), which returns the value of $g(x, p)$. For the grader to verify your answer, print out the output of gfun(x, p) using the 100th data point’s count vector $x$ and $\hat{p}_k$ for digit $k = 5$. Note: when implementing gfun(x, p), how would you handle the possible situation that $p_j = 0$?

Q6
We are now ready to make a prediction for the 100th data point. Use the function gfun(x, p) above to calculate $\log \hat{\pi}_k + g(x, \hat{p}_k)$ for all $k = 0, 1, \dots, 9$ and find your label prediction $\hat{y}$. For the grader to verify the results, print out all these outputs.

Q7
Now let’s look at the true label for the 100th data point. Print out the true label $y$ and $I(\hat{y} \neq y)$. Does your prediction give the correct label?

Q8
Repeat the process above to make predictions for all the data points in the testing data set. Calculate the misclassification error rate for the “naive” model by

$$\text{misclassification rate} = \frac{1}{N} \sum_{i=1}^{N} I(\hat{y}_i \neq y_i),$$

where $\hat{y}_i$ is the predicted label, $y_i$ is the true label, and $N$ is the sample size of the testing data set. In addition, save your label predictions as a “myprediction.txt” file, with the $i$th row representing your prediction $\hat{y}_i$.
Specifically, suppose yhat is the vector object that contains your predictions; you should use the following code to generate the file “myprediction.txt”:

    write.table(yhat, file = "myprediction.txt", row.names = FALSE, col.names = FALSE, sep = "")

Any other format of your prediction file will NOT be graded.
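
As a supplementary note on the Bayes rule used in Q4 to Q6, the step from maximizing $\hat{\pi}_k f_k(x \mid \hat{p}_k)$ to maximizing $\log \hat{\pi}_k + g(x, \hat{p}_k)$ can be sketched as follows. This sketch assumes that $f_k$ is the multinomial probability mass function, which the form of $g$ suggests but this handout does not state. The symbols $m$ (the total count of $x$), $\hat{p}_{kj}$ (the $j$th entry of $\hat{p}_k$), and $c(x)$ are introduced here for illustration only.

```latex
% Assumption: f_k is the multinomial pmf with m = x_1 + ... + x_d trials.
% m, \hat{p}_{kj}, and c(x) are notation introduced for this sketch.
\begin{align*}
f_k(x \mid \hat{p}_k)
  &= \frac{m!}{\prod_{j=1}^{d} x_j!} \prod_{j=1}^{d} \hat{p}_{kj}^{\,x_j}, \\
\log\{\hat{\pi}_k f_k(x \mid \hat{p}_k)\}
  &= \log \hat{\pi}_k + \sum_{j=1}^{d} x_j \log \hat{p}_{kj}
     + \log \frac{m!}{\prod_{j=1}^{d} x_j!} \\
  &= \log \hat{\pi}_k + g(x, \hat{p}_k) + c(x),
\end{align*}
% where c(x) = \log\{ m! / \prod_{j} x_j! \} does not depend on k.
```

Under this assumption, $c(x)$ is the same for every digit $k$ and the logarithm is strictly increasing, so both expressions are maximized by the same $k$. Working on the log scale also avoids multiplying many small probabilities together.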