STAC51 Data Analysis

Department of Computer and Mathematical Sciences
STAC51: Categorical Data Analysis
Winter 2021
Instructor: Sohee Kang
E-mail: [email protected]
Office: IC 483
Online Office Hours: Monday 5-6 pm and Wednesday 5-6 pm
Phone: (416) 208-4749
TA: Bo Chen, E-mail: [email protected]
TA: Lehang Zhong, E-mail: [email protected]
Course Description: In this course we discuss statistical models for categorical data. Topics
include contingency tables, generalized linear models, logistic regression, multinomial responses,
logit models for nominal responses, log-linear models for two-way, three-way, and
higher-dimensional tables, models for matched pairs, repeated categorical response data,
correlated and clustered responses, and statistical analysis using R. Students will be expected
to interpret R code and output on tests and the exam.
Prerequisite(s): STAB27H3 or STAB57H3 or MGEB12H3 or PSYC08H3
Credit Hours: 3
Required Text: An Introduction to Categorical Data Analysis, 3rd Edition
Author(s): Alan Agresti
WebLink for 2nd edition: https://search.library.utoron...
Sub-text1: Categorical Data with R, 3rd edition
Author: Alan Agresti
Sub-text2: Analysis of Categorical Data with R (2014)
Author(s): C. Bilder and T. Loughin
Course Objectives:
At the completion of this course, students will be able to:

  1. use R software to conduct categorical data analysis.
  2. identify designs of contingency tables and recommend appropriate measures of association
    and statistical tests.
  3. develop models for binary and polytomous categorical responses, interpret the results,
    and diagnose model fit.
  4. interpret and communicate categorical data methods to a technical audience.
    Grade Components:
    Case Study and Presentation 15%
    Assignments 15%
    Quizzes 15%
    Midterm Exam 20%
    Final Exam 30%
    Attendance 5%
    Course Policy:
    • Communication
    – Important announcements, lecture notes, additional material, and other course info will
    be posted on Quercus. Check it regularly. You are responsible for keeping up with
    announcements from instructors on Quercus and via e-mail.
    – Check “Piazza” before you send an e-mail to make sure that you are not asking for
    information that is already posted there. In general, I will not answer questions about
    the course material by e-mail. Such questions are better discussed during my office
    hours or those of the TAs.
    – E-mail is appropriate for private communication. Use your utoronto.ca account and
    include STAC51 in the subject line.
    • Oral Assessment
    If the instructor has a suspicion about your assessment result (the deviation is large),
    she will conduct a follow-up oral assessment. If the oral assessment confirms the
    suspicion, the score on the original assessment will be replaced with 0.
    • No makeup quizzes or exams will be given.
    Learning Components:
    • Tutorial
    Students are expected to attend the weekly tutorial to gain practical R programming experience.
    Quizzes will be conducted in tutorial. You must turn on your video so that the TAs can
    invigilate.
    • Assignments
    Three assignments (5% each) will be distributed. Assignments are done in groups of two
    unless you prefer to work individually.
    • Quiz
    Three quizzes (5% each) will take place after the assignments are handed in.
    • Case Study and Presentation
    Students will be required to work on a case study as a group and to submit a report.
    Groups may have at most FOUR members, and you can choose your group members. For the
    report, students will write R code, interpret R output, and use R Markdown (an R package).
    More details, such as the content and deadline, will be communicated later. Late reports
    will not be accepted. Each group will present its case study (5 minutes) on the last day
    of lectures.
    • Attendance
    Attendance is expected and will be taken at each class and tutorial.
    • Computing
    Statistical computing is a key part of the class. In-class analysis will be conducted
    in R, and all course material (code and data) is in R format. R is free and available for
    download at http://www.r-project.org, and you can find manuals and installation guidelines
    on this site.
    For the basics of R, here are suggested documents: R for Beginners by Emmanuel Paradis, An
    Introduction to R by W. N. Venables, D. M. Smith, and the R Core Team, and A (very) short
    introduction to R by Paul Torfs and Claudia Brauer. More information and documentation
    are available on The R Project website. Students are expected to write R code and interpret
    R output on assignments, tests, and the exam.
    Outline of Topics:
    Chapter  Content
    Ch. 1    • Introduction
             • Distributions for categorical data
             • Statistical inference for categorical data
    Ch. 2    • Describing contingency tables, independence of categorical variables
             • Comparing proportions, relative risk, odds ratio
    Ch. 2    • Inference for contingency tables, chi-squared tests of independence
             • Exact tests for small samples
    Ch. 3    • Introduction to generalized linear models: generalized linear models for
               binary data, Poisson loglinear models, negative binomial GLMs
    Ch. 4    • Logistic regression
    Ch. 5    • Building, checking, and applying logistic regression models
    Ch. 6    • Models for multinomial responses
    Ch. 7    • Loglinear models for two-way tables, loglinear models for three-way tables,
               inference for loglinear models
    Ch. 8    • Models for matched pairs
    University Policies
    • Academic Integrity:
    Academic integrity is essential to the pursuit of learning and scholarship in a university,
    and to ensuring that a degree from the University of Toronto is a strong signal of each
    student's individual academic achievement. As a result, the University treats cases of
    cheating and plagiarism very seriously. The University of Toronto's Code of Behaviour on
    Academic Matters (http://www.governingcouncil.u...) outlines the behaviours
    that constitute academic dishonesty and the processes for addressing academic offences.
    Potential offences include, but are not limited to:
    In papers and assignments:
    – Using someone else's ideas or words without appropriate acknowledgment.
    – Submitting your own work in more than one course without the permission of the instructor.
    – Making up sources or facts.
    – Obtaining or providing unauthorized assistance on any assignment.
    On tests and exams:
    – Using or possessing unauthorized aids.
    – Looking at someone else's answers during an exam or test.
    – Misrepresenting your identity.
    In academic work:
    – Falsifying institutional documents or grades.
    – Falsifying or altering any documentation required by the University, including (but not
    limited to) doctor's notes.
    All suspected cases of academic dishonesty will be investigated following procedures outlined
    in the Code of Behaviour on Academic Matters. If you have questions or concerns about what
    constitutes appropriate academic behaviour or appropriate research and citation methods, you
    are expected to seek out additional information on academic integrity from your instructor
    or from other institutional resources (see http://www.utoronto.ca/academ...).
    • Accessibility:
    Students with diverse learning styles and needs are welcome in this course. In particular,
    if you have a disability/health consideration that may require accommodations, please feel
    free to approach me and/or the AccessAbility Services Office as soon as possible. I will
    work with you and AccessAbility Services to ensure you can achieve your learning goals in
    this course. Enquiries are confidential. The UTSC AccessAbility Services staff (located in
    S302) are available by appointment to assess specific needs, provide referrals and arrange
    appropriate accommodations (416) 287-7560 or
