Course Details

Course Instructor: Nan Zhang (Office hours by appointment)

Course Website: https://nanzhangresearch.github.io/Data_Analysis_Tutorial

Date and Time: Fridays from 13:45 - 15:15 in C-108 Methodenlabor


Course Description

This tutorial course accompanies the lecture “Datenauswertung: Data Analysis for Political Scientists.” We will practice basic methods of data analysis using the statistical software package Stata.


Course Requirements

Attendance and participation: Class time will consist of a mixture of hands-on practice with Stata programming and group discussion and problem-solving. I’m also happy to answer any questions you have regarding the material covered in Sean’s lecture.

Active, in-class participation is central to your learning process. Of course, situations could arise where you need to miss class. As a courtesy, please let me know beforehand if you cannot attend a class session.

Optional homework: We will provide optional weekly homework exercises that let you practice each week’s concepts at your own pace. These are not graded (and you should not submit them), but they will help you reinforce what we learn in class and thereby prepare for the final exam. We will also provide homework solutions.

Required Assignments: There will be three assessed assignments over the course of the semester. Each assignment will be graded pass or fail, and you need to pass every assignment in order to pass the tutorial. There will be one opportunity to retake an assignment if the original assignment is not passed.

Assignment due dates are:

The assignments must be completed individually. When submitting your answers, you must make a written declaration that the work is wholly your own. Anyone found to have colluded with or plagiarised from others will fail.


Weekly Schedule

Session 1 (16 Feb): Organizational issues and introduction to Stata

Session 2 (23 Feb): Data manipulation; frequencies, measures of central tendency and dispersion

Session 3 (1 March): Graphs

Session 4 (8 March): T-tests, hypothesis testing, statistical significance

Session 5 (15 March): Measures of association for nominal variables

Session 6 (22 March): Measures of association for ordinal and interval variables


Easter Break


Session 7 (12 April): Bivariate Regression

Session 8 (19 April): Multivariate Regression

Session 9 (26 April): Regression assumptions I

Session 10 (3 May): Regression assumptions II


Break for Ascension / Christi Himmelfahrt

Nan will hold (virtual) office hours during regular class time if you have questions.


Session 11 (17 May): Regression with categorical independent variables

Session 12 (24 May): Logistic regression


Break for Corpus Christi / Fronleichnam

Nan will hold (virtual) office hours during regular class time if you have questions.


Reference and Help Materials

Stata Starter Manual. Useful if you are opening Stata for the first time.

glossary.do. A summary of commands we’ll cover over the semester that you can download and annotate for yourselves.

Statalist Forum: https://www.statalist.org/forums/help. Can be very useful for specific questions. You can either search for existing posts or add your own question if you can’t find an answer.

Stata Guide: https://wlm.userweb.mwn.de/Stata/. Short introductions to some frequently used commands.

Stata Cheatsheet: http://geocenter.github.io/StataTraining/portfolio/01_resource/. A glossary that summarises some popular commands and their syntax.

Official Stata YouTube channel: https://www.youtube.com/user/statacorp. Very helpful how-to guides on some simple as well as more complex procedures.

Anywhere Maths: https://www.youtube.com/channel/UCRkeyHV2bANRrFjesu_wdLQ. Mostly useful for a recap of general mathematical questions and expressions. The playlist Statistical Measures might be particularly useful for you.