I liked this class. The discussion among students, TAs, and the professor was useful and much more pertinent than anything I ever got in college or medical school. I made a ton of progress in my analytical and coding toolboxes. But many of the criticisms about the course mechanics, and especially the final exam, were accurate. I will probably get B's in both classes because I got nailed on part II of the final exam for each class.
How to beat the exams (this applies to both 6402 and 6414):
(1) Get the course transcripts, available the first week. Take notes and make corrections, add slides or equations to them, and manipulate the formulas to make sure you can apply them. 70% of the MC/TF questions come EXACT from the notes, so the EXACT wording actually makes a difference. I used EXACT three times now... I think you get the point. Understanding a concept in general is not enough. The other 30% comes from applying the formulas, graphics, and concepts to solve multiple choice problems (usually worth multiple points, and some may have multiple correct answers). Many of those questions were "plug and chug" IF you have the correct formula.
(2) The midterm was fine. The final was massive for the amount of time given. People who use R in their jobs said they could not finish.
The thing I noticed was that all the exam part II (Regression and TSA) questions were combinations of the homeworks or sample code given in the classes. The questions were fair. The problem was adapting the code fast enough to use on the test.
Get together in groups to save time. Go through all the R code examples in the class and homeworks. Turn them into "GENERIC" functions... SHARE WITH YOUR CLASS AND CHECK THAT THEY WORK. If you can just transform your data and feed it into your canned functions (you can copy and paste them from R or RStudio into your test document), you can get done with time to spare. You can even go so far as to put in some canned explanations with fill-in-the-blanks (remember MAD LIBS as a kid?) accompanying your code, so all you have to write is (IS/IS NOT CORRELATED), (IS/IS NOT MULTICOLLINEAR), (HYPOTHESIS TEST: p.value < / > ... indicates ...), etc.
If you build the functions well, you will have to alter less code to adapt them for the exam.
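To make the "canned function plus fill-in-the-blank explanation" idea concrete, here is a rough sketch in base R. The function name `canned_cor_test`, the 0.05 default cutoff, and the wording of the sentence are all my own placeholders, not anything from the course materials, so adapt them to whatever your class actually uses.

```r
# Hypothetical helper (name and wording are placeholders): run a correlation
# test and return the numbers together with a ready-made sentence.
canned_cor_test <- function(x, y, alpha = 0.05) {
  res <- cor.test(x, y)  # Pearson correlation test from base stats
  verdict <- if (res$p.value < alpha) "IS" else "IS NOT"
  list(
    estimate = unname(res$estimate),
    p.value  = res$p.value,
    text     = sprintf(
      "The correlation %s statistically significant at alpha = %.2f (r = %.3f, p-value = %.4g).",
      verdict, alpha, unname(res$estimate), res$p.value
    )
  )
}

# Usage with a built-in dataset: swap in your exam columns here.
out <- canned_cor_test(mtcars$mpg, mtcars$wt)
cat(out$text, "\n")
```

The point is that on exam day you only change the two arguments and paste the sentence into your answer.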
You can lose 30-60 minutes if you mismatch your data types or need to debug subscripts to ensure alignment. If you have to do that AND adapt your code, you will have a rough day.
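One way to guard against the type and alignment problems above is a small pre-flight check you run before any modeling. This is only a sketch in base R; `prep_data` and `check_aligned` are names I made up, and the toy data frame is invented for illustration.

```r
# Hypothetical helpers (names are placeholders, not from the course).
prep_data <- function(df, numeric_cols = character(), factor_cols = character()) {
  for (nm in numeric_cols) {
    # as.character() first, so a factor converts by its label, not its internal code
    df[[nm]] <- as.numeric(as.character(df[[nm]]))
  }
  for (nm in factor_cols) {
    df[[nm]] <- as.factor(df[[nm]])
  }
  df
}

check_aligned <- function(pred, actual) {
  # Fail loudly if predictions and actuals cannot be compared element by element
  stopifnot(length(pred) == length(actual), !anyNA(pred), !anyNA(actual))
  invisible(TRUE)
}

# Toy example: a numeric column that was read in as a factor
raw <- data.frame(x = factor(c("10", "20", "30")),
                  g = c("a", "b", "a"),
                  stringsAsFactors = FALSE)
clean <- prep_data(raw, numeric_cols = "x", factor_cols = "g")
str(clean)
check_aligned(clean$x, c(1, 2, 3))
```

Calling `as.numeric()` directly on a factor returns the level codes (1, 2, 3 here) rather than the values, which is exactly the kind of silent mismatch that eats exam time.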
None of these are hard and you will learn to do them, but you need to move fast. This is what I remember off the top of my head. Disclaimer: I do not have perfect recall, so forgive me if I make a mistake here.
1 Splitting a dataset into training/test
2 Performing data analysis and graphics, including boxplots, scatterplots, correlation, and correlation matrices
3 Models for linear regression and multiple linear regression, with and without log-transformed predictors and responses
4 Models for logistic regression, with repetitions for error analysis and without repetitions for classification error
5 Poisson model
6 Subset models (forward, backward, both)
7 All-subsets models (leaps library)
8 Hypothesis tests (normality, correlation)
9 Residual plots and analysis
10 Plotting the data and a fitted model on the same plot
11 Using your models to generate predictions
12 Calculating your prediction errors (MAE, MSPE, MAPE...) and RMSE
13 Performing elastic net, ridge, and LASSO regression, along with accompanying plots and prediction errors
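A few of the regression items above (train/test split, a log-transformed model, predictions, and prediction errors) can be chained into one reusable template. This is a sketch only, using base R and the built-in `mtcars` data; the 80/20 split, the chosen predictors, and the `pred_errors` helper name are my assumptions, and your course may define some of these error measures slightly differently.

```r
set.seed(1)                                   # reproducible split
n <- nrow(mtcars)
test_idx <- sample(n, size = round(0.2 * n))  # hold out roughly 20% for testing
train <- mtcars[-test_idx, ]
test  <- mtcars[test_idx, ]

# Multiple linear regression with a log-transformed response
fit  <- lm(log(mpg) ~ wt + hp, data = train)
pred <- exp(predict(fit, newdata = test))     # back-transform to the mpg scale

# Hypothetical helper: common prediction error measures in one call
pred_errors <- function(actual, pred) {
  err <- actual - pred
  c(MAE  = mean(abs(err)),
    MSPE = mean(err^2),
    RMSE = sqrt(mean(err^2)),
    MAPE = mean(abs(err) / abs(actual)))
}

round(pred_errors(test$mpg, pred), 3)
```

Once this runs on a practice dataset, adapting it is mostly a matter of swapping the data frame, the formula, and the response column.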
Time series
1 Taking a series of dates from factor or string to a DATE
2a Fitting a time series data set with linear, polynomial, quadratic, harmonic, and spline regression, and being able to predict a few points ahead
2 Plotting the ACF and PACF for the original data, differenced data, residuals, and squared residuals
3 Arranging a time series forward or backward in time
3a Plotting multiple time series aligned on the same graphic
4 Being able to use forecast to predict values and plot them on top of a time series (using the base package or ggplot2)
5 ARIMA, including the submodels AR, MA, and ARMA, and the EACF
6 GARCH
7 ARMA-GARCH, iteratively
8 Plotting squared residuals to check for autocorrelation
9 Calculating and interpreting hypothesis tests for significance, normality, correlation
10 Applying a vector autoregression model
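Here is a rough template covering several of the time series items above: date conversion, ACF/PACF plots, an ARIMA fit, residual checks, and a short forecast. It sticks to base R (`stats`) and the built-in `AirPassengers` series so it runs anywhere; the date strings, the date format, and the ARIMA order (1, 1, 1) are placeholders I picked for illustration, not a tuned model. GARCH, EACF, and VAR need add-on packages and are not shown.

```r
# Dates that arrive as a factor (or string) -> Date
date_chr <- factor(c("01/15/2020", "02/15/2020", "03/15/2020"))
dates <- as.Date(as.character(date_chr), format = "%m/%d/%Y")

y  <- AirPassengers                 # built-in monthly series
dy <- diff(log(y))                  # log, then first difference

# ACF and PACF side by side for the differenced series
op <- par(mfrow = c(1, 2))
acf(dy, main = "ACF: differenced log series")
pacf(dy, main = "PACF: differenced log series")
par(op)

# ARIMA fit on the log scale (illustrative order only)
fit <- arima(log(y), order = c(1, 1, 1))
fc  <- predict(fit, n.ahead = 6)    # list with $pred and $se, on the log scale

# Ljung-Box tests: residuals (serial correlation) and squared residuals
lb_resid <- Box.test(residuals(fit), lag = 12, type = "Ljung-Box", fitdf = 2)
lb_sq    <- Box.test(residuals(fit)^2, lag = 12, type = "Ljung-Box")

# Observed series with the back-transformed forecast on the same plot
ts.plot(y, exp(fc$pred), lty = c(1, 2))
```

Keeping each step as its own labeled chunk makes it easy to paste only the pieces a given question asks for.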
I recommend both classes.