Note: this review is actually for Spring 2024.
This is an OK class. For background: I've been in the program since 2022, and I took this alongside two other classes while taking a semester off from work to do the program full time. I'm on the ML track, so I had already done ML (7641) and was taking DL concurrently with this.
LECTURES:
The lectures by Prof. Riedl are great. The only criticism is that they run a bit long. That said, you will always leave a lecture understanding the concept being taught. I credit this class for my actually understanding LSTMs, Transformers, and some other DL topics.
There are also some Meta AI lectures. These are definitely better than the ones in DL, but not great; if Mark's lectures are a 5, these are a 3. The info is good for the most part and they mostly get the point across.
Most of the lecture content in this course is focused on classical NLP: bag-of-words (BOW) models, machine reading, embeddings, information retrieval, and more. This is a survey class, not a class about LLMs; you really only spend a bit of time on them. If you want to learn about LLMs, take DL and then watch some of Andrej Karpathy's great YouTube videos.
QUIZZES:
There is a quiz at the end of each section. These are mostly trivial, though they can be a little challenging because the sections can be very long (2+ hours) and somewhat detail oriented. You get two tries on each and they are open book, so I can't complain here.
HOMEWORK:
Oh boy... This is a very big disappointment. Coming out of ML and having seen what DL does with homework, I was left scratching my head. The assignments are all very, very easy Jupyter notebooks. Most of your time is spent debugging the testing framework you are given or figuring out what you are actually required to do. None of them are particularly hard, but I found them very annoying.
My bigger frustration is that they do a poor job of reinforcing the lectures. In DL you have to implement a transformer from PyTorch primitives; the furthest you go here in terms of DL is implementing an LSTM. We never use the built-in PyTorch transformer module at any point, and I'm not sure why an NLP class doesn't have you work with one of the most important NLP mechanisms out there right now.
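For reference, here's roughly what touching the built-in module looks like. This is my own toy sketch with made-up hyperparameters, not anything from the course:

    import torch
    import torch.nn as nn

    # Toy encoder; d_model, nhead, and num_layers are arbitrary here.
    layer = nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True)
    encoder = nn.TransformerEncoder(layer, num_layers=2)

    x = torch.randn(8, 32, 512)  # (batch, seq_len, d_model)
    out = encoder(x)             # contextualized embeddings, same shape
    print(out.shape)             # torch.Size([8, 32, 512])

A few lines like that, plus a homework question about what the output actually represents, would have gone a long way.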
As is, the homework only reinforces the lecture material if you work through each notebook while watching the corresponding lectures; they almost need to be thought of as companions to the lectures. If that was the goal when creating them, that's reasonable.
PROJECT:
The project was to implement an attention mechanism. I think the project would have been a great opportunity to either use the PyTorch transformer or build one from scratch; the lecture walkthrough is good enough for that. Instead we implement a mechanism from 2016 that doesn't really have a purpose in this day and age. The idea is that we get experience working with attention mechanisms, but attention is really a small part of it.
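For context, the core of any attention mechanism is just a softmax-weighted sum over values. A generic scaled dot-product sketch of my own (not the project's 2016 mechanism):

    import torch
    import torch.nn.functional as F

    def attention(q, k, v):
        # q: (batch, tq, d); k, v: (batch, tk, d)
        scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)  # (batch, tq, tk)
        weights = F.softmax(scores, dim=-1)  # each query attends over all keys
        return weights @ v                   # weighted sum of values

    q = torch.randn(2, 5, 64)
    kv = torch.randn(2, 7, 64)
    print(attention(q, kv, kv).shape)  # torch.Size([2, 5, 64])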
I will say it is good that they force you to write your training and evaluation loops from scratch, which aligns with the expectation that the earlier notebooks have prepared you for the project. I'm just not convinced the project does a good job of reinforcing NLP, since it's so hard to get good results out of this sort of network. I'd give the project a 2/5 in its current state.
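To give a sense of what "from scratch" means here: it's the standard PyTorch loop, nothing exotic. A minimal sketch with toy stand-ins, not the actual project code:

    import torch
    import torch.nn as nn

    # Toy stand-ins; the real project supplies its own model and data.
    model = nn.Linear(16, 4)
    batches = [(torch.randn(8, 16), torch.randint(0, 4, (8,))) for _ in range(20)]

    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    criterion = nn.CrossEntropyLoss()

    for epoch in range(3):
        model.train()
        for inputs, labels in batches:
            optimizer.zero_grad()                    # clear old gradients
            loss = criterion(model(inputs), labels)  # forward pass + loss
            loss.backward()                          # backprop
            optimizer.step()                         # weight update

    model.eval()  # the eval loop is the same, minus gradient tracking
    with torch.no_grad():
        preds = model(batches[0][0]).argmax(dim=-1)

If you've done DL you've written this a dozen times, but it's a fair thing to require.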
EXAMS:
The midterm was childish. It felt like something out of high school: you are given a set of questions to answer, in a Word-doc-like format, based on a paper you have read. The problem is that it really doesn't let you flex your muscles if you are used to writing quality papers the way you would in ML4T, ML, or even DL to an extent. They also expected you to identify very specific things, even though you are ostensibly just writing a summary. This was the low point of the class for me. 1/5.
The final was open everything, short response. They ask potentially very deep systems-design questions but then expect 1-2 paragraph answers. Given how specific the midterm grading was, I found this impossible. Weirder still, half the questions seemed to focus on LLMs, which was strange since we spent very little time on them in the class. When we asked the TAs whether we should incorporate material from outside the course, we were told it shouldn't be needed. I'm not sure how that works, since some questions had virtually no coverage in the course.
That said, none of the questions were hard. I just don't understand why we were asked them given everything else we learned in the class.
SUMMARY:
So who is this class for? I think it's a good course if you are on the CS track and want to try something ML-related that isn't brutal like DL and isn't ML4T. The course does a great walkthrough of NNs and can take you from nothing to writing PyTorch code. That's pretty cool.
If you already know NNs, or have taken DL or something like it, then some of the non-NN NLP material is interesting. I just don't think the treatment is rigorous enough to really learn a lot: I now know a bunch of concepts, but I don't feel I can strongly apply them. The NN side gets completely covered in the fourth section of DL; you will learn nothing new here about NNs if you have taken it. Again, you never use a transformer in this class.
The TAs are trying their best. I think this course could be great if the lectures form the base and the students then go out and extend them. The lectures could be tightened up and could assume a bit more prior knowledge; I could have skipped the whole module on Bayes, for example, having seen it in AI4R. Take this from a 10 hr/week course to a 15 hr/week one, clean up the notebooks, make the requirements clearer (why is GT allergic to type hints? see the stub below), and give us a more modern project. Do that and I think you have a winner.
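Concretely, even a hinted stub like this (hypothetical, not from the actual assignments) would remove most of the guesswork about what goes in and what comes out:

    # Hypothetical notebook stub with type hints: you know the shapes of
    # the inputs and outputs without reverse-engineering the test suite.
    def tokenize(text: str, vocab: dict[str, int], max_len: int = 128) -> list[int]:
        ids = [vocab.get(tok, 0) for tok in text.lower().split()]
        return ids[:max_len]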
I got a mid A, for what it's worth. Many weeks I completely forgot about this course.