Computational Data Analysis: Learning, Mining, and Computation

4.10 / 5 rating | 3.63 / 5 difficulty | 14.22 hrs / week

Quick Facts and Resources

Name: Computational Data Analysis: Learning, Mining, and Computation
Listed As: ISYE-6740
Credit Hours: 3
Available to: AN students
Description: Theoretical/computational foundations of analyzing large/complex modern datasets, including the fundamental concepts of machine learning and data mining needed for both research and practice.
Syllabus: Syllabus not found.
Textbooks: No textbooks found.
  • oI/ZNriAtlOWVqKKTb3FUw== | 2024-01-10T20:45:09Z | fall 2023

    This was a great course. It covers a very strong range of algorithms, and you get to code each one (sometimes from scratch), with solid math proofs and demonstrations behind each topic. Don't stress about the math: it's nothing too crazy, just abstract derivatives (chain rule, log properties) and Lagrange multipliers, and the professor always goes over the math very well for each algorithm. While the course certainly covered these topics in depth, they weren't the focus (about 20% of the points in HWs). The lectures were very well balanced between theory, math, and practice. Unlike other courses where you never have to watch the lecture videos, here it's an absolute must; they will help you immensely with the homework assignments (6 total). You should also definitely use the starter code skeletons for each topic and homework assignment.

    You learn about a bunch of machine learning and stats algorithms, from an overview of what they are, what they do, and what they're used for, to their objective functions, what they optimize, and finally how to use them in Python or MATLAB. The course deals a lot with images, so much of the time you aren't working with traditional dataframes; their favorite dataset is a series of Yale face pictures reduced in dimension for modelling. I had never worked with images, so learning how to get rows and columns from a picture was tricky, but it's not too bad if you have worked with Python before.

    This was my third course, and I took it simultaneously with DataViz and Computing for Data Analytics. Definitely a mistake: you should definitely take CfDA before this class and NEVER pair it with DataViz. I got an 88 in this class, which is a flat B, but I can genuinely say I learned a ton and it was worth the suffering.
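
    For readers who have not handled image data before, the "rows and columns from a picture" step mentioned above might look roughly like the sketch below. It is only an illustration under stated assumptions: the images are synthetic random arrays standing in for grayscale face pictures, and the sizes and variable names are hypothetical, not taken from the course materials.

```python
import numpy as np

# Hypothetical stand-in for a folder of grayscale face images:
# 5 synthetic "pictures", each 12 pixels tall and 10 pixels wide.
rng = np.random.default_rng(0)
images = [rng.integers(0, 256, size=(12, 10)).astype(float) for _ in range(5)]

height, width = images[0].shape

# Flatten each picture into one row of length height * width,
# then stack the rows: one row per image, one column per pixel.
X = np.stack([img.reshape(-1) for img in images])

# Center each pixel column, a common first step before methods such as PCA.
X_centered = X - X.mean(axis=0)

# Any row can be reshaped back into a picture for display.
first_image_again = X[0].reshape(height, width)
```

    Here `X` has one row per picture and one column per pixel, which is the tabular shape most library routines expect.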

    Rating: 5 / 5 | Difficulty: 3 / 5 | Workload: 10 hours / week

  • +jTAa/Y57/V1BqtRCslpQA== | 2024-01-09T06:03:38Z | fall 2023

    Great course! I think it's a must! Great intro to various ML applications. I combined it with MGT 8803, which limited the time I could commit. I wish I had had more bandwidth to give to this course. It was a tough course for me because of that lack of bandwidth, but overall a great course. Make sure to find a good teammate for the project.

    Rating: 5 / 5 | Difficulty: 5 / 5 | Workload: 14 hours / week

  • kIKa6VFkh7VonzhyIGkk5w== | 2023-12-13T22:00:52Z | fall 2023

    My background

    Took this course immediately after IAM and iCDA. Self-studied linear algebra. Minored in CS for undergrad, graduated five years ago, and work as a data scientist. Very comfortable in Python, totally new to LaTeX.

    Highlights:

    • Homeworks do a good job of forcing you to learn a ton of novel concepts in machine learning. In a perfect world, students would code every ML model from scratch without scikit-learn, but I think the right trade-off between breadth and depth is made here to explore more models later in the course.

    • I spent 11 hours/week on the class, with the exception of a lull in the middle. A mid semester break coincides with the easier middle assignments which is a nice respite.

    • The project does a great job of letting students take the training wheels off. You learn a ton of models, pick a new dataset, and have to navigate the ambiguity of picking the models that make the most sense for your research question.

    • TAs did a great job leading forum discussion and answering questions. Alfie was a remarkable head TA.

    Areas of Improvement:

    • The total lack of grading on code quality isn't setting students up for success professionally. I would set up an auto-grader for the coding portions like in iCDA and keep the LaTeX reports for visualization and discussion.

    • After iCDA and IAM, the two hour multi-topic lectures feel indigestible. The actual content is good and concepts are well explained, but scrubbing through the video to find the five minute excerpt relevant to what you're currently working on is so cumbersome.

    • I know the purpose of the mathematical proofs/derivations is to build an intuition for how and why the algorithms work, but I was so underwater on calculus that I didn't learn a thing. I'd replace these with more exercises that force the student to perform the computation of an algorithm on a toy dataset by hand. Doing this for AdaBoost was a highlight of the last homework.

    • Institute peer reviews to speed up the grading cycle. By the time I got my 79/100 grade back for my first assignment, I had already submitted the second and was well into the third. Makes it hard to course correct! (Note I still managed a low A in this course, if you put in the effort, you will be fine). Also I learned a lot doing peer reviews for the final project, it's also a learning opportunity to see how others approach the assignments.

    The resources that helped me the most (I have no relationship to any of these resources, my opinions are my own, wasn't paid to endorse):

    • Mike Cohen's Linear Algebra: Theory, Intuition, Code does a marvelous job explaining everything up through Eigendecomposition, which in my opinion is 95% of all the LA you need for this course.

    • Every week that went by, I watched the professor's lectures less and less and Josh Starmer's StatQuest YouTube channel more and more.

    • Overleaf (student plan $9/month) probably saved me two to three hours of LaTeX formatting time per project.

    General Recommendations

    • The quality of your code is totally irrelevant to your grade on each homework. TAs are opening your PDF report and going off its contents alone. Any hour you can spend polishing the report, notation, visuals etc. is going to have a positive impact on your grade where a code refactor will not.

    • I wish I had known how much multivariable calculus was necessary to complete the 1-2 proofs in each assignment. If you're like me and taking this hot off the OMSA intro courses, you may be in the same boat. IAM and iCDA both provide a gentle introduction to Linear Algebra but little to nothing on multivariable calc, so you may want to brush up.

    Rating: 4 / 5 | Difficulty: 4 / 5 | Workload: 11 hours / week

  • mprzBRYbbr58cGeRJhueGw== | 2023-04-28T15:23:04Z | spring 2023

    Overview: I am in my final semester (other than the practicum) and this has been one of my two favorite classes in OMSA (along with Data Mining, IE-7406). I would recommend this to EVERY OMSA student. This class is the reason you want to go into Data Analytics and Data Science.

    My Background: So everyone knows where I am coming from.... I am only a marginal Python programmer and weak in probability/statistics, but I am very strong in calculus, linear algebra, and analytic modelling.

    Course Layout: 6 Homeworks - generally due every 2 weeks, and 1 Project due at the end of the semester. NO Exams!

    Personal Notes: I was originally intimidated by this class because it seemed to be a high-level coding class where you re-create already developed models from scratch, and solving complex coding problems is distinctly not my strength. While building models from scratch is generally what you do, it turns out that most of the code you need is provided in the sample code examples. Furthermore, after the first 2-3 homeworks, you are generally using packages rather than reinventing the wheel. In fact, by the second half of the semester, I was only having to work on this class about every other week because I would be able to finish the lectures and the homework the first week and wouldn't have anything to do until the next set of lectures/homework was released at the end of the second week. Overall, this class has been much easier than I originally expected/feared and has turned out to be one of my favorite classes in the entire program.

    Challenges/Observations:

    1. The grading is very lenient, so don't avoid this class because you don't think you can pass it. It is almost impossible to get less than a B if you put a good faith effort in doing the homework and you turn all the assignments in. Right now, I think the class average is about a 96.
    2. The theoretical questions can be tough. If you don't have a strong math background they will take a good bit of time. I found that the statistics based questions generally took me several hours to research and answer (Google is a Godsend!). For me, the proofs based on derivatives and integrals were not very difficult, however.
    3. If you have experience using analytical models, the modeling questions are straightforward. If you don't have a strong background in analytic modelling, I would recommend taking the Data Mining class first. While most of the problems there are based in R rather than Python, that course is really an applied analytic modelling course. It will give you a GREAT background in what each type of model is doing and how it works.
    4. Get comfortable with linear algebra! The first half of this course is converting linear algebra equations to Python code. If you don't understand the linear algebra, it makes everything else that much more difficult. Even though I have a good background in linear algebra, I did brush up on it before starting the class, and that review helped immensely.
    5. Pick a project early! I built on a project I started for IE 7406. That way, even though I had to convert the code from R to Python, I already had the foundation laid and could work on some more sophisticated aspects of the problem I could not get to in that class. If you don't have something already handy, don't wait until the end of the class to pick a project. This is also where having taken IE 7406 would help. You learn MANY more modelling techniques in that course, so you will have a larger toolbox to choose from for your project.
    6. Do not plagiarize/cheat! The professor is very serious about this. If you need to use some small sections of code you find online to help finish your homework, that is fine, but provide attribution in your script!!! Some people have gotten in big trouble this term for plagiarizing code.
    7. There are tons of office hours - typically several per day. Take advantage of them. Even if you can't attend them in person, they are recorded so that you can listen to them later. That is typically what I did.
    8. Read the ED discussions regularly. If you are stuck on a homework, it is a given that a bunch of other people are stuck at the same place. There were always helpful tips and hints in ED. A lot of times people would post their graphical outputs as part of their questions, and comparing your output to these would give a great indicator of whether your code was correct or not.

    Best of luck and enjoy this course!

    Rating: 5 / 5 | Difficulty: 3 / 5 | Workload: 14 hours / week

  • eS/hfZfXrmyKAfuOqr5qng== | 2023-04-26T17:21:22Z | spring 2023

    The professor's videos have low sound quality, and her thoughts are often all over the place. A lot of the actual learning and understanding came from the TAs through Ed Discussion or the many office hours that you can attend and ask questions.

    Rating: 4 / 5 | Difficulty: 4 / 5 | Workload: 15 hours / week

  • YcRCrpL/0MAbjrBk0BQjiA== | 2023-01-18T18:40:50Z | fall 2022

    Challenging, but interesting and engaging. Homework assignments were tough and took a lot of effort, but in many cases were very satisfying because they functioned more like labs: you get to see your work function in some real way - as an image compression algorithm, face recognition tool, etc. and don't have to wonder if you did it right.

    Two minor complaints - one is that the course was too heavy on mathematical theory for the name "Computational Data Analysis." This is a professionally-oriented degree where the academic derivations aren't going to be as valuable to 90% of students as the practical ramifications.

    The second is that some of the homework problems should be explained better. For example, you might be provided parameters for generating training data as well as for developing a model, but it's unclear what information should be treated as unknown for which step. So sometimes the already challenging assignments begin with an hour or two of head scratching and deciphering cryptic hints from TAs just to figure out what you're supposed to be doing.

    Rating: 4 / 5 | Difficulty: 3 / 5 | Workload: 8 hours / week

  • Georgia Tech Student | 2022-05-03T20:11:17Z | spring 2022

    Superb!

    Rating: 5 / 5 | Difficulty: 3 / 5 | Workload: 12 hours / week

  • Georgia Tech Student | 2022-02-17T03:29:15Z | fall 2021

    This is a must for OMSA folks. Thoroughly enjoyed the content, homeworks, and project. The teaching team were fantastic. No background in maths nor stats but learnt a lot and managed to survive with an A.

    Rating: 5 / 5 | Difficulty: 3 / 5 | Workload: 15 hours / week

  • Georgia Tech Student | 2021-12-21T14:31:23Z | fall 2021

    Overall I thought this class was a good challenge. For practically every assignment, there was at least one part where I felt completely incapable of completing that portion of the assignment upon first glance, but after examining the demo code and participating in Piazza, it was feasible.

    I took this class straight off the core requirements (6501, 6040, 8803), and I would say I wish I had a bit more of a math background before getting in. I have taken up to calc II, linear algebra, and a probability / stat course (though that one was ~5 years ago), which I thought would be enough to learn key points on the fly. I ultimately was able to get an A, but I straight-up didn't answer questions worth 15 and 35 points (out of 100) on two homeworks despite hours of trying to understand the math. The math proof questions I was able to answer I would admit I didn't understand in great detail. I think the main gap was a lack of strong probability / stat understanding, and an extremely weak grasp of multivariate calc (which I'm taking over winter break for future classes). That said, understanding linear algebra did make the class a lot easier, especially early sections where you're coding based on linear algebra equations in lectures and demo code. Python programming ability is also crucial (my main experience was 6040 where I got a comfortable A).

    Due to my specific context the main things I've taken away are:

    • Being comfortable using LaTeX to write math expressions.
    • Being comfortable using sklearn ML packages, with a better understanding of specific parameters to tune these.
    • Being less intimidated by math notation when trying to understand how ML algorithms work.
    • Implementing ML algorithms from scratch, when relevant, based on math notation in online resources (I had to do this for my project).
    • Better understanding of some algorithms we cover in more detail in this class when compared to 6501, like logistic regression, SVM, etc.

    Things I didn't take away that I expected to:

    • A deep understanding of how many ML algorithms work behind the scenes (again, due to my own math abilities coming in). I do feel more comfortable in understanding these algorithms at a high level and doing deeper research if needed, though.

    Ultimately, I think I'm going to go in a less theoretical machine-learning direction with the program after taking this class (I was originally hoping to take HDDA and DL; now I'm looking at BD4H and DBS). That said, I'm glad to have had this experience, and I do think this class is a great option even if you're not super interested in going deep down a machine-learning track in OMSA.

    Rating: 4 / 5 | Difficulty: 4 / 5 | Workload: 15 hours / week

  • Georgia Tech Student | 2021-12-20T00:01:23Z | summer 2020

    I liked this course a lot!

    The focus on 'from scratch' machine learning was really cool (and refreshing, after 6501/6040), and I thought the TAs were very responsive and helpful. There's so many office hour sessions, so I always felt like I could talk to someone in person, which helped for some tricky assignments (looking at you EM algorithm).

    The grading did feel a bit lenient though. If I had a suggestion, it would be to make that a bit stricter...would make it a bit easier to tell where I actually got things wrong, and where my grader was being nice.

    All in all though, this was a great ML 101 type course, and I'd recommend it to any aspiring data scientist.

    Rating: 5 / 5 | Difficulty: 3 / 5 | Workload: 10 hours / week

  • Georgia Tech Student | 2021-12-18T02:44:46Z | fall 2021

    This is one of the best courses in this program. I believe it's a great extension to the 6501 course.

    Value added from this course:
    a) Helped me understand the 'WHY' behind each algorithm.
    b) Made me think about selection of algorithm based on the structure of the data.
    c) Made me confident about tricky aspects like hyper-parameter tuning, feature selection, etc.
    d) Made me comfortable with the mathematical concepts, which makes understanding papers easy.

    Math Prereq: I took simulation in the prior semester, so the required calculus and probability concepts were covered. In addition, I just spent 10 hrs on multivariate calculus and one weekend on linear algebra concepts.

    Class experience: The TAs are really good and active. Prof X also holds OH once every week to resolve any doubts. Assignments are designed to help students assimilate the concepts from the lectures. Also, I found the lectures interesting (contrary to other reviews here).

    This is a fantastic course and helped significantly improve my ML fundamentals. Without this course, the OMSA experience would remain incomplete in my view.

    Rating: 5 / 5 | Difficulty: 4 / 5 | Workload: 18 hours / week

  • Georgia Tech Student | 2021-12-07T17:09:57Z | fall 2021

    Having taken both DMSL and CDA, I could see why people prefer DMSL over CDA.

    However, let me offer you a different perspective.

    DMSL is about practical aspects. However, after taking DMSL, would you feel confident explaining what a support vector machine is? I can't. DMSL is literally a paper-pushing course: you apply the methods and you leave. I didn't retain much information from it, even though it was a well-taught course.

    CDA, however, takes you on a deep dive. It's tough, yes. But it equips you with the confidence and the ability to DEFEND your models in front of auditors and management and argue why your models work.

    That's the value proposition of this class, and it explains why my pay went up after I took this class and was promoted into a more specialist role.

    I will be taking HDDA and DL next to complete my ML theory-driven knowledge.

    Rating: 5 / 5 | Difficulty: 2 / 5 | Workload: 16 hours / week

  • Georgia Tech Student | 2021-11-12T00:23:52Z | fall 2021

    I absolutely loved this class. Homework assignments require a significant amount of time but that's where I always learned the most. Following along in Piazza is a must, but other students always had the same questions and the TAs and prof. were extremely responsive, helpful, and willing to take the time to explain every single question asked. Lectures are helpful too, but it's not required that you watch every minute of them to complete homework assignments. No exams. Overall this was an incredible class that truly expanded my knowledge of ML. Being solid in Python was a huge help.

    Rating: 5 / 5 | Difficulty: 4 / 5 | Workload: 20 hours / week

  • Georgia Tech Student | 2021-09-19T19:14:37Z | fall 2021

    DO NOT TAKE THIS CLASS IF YOU ARE NOT EXTREMELY COMFORTABLE WITH IMPLEMENTING LINEAR ALGEBRA IN CODE

    The material in this class is very interesting and provides a ton of insight into how machine learning algorithms work and the math behind them. However there are some issues with the structure of the class, which get in the way of learning. Homeworks are due every two weeks and the first few have taken me 20-ish hours to complete each. The instructors assume you have a very solid grasp of linear algebra and implementing it in code, and there is little guidance provided in actually implementing the algorithms in each homework. General Python knowledge is also very important. There have been multiple occasions where I've spent 1-2+ hours just loading and transforming the data before beginning the actual assignment. Unlike the notebook homework structure of CSE-6040, only raw files are provided to you and you have to figure out yourself how to get it in the appropriate format.

    Issues I have with the class are as follows:

    Terrible lecture structure: Every other class I've taken in OMSA has had lectures split up into several smaller pieces, which makes it easy to make sure you understand each concept before moving on, and is very helpful when you need to go back and review a certain topic. In this class, lectures are 45-75 minutes long and are essentially the professor sharing her screen and narrating over the PowerPoint. It's difficult to follow and nothing is indexed, so if you want to review something you'll need to scrub through a video to locate it.

    Contradictory information: Multiple times the information provided by TAs on Piazza contradicted what was provided in the lecture. For example, the ISOMAP lecture says to create an adjacency matrix of the L2 norms, whereas the Piazza TA answer is that the adjacency matrix should be 0s and 1s. It's very frustrating when topics that take a significant amount of effort to understand have contradictory information provided by professors and TAs.
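
    To make the two readings concrete, here is a small sketch of both adjacency matrices built from the same toy points. It is an illustration only: the points, the `epsilon` radius, and all variable names are assumptions, and it does not claim to show which version the course intended.

```python
import numpy as np

# Four hypothetical 2-D points on a line; real inputs would be flattened images.
points = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [5.0, 0.0]])

# Pairwise Euclidean (L2) distances via broadcasting.
diffs = points[:, None, :] - points[None, :, :]
dist = np.sqrt((diffs ** 2).sum(axis=-1))

epsilon = 1.5  # assumed neighborhood radius, chosen only for this toy data

# Neighbors are distinct points closer than epsilon.
is_neighbor = (dist < epsilon) & ~np.eye(len(points), dtype=bool)

# Variant 1: weighted adjacency, L2 distance on edges and 0 elsewhere.
weighted_adj = np.where(is_neighbor, dist, 0.0)

# Variant 2: binary adjacency, 1 on edges and 0 elsewhere.
binary_adj = is_neighbor.astype(int)
```

    Both matrices share the same pattern of nonzero entries. The binary one records only which points are connected, while the weighted one also keeps the edge lengths that a later shortest-path step would sum.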

    While the class overall is very interesting, the structure could use some serious improvement.

    Rating: 2 / 5 | Difficulty: 5 / 5 | Workload: 20 hours / week

  • Georgia Tech Student | 2021-08-07T20:33:57Z | summer 2021

    I took the course during the summer after taking ISYE7406 DMSL. I can see the pros and cons between these two courses clearly.

    In terms of practicality, DMSL is a big winner. CDA focuses a lot more on the detailed implementation of specific algorithms, but not on how to solve real-world problems systematically.

    For the assignments, CDA is more coding-oriented. It is extremely difficult to read the HW instructions and implement the solution without closely following Piazza; a lot of important details need to be clarified. DMSL assignments are more open-ended, which gives you more flexibility, but your grade is still subject to the quality of peer reviews. That doesn't mean TA grading in CDA is better than peer reviewing in DMSL. Peers in DMSL come from diverse backgrounds: some have a lot of practical experience; some do not.

    Project. DMSL has prepared the assignments as building blocks for you to successfully create the project. CDA is just a project without much guidance unless you spend time attending office hours discussing with the TAs. DMSL is a clear winner in terms of experience. CDA is still pretty much focused on the technicality of the project.

    In summary, if you want to learn how the libraries really work inside out, CDA is your best bet. If you want to learn a systematic way of solving problems, DMSL will clearly give you one.

    Rating: 2 / 5 | Difficulty: 3 / 5 | Workload: 10 hours / week

  • Georgia Tech Student | 2021-07-15T17:03:58Z | summer 2021

    Good things:

    • interesting assignments
    • a lot of TA sessions to ask questions regarding a homework task (the answers, though, sometimes contradict each other and common sense)
    • professor has weekly OHs and it feels like a real class. Professor is very nice and engaging
    • generous grading (hell yeah! It feels very rewarding to get good grades for hard work)

    Bad things:

    • lectures: a lot of extra words, long-long bla-bla-bla in bad English, boring and sad
    • math derivations

    Things to improve: maybe write down short-sweet scripts for the lectures (à la prof. Sokol style) and read them instead of improvising. Would be worth doing considering there are about 400 ppl in the summer class. 400 ppl watching the same bad stuff... It is kind of disrespectful to provide such bad quality content, IMHO.

    P.S. I took the class in the summer; I wouldn't recommend doing so, though. There are only 5 hws (with the 5th hw being very hard and time consuming) vs 7 in the full semester. You would benefit from doing more hws. No material is left out for the summer; instead it is squeezed to fit the shorter schedule.

    Rating: 3 / 5 | Difficulty: 4 / 5 | Workload: 25 hours / week

  • Georgia Tech Student | 2021-05-10T17:31:58Z | spring 2021

    This course can be viewed as an advanced statistical learning course where you learn a bunch of new and advanced machine learning methods, in addition to taking an in-depth dive into other methods you are already familiar with from previous courses. It balances theory and practice well, and helps you understand the math behind the algorithms while at the same time giving many interesting applications that help you understand how these will be implemented in the real world. The assignments are excellent and are the best aspect of this course. While they are challenging and time consuming, the instructions are clear with well-defined goals and there is hardly any ambiguity or error, so you will spend your time solving the problems rather than trying to understand what the question requires, which is what many other courses make you go through (such as DVA). The professor is very engaging, holds weekly office hours, posts walk-through videos for each assignment, and is more active on Piazza than any other professor in OMSA (in my experience). The TAs are very responsive and professional.

    So, with all the positives above, why did I give it 4/5? Because the quality of the lectures is, in my opinion, far inferior to the other great aspects of this course. For most topics, I relied on external resources rather than the videos. I think they would be much better if they followed the short-video style of other courses (for example, simulation) and started with some background on the subject rather than diving straight into the math. I am not saying to abandon the math, as I found it very interesting and helpful for understanding, but it is better to start with some intuition so that the math makes sense. Another extremely disappointing thing is that the final project, which is worth 40% of the entire course grade, was graded without any feedback: you simply get your grade on the last day of class with no explanation at all, just the grade!!!

    All in all, one of the best courses in OMSA, and is recommended to anyone who aims at a career in ML. However, be aware that some assignments rely heavily on linear algebra and statistics, so expect to put extra time if your background is not strong in these.

    Rating: 4 / 5 | Difficulty: 4 / 5 | Workload: 15 hours / week

  • Georgia Tech Student | 2021-05-07T18:19:32Z | spring 2021

    I had a very hard time in this course, but also had some fun as well. I'm rating this neutral because there are some really neat things in this course but also some really poor execution.

    What you can expect: Professor Xie is energetic and clearly passionate about the topic. Her enthusiasm really helped when I was struggling with the material. I found her explanations somewhat helpful, but she explains things at a level that you will either understand or not.

    The class is entirely homework based plus 1 large project. The grading seems to vary with homework worth 60-80% and the remainder assigned to the project. Spring 2021 started at 80-20, but was ultimately adjusted to 60-40.

    I found the homework to be interesting, but weighted too heavily on the theory, to the point of reducing the value of the assignment. I learned a few things, but very little was presented in any real-world application. The project was open, which filled part of that gap.

    The class would be much better if it were more application focused and less theory focused.

    Skills needed to succeed:
    Strong foundation in Python
    Strong foundation in Linear Algebra
    Decent Calculus

    Rating: 3 / 5 | Difficulty: 5 / 5 | Workload: 20 hours / week

  • Georgia Tech Student | 2021-05-03T17:04:31Z | spring 2021

    It seems like this class has changed frequently over time; this review is for the Spring 2021 semester.

    This was my 6th class after the 3 Core classes, MGT6203 and Simulation. It was absolutely my favorite class so far, and also the hardest. Expect linear algebra and calculus on almost every assignment with some kind of derivation involved, as well as coding work.

    Prof. Xie has that casual brilliance you see when people really know their stuff. Her explanations in lecture videos, office hours and Piazza were thorough and well presented. For better or worse she does not 'dumb down' content so when dealing with a new topic that you may have less experience with it can sometimes be hard to follow and require additional research to build up that comprehensive knowledge. Sometimes that could be frustrating, but I always found it valuable and worthwhile.

    The flow of the class was great, 6 homework assignments and a group project. This semester the HWs counted for 60% of the grade and the project 40%, with extra credit possible on most homework assignments. The lack of high-stakes exams I felt gave a great opportunity to really dive deep into the concepts as you are working through the assignments and gave me a sense of a more balanced workload week to week.

    Contrary to what some other reviewers experienced, I found that the class did not get easier after the halfway point. The class switches from unsupervised to supervised algorithms, and while packages are used more liberally and some individual homework questions are simpler, I found the math work more than made up for the difference, with HW6 and HW4 being the hardest for me.

    Content usually consisted of 1-2 hour long videos, a follow up homework guide video (30-40 mins) and office hours by the professor and the TAs weekly (I only watched the professor office hours, these were also usually around an hour long).

    Grading felt very generous (dare I say too generous?) but most of the class consisted of pretty motivated individuals and it seemed like everyone was really working hard to understand the material.

    As for downsides, there was sometimes some poor phrasing/wording/grammar in the assignments themselves, which sometimes needed extra clarification from a TA later. Group projects are group projects; we all know the downsides there, even with a strong group like I had. Overall though, I feel much smarter after this class than before it, and I highly recommend anyone at OMSA try to fit it into their class plan.

    Rating: 5 / 5 | Difficulty: 4 / 5 | Workload: 15 hours / week

  • Georgia Tech Student | 2021-04-28T23:09:55Z | spring 2021

    Course structure: 6 homeworks (60% of grade), 1 project (40%), and 0 exams. Yes, 0 exams. HWs can be difficult and time consuming, but by the time you finish them, you'll have learned a lot.

    Prof Xie is awesome. I've taken 6 classes so far and she was without question the professor that engaged most with students. She clearly loves the material and knows it well, and does a great job teaching it both in the lectures and in her office hours. TAs are also very active and usually very helpful.

    I was familiar with a lot of ML concepts before taking this class, but hadn't done anything rigorous like actually derive/code an algorithm. It probably saved me some time. I was also relatively comfortable with Python, which definitely helped. I could definitely see this course taking more time for people with less familiarity with the material or less comfort in Python.

    I would highly recommend this course to anyone considering it.

    Rating: 5 / 5Difficulty: 4 / 5Workload: 12 hours / week

  • Georgia Tech Student2021-03-21T19:48:50Zfall 2020

    This is a very good course. I think the difference between CDA and ML from CS is that there is a much more theoretical aspect in CDA. At least one question per homework asks you to do the algorithm by hand so you truly understand what the algorithm does. Homeworks 1-3 are very tough, but after Homework 4 the difficulty drastically decreases. So if you're worried about your performance in the beginning, don't sweat it and you'll be fine! Professor Xie is amazing. She's definitely the most engaging professor I've met throughout all my time in this program.

    Rating: 5 / 5Difficulty: 4 / 5Workload: 15 hours / week

  • Georgia Tech Student2021-03-08T01:37:56Zspring 2021

    Cleared due to OMSCentral Owner being greedy.

    Rating: 3 / 5Difficulty: 4 / 5Workload: 20 hours / week

  • Georgia Tech Student2020-12-28T17:33:32Zfall 2020

    This is a deep dive, much more in-depth than the machine learning course in OMSCS. Homeworks really force you to understand the inner details and the math behind ML algorithms; some of them ask you to implement the algorithm by hand.

    I am so grateful I took this class in my last term, so I can peacefully graduate from this program knowing that I finally understood the math behind popular ML algorithms.

    Professor Yao Xie and the TAs did a great job and were very responsive on Piazza. I've graduated from both the OMSCS and OMSA programs, and Professor Xie is one of the rare professors in this program who is always there for you on Piazza, directly answering questions. Kudos to them.

    Highly recommended for anyone who wants to do ML jobs.

    Rating: 5 / 5Difficulty: 4 / 5Workload: 15 hours / week

  • Georgia Tech Student2020-12-07T22:27:46Zfall 2020

    The course is front loaded, expect to spend 20+ hours if your programming/math is a bit weak the first few weeks. After that, it does ease up a bit. Found myself watching YouTube videos for more intuitive explanation before watching and diving into the lectures which are notation heavy. Professor seems very active on Piazza and hosts TA sessions.

    Rating: 4 / 5Difficulty: 4 / 5Workload: 20 hours / week

  • Georgia Tech Student2020-11-29T00:45:33Zfall 2020

    This was my third course in OMSA. The professor is very sincere in her efforts. She is active on Piazza. She is one of very few professors who hold weekly office hours. All TAs have weekly office hours too. Basically there is at least one office hour every day, so it is easy to ask for help if you need to understand something immediately. Grading is fair and it is easy to get an A with some sincere effort. The course gives a good overview and understanding of many commonly used machine learning algorithms. However, the initial lectures are difficult to understand and need some initial effort. It is much easier if you listen to Andrew Ng's lectures to understand the concepts. There are 6 assignments and a project. The first three assignments require you to implement model code from scratch (you cannot use libraries), so the first 3 assignments are more difficult than the last three. Each assignment has mathematical concept questions as well as coding. She does, however, give demo code with all the lectures, which helps with the assignments. If you understand the demo code, it is easy to complete the assignments. She also gives homework explanation videos, which help with the assignments. The project can be done individually or in groups of 3. Choose your teammates wisely. The project was the most satisfying part of this class.

    Rating: 4 / 5Difficulty: 4 / 5Workload: 20 hours / week

  • Georgia Tech Student2020-11-23T04:33:34Zfall 2020

    Too many books as references, which kind of makes you not even want to touch them or look for answers in them as it makes you look forever. Frustrating. The workload is easily about 20 hours per week. The lectures are very boring and not engaging at all, which also makes you not want to watch them. Also very frustrating because getting the lectures out of the way is a dragged out agony. Hard technical things are presented in a language that you won't understand easily. If you'll take it with another heavy class - you're screwed.

    Rating: 1 / 5Difficulty: 5 / 5Workload: 20 hours / week

  • Georgia Tech Student2020-11-17T17:36:23Zfall 2020

    This was a reasonably good course. Dr. Xie is very committed to posting on Piazza and holds regular weekly office hours. The TAs offer office hours most days, so there are plenty of opportunities to get help. Despite the name, the course spends a significant amount of time on theory and mathematical proofs for various algorithms. This can be challenging. Most homework assignments include both a theoretical and practical (programming) component. The initial programming assignments are from scratch, with the most challenging being implementing the Gaussian mixture models from scratch. The later assignments allow you to use packages like scikit-learn. The course can be done with either Matlab or Python. It seemed to me that Matlab was emphasized early in the course, and Python more later. The TAs for this course were OK, but a little disorganized. They had a propensity to say “You’re overthinking”, to which I want to scream “Or you are under-explaining.” I find the “overthinking” comment infuriating. The project for this course was pretty much whatever you wanted to do ML on. I enjoyed that the best. The first and the last homeworks were the most challenging. Some of the homework assignments can be long and difficult, but can be completed in the 2 weeks. It is important to get started on the homework as soon as it is available so that you can spread the work out. There are several opportunities for extra credit. If things hold, I will complete the course with close to or slightly over 100%. This was my 8th course in the OMSA program. Again, a tough course, OK TAs, very good Professor. Good luck.

    Rating: 4 / 5Difficulty: 4 / 5Workload: 15 hours / week

  • Georgia Tech Student2020-08-12T17:31:18Zsummer 2020

    Taking this class over the summer was difficult, because it was a lot more work than anticipated. Most of the time is spent on doing the homework, and the lectures were often too high-level. The project was good, but I had a good team and we all split responsibilities evenly.

    I was hoping for a class to really get a deeper understanding of different AI/ML methods, but that was not the case. Just disappointed in how it turned out.

    Rating: 2 / 5Difficulty: 4 / 5Workload: 10 hours / week

  • Georgia Tech Student2020-08-10T13:21:23Zsummer 2020

    Taking this course in the summer is very difficult. The assignments are graded very easily, but to adequately learn all the information it will take a significant amount of time. Great concepts though. Also, I received an A.

    Rating: 4 / 5Difficulty: 5 / 5Workload: 25 hours / week

  • Georgia Tech Student2020-08-04T06:17:07Zsummer 2020

    1. Difficulty: The course content is relatively harder compared to the introductory and basic courses. However, it's very easy to get an A. In fact, it is too easy. I doubt if the true performance can be differentiated by the grades.

    2. Project: could have been due earlier so that TAs have more time to look into the details and give a mark that can truly reflect the quality of work.

    3. Homework: probably you will spend most of the time working on the homework rather than trying to get a deeper understanding of the course materials. The homework does help us understand the concepts to a certain extent.

    4. Course Videos: There is a lot of theory rather than teaching of practical topics such as model selection and hyperparameter tuning. The notation used could be inconsistent sometimes, which makes it harder to understand.

    5. Office Hours: There are so many of them!!! Professor and each TA has one office hour every week. I didn't attend any due to time zone difference. Also don't feel the need to watch them later as I don't know which one to watch and most of the questions will be clarified in Piazza.

    Overall, taking this course in summer can be intense. I would recommend taking it in fall or spring to have a better pace and more time to grasp all the concepts. I would also recommend spending more time on the project, as you are dealing with a real-life problem and you could use more time to solve all sorts of problems a data scientist could face.

    Rating: 4 / 5Difficulty: 3 / 5Workload: 4 hours / week

  • Georgia Tech Student2020-08-04T02:20:35Zsummer 2020

    It's very easy to get an A in this course. Content is good and I definitely learned a lot about the math behind different ML techniques.

    Rating: 3 / 5Difficulty: 2 / 5Workload: 8 hours / week

  • Georgia Tech Student2020-07-29T16:52:18Zsummer 2020

    I enjoyed the course, but like others have said, some of the material is not presented in the best way. There's a lot of good content and I learned a lot, but most of what I learned came from exploration and readings on the internet. I found the homeworks really useful in cementing in my learning of the material and actually learning how the models/approaches worked. The lecture videos went in depth in the math behind the models, and sometimes too much so. In some places the notation was inconsistent and unclear, but ultimately those issues were smoothed out with announcements and explanations on Piazza. In the summer session we had five homeworks and a final project, with each assignment due two weeks apart. Typically I would wait until the weekend a project was due and crank out ten hours of work in 2 or 3 days, but it definitely would've been more efficient to spread my effort across the two weeks we were given. The TA's graded really easily and we got 3 one-week homework extensions (normally only 2), which made earning 80% of the grade relatively easy. The final project was pretty simple too and gave us freedom to pick a topic of our choosing. Didn't start on that until after homework 5, but it was released a few weeks into the semester so I imagine you could finish it really early if you wanted to.

    Rating: 4 / 5Difficulty: 3 / 5Workload: 10 hours / week

  • Georgia Tech Student2020-05-02T18:14:09Zspring 2020

    Take-home message: Excellent contents, good workload, and easy to get an A. You can learn fundamental machine learning algorithms by doing assignments, reading books, watching YouTube, asking Professors and TAs, not by watching lecture videos.

    Content: This course covers lots of fundamental machine learning algorithms such as K-means, SVM, and boosting algorithms. These algorithm concepts are always tested in interviews and you will benefit a lot by taking this course.

    Workload: It’s quite evenly spaced. This course has 7 biweekly assignments and 1 take-home final exam (open book, 10 days to finish).

    Scoring: You will get a high score (90+) if you put 8 hours of effort into this course.

    Course overall pros:

    1. Professor and TAs are very responsive on Piazza. The side effect is that you may spend 1-2 hours reading Piazza posts when homework is not clear.

    2. You will get solutions to all assignments. Solutions are helpful because you do not get comprehensive assignment feedback from the TAs.

    3. You will learn lots of math proofs that push you to read books, ask TAs or search Google. Half of the assignments are math problems that are difficult for me but I can always find answers after several hours.

    Course overall cons:

    1. Lecture videos teach algorithms with lots of terms/math that are super difficult to understand. Basically you cannot learn algorithms from the lecture videos. I highly recommend you use Piazza, YouTube, and books to learn these fundamental/popular algorithms.

    Rating: 5 / 5Difficulty: 3 / 5Workload: 8 hours / week

  • Georgia Tech Student2020-04-27T20:16:25Zspring 2020

    My 8th class in the program. This was a good class for a survey of machine learning methods that goes into more detail than ISYE 6501 and 6040. It felt like a fusion of both classes, taking in about the level of algorithmic detail of 6040 and the modeling power of 6501.

    Pros:

    1. Not an overwhelming workload even for your average OMSA student (learned coding through MOOCs, might be rusty/underqualified on their multivariate calculus, not really sure what a subspace is or what eigenvectors have to do with anything).

    2. Many practical topics covered with practical assignments I will likely reference in the future.

    3. TAs and Professor were incredibly patient with students who do not understand the prerequisite math needed to get through proofs, which for me was very important. The professor was at her office hours every week, which I haven't seen in any other courses I've taken.

    Cons:

    1. Lecture videos were not great. Prof Xie does have an accent and kind of reads off slides.

    Overall, this class was great but it makes me wish I invested more in learning the background math needed to truly grasp topics like optimization, MLE, etc.

    Rating: 4 / 5Difficulty: 3 / 5Workload: 10 hours / week

  • Georgia Tech Student2020-04-26T04:20:23Zspring 2020

    TL;DR Good course, just the right amount of difficulty to stay engaged. Largely a review of ML models from the foundational courses, but goes more in depth in the math, to include proofs. Course load was perfectly balanced, as all things should be.

    Longer version: My greatest criticism of the OMSA is that there's too much overlap between courses, and this class is no exception.

    That said, the course does dive into the math a bit more, and many of the homeworks involved a fairly difficult math proof. These multivariable/linear algebra proofs were the hardest part of the course for me. Thankfully, one of our TAs walked through these proof problems in his office hours which helped me a lot. There were 7 homeworks, one due every two weeks (extended to 3 weeks once the COVID-19 pandemic started.) We were also given one 1-week extension to use even before the pandemic started. I think the bi-monthly rate is the perfect frequency of homework.

    Both the class slack channel and Piazza were also very active and helpful (Slack tends to get faster responses from peers, but use Piazza of course for contacting instructors.)

    For about the first half of the semester we were required to code the algorithms (clustering, PCA, Naive Bayes) by hand, no scikit-learn. By the second half we were allowed to use these libraries for some of the more involved models such as random forests. I thought this was a perfect balance.

    Moving on to the lectures: Prof. Xie's lectures can be a little hard to understand at times due to the poor text transcription and her accent combined with the technical terms. However, I still enjoyed them, and Dr. Xie herself is very active, friendly, and helpful on Piazza. She was very accommodating to the COVID-19 pandemic, extending homework deadlines and announcing a curve on the course grade a week or two prior to the close of the semester. Obviously the COVID situation could be a one-off and she may not need to make such accommodations again, but I bring this up to prove that she's very reasonable in general.

    We only had one exam, a cumulative, open book, open note final that I found relatively easy. It was a good representation of the material covered. As long as you completed the homeworks, you should be fine.

    In summary, the coursework was well-designed, both in frequency (bi-monthly) and coverage of material. The TAs also helped with the most difficult parts. The course content is largely a review of models covered in ISYE 6501 and elsewhere, but you'll walk away with a better understanding of how the math works. Dr. Xie is a great professor, and very engaged and committed to helping her students succeed. 5/5, would take again.

    Rating: 5 / 5Difficulty: 3 / 5Workload: 10 hours / week

  • Georgia Tech Student2020-04-02T15:24:37Zspring 2020

    I did not like the way this course was taught. Being the one ML course in the OMSA program, I was very excited to sign up. Unfortunately as it turns out, I did not learn much from the lectures.

    Pros: The professor and TAs are very lenient with grading, so it's not hard to get an A.

    Cons:

    While I'm well on track to get an A in this course, I don't feel like I truly learned much from the class. Most of the time, I check the videos to see what the topic is, try to watch the videos, get thoroughly confused, then move on to search for better lectures on the topic.

    The lecture videos need some cleanup - some videos not in sync with slides (i.e. the prof's words in the video not matching what was being shown on the slide).

    Homework is hard, but I feel that it is hard only because I did not learn from the lectures. Each HW took me about 10-15 hours to complete, and required lots of google searches to understand the topic.

    The lecture videos are 99% made up of the professor simply reading quickly through loads and loads of mathematical theory. A lot of times, I find myself staring at the slide and not knowing which part the prof is at.


    To be clear, I don't personally dislike Prof Xie or the TA's. I think their hearts are in the right place. But I do think this class has a long way to go in terms of improvement.

    Rating: 2 / 5Difficulty: 4 / 5Workload: 20 hours / week

  • Georgia Tech Student2020-03-21T23:47:24Zfall 2019

    Really enjoyed the layout of the course. Professor Xie and her TAs were very helpful in getting students to understand the material and in helping with any programming issues. They're really flexible with the course work and are very cognizant of student feedback. Since the workload is a mix of math/stats theory and programming, TAs were flexible about what language you submitted your work in (Python, MATLAB, R, etc.).

    Exams were open-book and take-home. I had a particular issue where I believed I was misgraded in one question of an exam, and the TA did give a proper regrade. They weren't easy but not impossible either. If you understand the content, you'll be in good shape.

    I did gain some proper insight into the content of the coursework, and I was able to pick up some programming things as well into my day job. Overall, I'd recommend this class if you're comfortable with learning theoretical math and its applications.

    Rating: 4 / 5Difficulty: 3 / 5Workload: 15 hours / week

  • Georgia Tech Student2020-02-25T19:05:33Zfall 2019

    This was the first semester it was offered online so it did have some bugs... but overall not bad. The TAs did a great job and were very helpful. Professor Xie (who has a heart of gold) is one of the few Professors in the program to have office hours and will work with the students to understand the concepts.

    Rating: 5 / 5Difficulty: 3 / 5Workload: 15 hours / week

  • Georgia Tech Student2020-01-02T19:16:12Zfall 2019

    I had a lot of fun in this class. The homeworks and exams were set up well, with the intended purpose of teaching us how different algorithms work behind the scenes and/or giving us a better intuitive understanding of how they work. They were just challenging enough to be interesting, and in my opinion tended to be on the easier side. The homeworks also got easier as the semester went on. They let you use whatever coding language you wanted; I (and I think most others) used Python.

    The exams were both take-home open-note and very similar to the homeworks but easier. Grading was pretty lenient on all homeworks/exams. I thought the lectures were good but not great. I'm definitely more of a learn-by-doing person, and that's how this class was set up. You could go through the whole semester without watching a single lecture and probably get just as much out of it.

    There were a couple math proofs on the first homework and on the final exam, but they weren't too difficult. I had never done a proof in my life and I was fine.

    Overall this was a fantastic class where I learned a lot and got a relatively easy A. My only complaint is I think I would've learned more had it been a little harder, but I probably should be careful what I wish for.

    Rating: 5 / 5Difficulty: 3 / 5Workload: 10 hours / week

  • Georgia Tech Student2019-12-13T17:11:15Zfall 2019

    Overall I loved this course. It gave great explanations and homeworks for concepts that I have often heard of in data science but didn't truly understand. You are asked to implement many algorithms yourself through homework and take home tests. There is some math and derivations but not too much. I believe that this course is a great introduction into Machine Learning concepts that goes a little deeper than other classes.

    This class was primarily conceptual though and did not go into practical applications through projects. It is different from the CS ML class in that way.

    I highly recommend this class as it is the one I enjoyed the most thus far. Homeworks and tests are all take home and you have 2 weeks to do each. They took me 5 to 15 hours to complete. The workload isn't very high for the class. Lectures are great and are given online. There are "TA cohorts" that students are assigned to and each cohort has their own office hours, making for plenty of office hour review sessions.

    Rating: 5 / 5Difficulty: 3 / 5Workload: 8 hours / week

  • Georgia Tech Student2019-12-12T19:20:41Zfall 2019

    This course really helped me appreciate underlying concepts behind machine learning algorithms by way of teachings in this course.

    The assignments were very well prepared and made me REALLY learn, understand & apply the fundamentals behind ML techniques. Most of the assignments required researching and thinking about ideas taught in classes and not just spoon feeding and getting answers to set questions. So the course may not be for you if you are not really willing to spend extra hours researching the high-level topics taught during the course, e.g. some of the assignments required implementing ML algorithms in your own code.

    Before taking the course I had a black-box approach to ML algorithms but now my perspective is vastly different (or at least IMHO).

    During the course, I even began to like reading technical papers/articles on machine learning algorithms, something which I used to find very challenging before taking this course. I hope I am able to continue this in future.

    Also I found the TA team worked very well under extreme deadlines, while being very proactive in solving/answering queries & concerns in a very professional and humble manner.

    Formally, I found this to accurately represent what the course was about (I took this from the course syllabus): "The course is designed to answer the most fundamental questions about machine learning: What are the most important methods to know about, and why? How can we answer questions such as 'is this method better than that one' using asymptotic theory? How can we answer the question 'is this method better than that one' for a specific dataset of interest? What can we say about the errors our method will make on future data? What's the “right” objective function? What does it mean to be statistically rigorous?"

    Course consisted of 5 assignments (25%), 2 midterms (40%) and 1 Final (35%).

    All the assignments were open book and I faced no issues in completing them (I only took this 1 course in Fall 2019, maybe that helped as well).

    Rating: 5 / 5Difficulty: 4 / 5Workload: 10 hours / week