Advanced Topics in Malware Analysis

4.41 / 5 rating | 3.35 / 5 difficulty | 14.94 hrs / week

Quick Facts and Resources

Name
Advanced Topics in Malware Analysis
Listed As
CS-6747 and ECE-6747
Credit Hours
3
Available to
CS and CY students
Description
This course covers advanced approaches for the analysis of malicious software and explores recent research and unsolved problems in software protection and forensics.
Syllabus
Syllabus
  • k14TpkSTnZPbuwD/f63vXw== | 2025-03-24T19:49:51Z | fall 2024

    I took this course my first semester in OMSCS and as a software engineer with an interest in security, this class expanded my mind. The core idea behind this course is that you can treat malware analysis as a graph/networks problem which makes programs like Ghidra possible. The lectures and labs take you through how static and dynamic binary analysis works with an emphasis on how you can construct data and control dependence graphs to prove things about what a binary does and how you can get past obfuscation and packing. It's kind of like the opposite of a compilers course in a way.

    Along the way, you have to read a few dozen papers about different areas of malware analysis and write about them. Although this was sometimes dry, it was interesting to read through all of these, and it gave me a diverse understanding of what malware vectors are out there, how they are detected, and how that detection was automated. This was probably the most time intensive part of the course for me but I think others could be more efficient about it.

    Rating: 5 / 5 | Difficulty: 3 / 5 | Workload: 20 hours / week

  • sO8OJlQ/P8sVDM5eftGHRA== | 2024-12-16T19:35:19Z | fall 2024

    TL;DR - This is a great, lab-only course that I think does a good job of giving you a broad understanding of foundational reverse engineering topics. I found it less difficult than GIOS but more difficult than HCI or CN, so lower/mid-level difficulty. This assumes some understanding of C and C++ like what you would get from GIOS, plus some understanding of assembly, such as general-purpose registers and stack operations. Without this it might be a little more difficult and time-intensive.


    I thought this was a good course overall and would rate it 4.5 out of 5. The biggest issue I encountered was not receiving Gradescope feedback, which made it fairly difficult to determine whether or not I was on the right track. That said, I still managed to end with an A despite receiving lower grades on Labs 3 and 4 (I struggled with those). I made up for it by scoring a 110 on Lab 2, which I completed on my own, and doing well on labs 5 and 6.

    Lab 2 is time-consuming and it took me about 40 hours in total. However, I learned a great deal from that exercise, and completing it without a teammate is definitely feasible. I can understand how people who finish the labs quickly might find the course pace a bit slow, but if you get stuck on a lab for any reason you may end up needing the extra time.

    In terms of relative difficulty compared to other courses I’ve taken, I’d rank it as follows:

    HCI == CN < AMA < GIOS

    I think I averaged about 10–12 hours of work per week (with a maximum of ~24 hours one week), and I finished everything about three weeks early. There are no exams in this course—just labs—which was a very nice change of pace.

    My background is in cybersecurity (not malware or RE), and I had a little bit of assembly understanding before going into this as well as having recently taken GIOS. Those coming in with 0 assembly knowledge will find this more challenging, but I don't think it would be impossible.

    Rating: 5 / 5 | Difficulty: 3 / 5 | Workload: 11 hours / week

  • cqkszj3/U5ArMQCkrnHLzQ== | 2024-11-15T16:48:50Z | spring 2024

    This class was a really cool overview of the world of malware and how researchers try to defend against it. There are no tests! The course is organized around paper reviews and labs. The labs center on analyzing a real-world piece of malware to learn the basics of reverse engineering using the Ghidra tool. At the beginning especially, the labs can take a while, but not studying for exams makes up for it. The papers are focused on researchers trying to understand the current state of the malware ecosystem, add defenses to software, and create incident response mechanisms.

    Rating: 5 / 5 | Difficulty: 3 / 5 | Workload: 14 hours / week

  • BAnd/PHOk18yCIGKIt8pUA== | 2024-05-08T14:29:21Z | spring 2023

    I took the class in Spring 2024.

    TLDR: Adv Malware Analysis is just "OK". The projects can be somewhat unclear on what they want and the second half is essentially just a software analysis class, which is certainly relevant to malware reverse engineering but not specific to it, if that makes sense.

    I took this class hoping to learn about how advanced malware works, and I didn't really get that. I did get good in depth experience with x86 assembly and an OK introduction to Ghidra.

    The lectures are OK, they are generally relevant to the projects but I didn't consider them all that interesting with respect to learning about malware.

    Project 2 is the difficult and long one; I highly recommend getting a partner AND setting up a joint Ghidra server. If you don't, it will be difficult to combine your work, which will make an already time-consuming project take even longer. Thankfully my partner set it up so I didn't have to deal with it, but it seems like it's not too difficult. I also recommend setting up Ghidra on your own box in some capacity because the provided AWS Workspaces are kind of trash (but still workable).

    The rest of the projects were not as interesting in my opinion and were mostly tedious. A few had unclear requirements, which I think is partially an attempt by the instructors to be forgiving to different methods of accomplishing the tasks but ends up being confusing. You'll write Ghidra plugins and do a lot of work with control flow graphs.

    The paper summaries are pretty easy to knock out in two to three days, you can get by with reading the abstracts + intros + conclusions.

    Overall a fine medium difficulty class to take, and I'm hoping that the x86 and Ghidra experience will pay off in CS6265 Binary Exploitation.

    Rating: 3 / 5 | Difficulty: 3 / 5 | Workload: 15 hours / week

  • gubDUC5idd/gRNxUlv4AMw== | 2023-11-27T23:16:50Z | spring 2023

    The course has 2 weighted categories. 10% of the grade comes from short summaries of assigned reading material that are compiled into a PowerPoint for submission. Even if you fall behind, it's not hard to get caught up and get full credit. They claim to also base this 10% on participation in Ed, but I don't remember participating much, and still got full points for this section. The second category is projects. The first half of the projects is about reverse engineering and writing Ghidra plugins. The second half of the projects is about dynamic analysis. Both halves were very time-consuming, and there weren't exceptionally clear guidelines for determining when you had done enough for each project. Fortunately, there was some extra credit available for the earlier projects that was able to balance out some of the shortcomings on later projects.

    I probably spent >20 hours on each project, but I felt that I learned a lot while doing it.

    I got an A in this class, but it wasn't easy!

    Rating: 4 / 5 | Difficulty: 5 / 5 | Workload: 18 hours / week

  • vb1q3wd0lmdQ+nCx4fREaQ== | 2023-11-15T00:31:17Z | fall 2023

    This might be the most arbitrary class I've ever taken. The projects are horrible. The documentation never tells you exactly what they want, there are clear issues with autograders being incorrect, and it is up to you to prove to the TAs that the autograder is wrong and you are right. They have very specific requirements for the projects, and unless you ask in Ed Discussion what exactly they are looking for, you will get them wrong.

    For one example, every single instruction of a malware changes the EIP register. If you claim that then they call you wrong for it because "adding it to each instruction would be too much work". Are you kidding me? That's part of the horrible grading decisions that we needed to deal with in order to get a good grade. It's sad because the content is great, but the projects are some of the worst I have ever worked on.

    Rating: 3 / 5 | Difficulty: 3 / 5 | Workload: 20 hours / week

  • 8MY5UH9bQUcNzUOoFDzcgQ== | 2023-11-15T00:14:38Z | fall 2023

    I took this class because of other reviews on here stating that this was overall a great and enjoyable class. However, that was not my experience.

    Overall, the class is well structured and paced. However, basically all of the "modules", which are essentially a series of 5-15 minute videos for that particular lesson, are dull, surface level discussions of content that is vaguely related to the assignment.

    The assignments themselves are not difficult. They are however just pure busy work with little to no critical thinking. For a master's level course, I'd expect some additional information outside of an introductory course. All of the work is essentially an introduction to binary analysis tools and the assembly language. The only "malware" part of this class is some discussions about malware techniques in the videos, as well as the fact that you are analysing a real-life malware sample. However, the assignments themselves are just tests on your ability to use binary analysis tools (GHIDRA, Intel's PIN, and the ability to decipher assembly).

    The main praise I have for this class is that it at least knows it's mostly busy work, and as such, allows for teams of two.

    As previous reviews have stated:

    • Assignment 1: You comment the purpose of a handful of assembly lines in GHIDRA. This took me about 5 minutes.
    • Assignment 2: This is the large "busy work" assignment. You basically comment a few hundred lines of assembly code from a malware binary. This may be intimidating for those with no assembly background; however, with GHIDRA's C decompiler and the WinAPI documentation, it is trivial to do. This assignment took the longest, at a combined 15-20 hours between my teammate and me.
    • Assignment 3: You write a GHIDRA plugin to create a definition-use (def-use) dependency graph for the same malware sample. Overall, this assignment took 5-6 hours and required maybe 100 lines of code.
    • Assignment 4: You write a GHIDRA plugin to create a data dependence graph for the same malware sample. This assignment builds on the code written for Assignment 3, with a few tweaks, doubling the number of lines of code. This took 8 hours total.
    • Assignment 5: You use Intel's PIN tool to dynamically analyse the same malware sample. You trace every possible control flow that you can. We had a bug in our code, which caused Gradescope issues, so this assignment took a little longer than we had expected, coming in at 15 hours combined.
    • Assignment 6: This assignment just started; however, it does not appear to be any more difficult than the previous assignments. You are to build a dynamic control dependence graph using Intel's PIN tool.

    Along with these assignments, you have to read ~30 research papers that are 15-20 pages long each. This was just additional busy work. I enjoyed a few of them, but many are outdated.

    Before assignment 6, I have over 100% due to a ton of extra credit allotted in this class.

    Overall, the class would be good for those trying to learn basic assembly or binary analysis tools. For those seeking to learn "advanced" malware analysis, it is a bust. In terms of difficulty, I'd placed it at around a 300 level undergraduate computer engineering course, just because there is no hand holding and I could see it being tough for those that have never touched assembly and may not be strong programmers.

    Rating: 3 / 5 | Difficulty: 2 / 5 | Workload: 10 hours / week

  • 6uoIlaVTbA6M8VCZ79JTAw== | 2023-08-01T19:00:56Z | summer 2023

    I really enjoyed this course and regret taking it in the Summer; in retrospect, I wish I had decided to take this course in the Spring or Fall so as to have had more time to more thoroughly learn/digest the material. While it's certainly doable in the Summer as far as difficulty goes, having 4 fewer weeks to cover the material means inevitably you have to cut corners in order to complete the 6 projects and summarize the 32 academic papers in time. If you're interested in the material, it's worth giving it that extra time.

    Highlights of the course:

    • Project 1: easiest one of the bunch. If you're familiar with Ghidra and Assembly (which is likewise touched on in several other security-centric courses, particularly CS6265), it's a half day's distracted effort at most.
    • Project 2: one of the longest of the bunch to complete. The work isn't particularly difficult to do, but you do have to make nearly 2k comment annotations in a real-world malware sample; this takes time.
    • Project 3: this one required consulting external documentation the most out of the bunch. You're drafting a Ghidra plugin from scratch, which means consulting both the Ghidra API (in either Java or Python) to figure out calls as well as Intel x86 assembly language references to determine opcode behavior.
    • Project 4: this one is a logical extension of Project 3 as another Ghidra plugin. Assuming you understood the behavior of the aforementioned opcodes, you end up tracking which instructions adjust data used by later instructions.
    • Project 5: this was one of the shortest of the bunch. Arguably, I think this project should have been after project 2 and before project 3; in this one, you're having the malware execute and - through the use of Intel's PIN tool implemented in C++ code - tracing the order of instructions executed. Intel's PIN tool comes with a plethora of examples to pull from, which makes this project trivial.
    • Project 6: this one was the hardest of the bunch, both in its time-intensiveness and complexity. Though you extend off of your work in Project 5, you need to understand how to find immediate post dominators in Graph Theory (and then ultimately control dependencies). Doing this efficiently (by way of big-O) and dynamically (i.e. not hardcoding values into your project) creates an added layer of difficulty. It was an appropriate send-off.
    • 32 papers: 10% of your grade is beholden to submitting short summaries of 32 papers that are all made available at the start of the semester. Their content is adjacent to your projects (i.e. related to actual malware analysis research, but in no way contributes to your comprehension of the assigned coursework). Invariably, students end up delaying on reading them and then panic to catch up at the end of the semester. Try to do 2-3 a week if able.
    • Partners: you're permitted to split the labor of all the projects (beginning at project 2) with one other peer and share the resulting grade. I didn't, which likely contributes to my aforementioned bemoaning of not having enough time to actually learn/digest the material.
    • Staff engagement: as in most OMSCS classes, the TAs are the primary source of engagement between the students and staff. The head TA was the most involved in Ed Discussion by far; I saw no TAs in the associated Slack chat at any point during the semester. Outside of the pre-recorded lectures, office hours, and the occasional meta-comment (e.g. end-of-term CIOS reminder), the Prof's presence went largely unnoticed (though to be fair, when reached out to directly, the professor's response[s] to my inquiries were timely and the head TA's coverage of Ed Discussion was generally sufficient and knowledgeable). This is consistent with my broader OMSCS experience and - relative to other courses taken - I felt adequately supported.
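
    For readers unfamiliar with the Project 6 terminology above, the following is a minimal sketch of how immediate post-dominators and control dependences can be derived from a control flow graph. It is not the course's reference solution: the toy graph, the node names, and the helper functions are all hypothetical, and it assumes a single exit node that every other node can reach. It uses the simple iterative set-based formulation rather than a faster algorithm.

```python
from typing import Dict, List, Optional, Set

Graph = Dict[str, List[str]]  # node -> list of successor nodes


def post_dominators(succ: Graph, exit_node: str) -> Dict[str, Set[str]]:
    """Iterate to a fixed point: pdom(n) = {n} | intersection of pdom(s) over successors s.

    Assumes every non-exit node has at least one successor and can reach exit_node.
    """
    nodes = set(succ)
    pdom = {n: set(nodes) for n in nodes}
    pdom[exit_node] = {exit_node}
    changed = True
    while changed:
        changed = False
        for n in nodes - {exit_node}:
            new = {n} | set.intersection(*(pdom[s] for s in succ[n]))
            if new != pdom[n]:
                pdom[n] = new
                changed = True
    return pdom


def immediate_post_dominators(
    pdom: Dict[str, Set[str]], exit_node: str
) -> Dict[str, Optional[str]]:
    """The immediate post-dominator of n is the strict post-dominator d
    whose own post-dominator set equals n's strict post-dominator set."""
    ipdom: Dict[str, Optional[str]] = {exit_node: None}
    for n, doms in pdom.items():
        if n == exit_node:
            continue
        strict = doms - {n}
        ipdom[n] = next(d for d in strict if pdom[d] == strict)
    return ipdom


def control_dependences(
    succ: Graph, ipdom: Dict[str, Optional[str]]
) -> Dict[str, Set[str]]:
    """For each edge a -> s, walk up the post-dominator tree from s until
    reaching ipdom(a); every node visited is control dependent on a."""
    deps: Dict[str, Set[str]] = {n: set() for n in succ}
    for a, targets in succ.items():
        for s in targets:
            runner: Optional[str] = s
            while runner is not None and runner != ipdom[a]:
                deps[runner].add(a)
                runner = ipdom[runner]
    return deps


# Hypothetical if/else diamond: only "then" and "else" depend on the branch.
CFG: Graph = {
    "entry": ["cond"],
    "cond": ["then", "else"],
    "then": ["join"],
    "else": ["join"],
    "join": ["exit"],
    "exit": [],
}

if __name__ == "__main__":
    pdom = post_dominators(CFG, "exit")
    ipdom = immediate_post_dominators(pdom, "exit")
    for node, parents in sorted(control_dependences(CFG, ipdom).items()):
        print(node, "depends on", sorted(parents))
```

    In a dynamic setting the graph would be built from observed execution rather than written by hand, and the set-based fixed point shown here can become slow on large graphs, which is presumably where the efficiency concern mentioned in the review comes from.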

    Rating: 5 / 5 | Difficulty: 4 / 5 | Workload: 20 hours / week

  • zu4BUCiKeMDLd3GXVlg9RQ== | 2023-07-18T03:12:14Z | spring 2023

    My first course in OCY - definitely a must-take. The course covers both static and dynamic analysis of malware. You need prior knowledge of C and Python, and you'll need to learn x86 assembly on the go. There are 6 projects, and grading is generous.

    Rating: 5 / 5 | Difficulty: 3 / 5 | Workload: 1 hours / week

  • fLUYzX6OPADoOahqJKPRLQ== | 2023-06-18T16:41:29Z | spring 2023

    This course is a lot of fun! The TA’s and Professor were fantastic too.

    This class largely involves creating plugins and scripts to automate malware analysis, usually by creating a graph file of some sort that represents some aspect of the malware’s execution path, whether data-based or control-flow-based. Grading is based on the accuracy of this output, but it is fairly lenient and there are a few opportunities for extra credit, so keeping an A shouldn’t be too difficult. If you have any experience programming or reading API documentation, these labs will not be too challenging. For most labs, you’re given a couple of weeks to complete them. A strong understanding of assembly would be a great help for these, but if you don’t have this going in, you more than likely will coming out.

    There’s also a handful of research papers on the topic of binary analysis each week which you’re meant to read and summarize in a set of slides. There’s no status checks and it’s due at the end of the semester so you could wait until the end and do them all after completing the labs, but the papers really add up so unless you want to spend a couple days straight writing summaries I’d highly recommend against doing that.

    Overall, the subject matter is very interesting in my opinion, but there’s one downside in that you spend most of the class analyzing one sample of malware. Given the lab content I understand why this is the case, but it does feel a bit repetitive at some point. Otherwise, I’ve got no complaints.

    Rating: 5 / 5 | Difficulty: 3 / 5 | Workload: 12 hours / week

  • Georgia Tech Student | 2022-03-20T00:12:10Z | fall 2021

    Lectures were okay, but the projects were pretty interesting!

    Rating: 5 / 5 | Difficulty: 3 / 5 | Workload: 10 hours / week

  • Georgia Tech Student | 2021-08-13T08:59:25Z | summer 2021

    I could describe this course with two words: fun & engaging. After finishing the course I had the feeling I had learned a ton of new things.

    Projects:

    • Lab1: ~2hrs -> 100. Leave comments in Ghidra on a "hello world" program.
    • Lab2: ~60hrs -> 100. Do the same thing as Lab1 but on a real malware sample.
    • Lab3: ~20hrs -> 100. Build a Python/Java script for static analysis using Ghidra. The Ghidra API has everything you need.
    • Lab4: ~30hrs -> 95. Extend Lab3 to calculate data dependencies with static analysis. Knowledge of data structures is useful.
    • Lab5: ~20hrs -> 100. Build a control flow graph using dynamic analysis with PIN. Again, the PIN API and online examples can lead you to victory.
    • Lab6: CANCELLED

    Reading Slides:

    DO NOT LEAVE IT FOR THE LAST MINUTE. I spent 5 days in August reading 33 papers. I got 100.

    Course Material:

    Really nice content. Super helpful office hours. Very fast Piazza responses by the TA.

    General Review:

    Everyone interested in security should take this class. Also, there is a lot of EXTRA CREDIT in case you feel you won't make it. I think that anyone who gives something to this course will get away with a good grade. I got an A.

    Rating: 5 / 5 | Difficulty: 3 / 5 | Workload: 15 hours / week

  • Georgia Tech Student | 2021-08-10T17:51:28Z | summer 2021

    This is the most well organized class I've ever taken, the professor and Chow were very responsive and they interacted with the class. The office hours were informative and fun, even if you didn't need help. I was worried this class would push me to the edge with it being an advanced topic and a compressed semester, and I was right...but I don't regret it since I learned so much. This is what a graduate class is supposed to be like!

    There is a reading slide assignment and 6 labs; they dropped the sixth lab for our semester due to technical and scheduling problems. No tests or quizzes. I recommend doing one or two module PDFs per week to stay on top of them. I put them all off until the end of the semester and my brain was exhausted.

    The first two labs require reading ASM and C++ to understand what the malware is doing, definitely read the provided PDF if you aren't familiar with ASM since you're going to work with it a lot in this course. The last 4 require coding skills in C++ (PIN) and Java/Python (Ghidra), I highly recommend getting a partner. If you aren't technical, this may not be a good course for you as everything is pure application and addressing corner cases in code.

    Rating: 5 / 5 | Difficulty: 3 / 5 | Workload: 20 hours / week

  • Georgia Tech Student | 2021-08-10T17:01:09Z | summer 2021

    I was lucky enough to have a great partner who saved me for the coding project. I'm not a developer and do not have a CS background. Project 2 was commenting a lot of ASM. It wasn't all that hard, but just took a very long time.

    I think this was the first time the course was offered during the summer, and there were some scheduling issues with projects. During the normal semester, there was no overlap. However, we had times where a project was still outstanding when the next project was released. The final project, Project 6, ended up being canceled due to issues with the lab not working, and the students, TA & prof had to work together to get a working VM.

    The TA & prof were very involved. The TA Chow would sometimes answer stuff in minutes. Both the TA & Prof office hours were very informative.

    Rating: 3 / 5 | Difficulty: 4 / 5 | Workload: 20 hours / week

  • Georgia Tech Student | 2021-08-10T14:58:06Z | summer 2021

    Depending on your background, and whether or not you have a partner (groups of 2 are allowed but not forced), this course could take significantly more time, but if you're familiar with general assembly language constructs/logic flows/stack operations then this course is fairly straightforward. You get a really good dive into a Windows malware sample and get to use Ghidra and PIN to analyze the same binary throughout the course in different ways, learning different methodologies for analyzing binaries. No tests or quizzes makes this a 10/10 course. If you put in the effort and make sure you at least reasonably understand the material, there's no reason you won't get an A. The professor is very engaging and clearly passionate about the work.

    Rating: 5 / 5 | Difficulty: 3 / 5 | Workload: 10 hours / week

  • Georgia Tech Student | 2021-01-16T18:08:29Z | fall 2020

    I thought this class was difficult and didn't focus much on malware analysis. The only thing that makes it a malware class is that the software sample you are working on is malicious. You could do the exact same projects on any other executable and the content would be just about the same. This course should probably be named "advanced control flow toolkit development". That being said, you can use these skills for malware analysis, but I just had a different expectation before registering.

    You have the option to partner up for each of the assignments. I would HIGHLY recommend syncing up with someone early and working through the semester with that same partner. Projects 3&4, and 5&6 build off each other so it wouldn't be useful or advised to change partners on those.

    I'm intermediate with Python and had never used C++ before, and if it wasn't for my partner, I would have dropped. I wouldn't register for this class unless you are at least a strong programmer in Python. Knowing how to read assembly beforehand is also a must.

    Grading: projects are worth 90% of the grade. The other 10% is Piazza participation and creating a slide deck for the weekly readings. No textbook required and no quizzes or exams.

    Project 1: Learn ghidra interface. Review a hello world program in ghidra and add comments as to what each asm instruction is doing.

    Project 2: Add ghidra comments to a real malicious sample. The sample has around 30 functions, so there was a ton of assembly to review. This was super time consuming and I don't know how you would work through this without a partner.

    Project 3: Create a Def Use ghidra plugin. Write a ghidra tool in python or java to follow how each register is updated. Output results into a .dot graph.

    Project 4: Create a data dependence ghidra plugin that tracks which assembly instructions are dependent on previous instructions. Output results into a .dot graph.
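
    As a rough illustration of what the def-use and data dependence projects above produce, here is a minimal sketch. It does not use the Ghidra API: the `Instr` record, the hand-written def/use register sets, and the sample addresses are all hypothetical stand-ins for what a plugin would extract from the disassembly, and it only handles straight-line code with no branches.

```python
from typing import Dict, List, NamedTuple, Set, Tuple


class Instr(NamedTuple):
    addr: int
    text: str
    defs: Set[str]  # registers written by the instruction
    uses: Set[str]  # registers read by the instruction


def data_dependences(instrs: List[Instr]) -> List[Tuple[int, int, str]]:
    """Return (defining_addr, using_addr, register) edges.

    Straight-line code only: each use depends on the most recent definition.
    """
    last_def: Dict[str, int] = {}
    edges: List[Tuple[int, int, str]] = []
    for ins in instrs:
        # Resolve uses before recording defs, so "add eax, ebx" depends on
        # the previous definition of eax rather than on itself.
        for reg in sorted(ins.uses):
            if reg in last_def:
                edges.append((last_def[reg], ins.addr, reg))
        for reg in ins.defs:
            last_def[reg] = ins.addr
    return edges


def to_dot(instrs: List[Instr], edges: List[Tuple[int, int, str]]) -> str:
    """Render instructions as nodes and dependences as labeled edges."""
    lines = ["digraph ddg {"]
    for ins in instrs:
        lines.append(f'  n{ins.addr:x} [label="{ins.addr:#x}: {ins.text}"];')
    for src, dst, reg in edges:
        lines.append(f'  n{src:x} -> n{dst:x} [label="{reg}"];')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical four-instruction snippet with hand-annotated defs and uses.
SAMPLE = [
    Instr(0x401000, "mov eax, 5", {"eax"}, set()),
    Instr(0x401005, "mov ebx, eax", {"ebx"}, {"eax"}),
    Instr(0x401007, "add eax, ebx", {"eax", "eflags"}, {"eax", "ebx"}),
    Instr(0x401009, "push eax", {"esp"}, {"eax", "esp"}),
]

if __name__ == "__main__":
    print(to_dot(SAMPLE, data_dependences(SAMPLE)))
```

    The resulting text can be rendered with Graphviz. A real plugin would also have to decide how to handle branches, memory operands, and implicit registers, which is where most of the effort described in these reviews seems to go.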

    Note about Projects 5 and 6: The professor recommended using Intel's PIN tool, but writing another Ghidra plugin was acceptable. For those not familiar with PIN, tools for it are written in C++. If you aren't familiar with C++, you can stick with Python, which is nice.

    Project 5: Write a PIN/Ghidra Dynamic Control Flow tool. Track the execution path of an executable and output the flow to a .dot file.
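
    The post-processing side of Project 5 can be sketched as follows, assuming the instrumentation has already logged the start address of each basic block in execution order. The trace format and the addresses are hypothetical, and this sketch does not call PIN itself; it only turns a recorded trace into a .dot graph with edge execution counts.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple


def dynamic_cfg(trace: Iterable[int]) -> "Counter[Tuple[int, int]]":
    """Count how often control moved from one logged block to the next."""
    edges: "Counter[Tuple[int, int]]" = Counter()
    prev: Optional[int] = None
    for addr in trace:
        if prev is not None:
            edges[(prev, addr)] += 1
        prev = addr
    return edges


def cfg_to_dot(edges: "Counter[Tuple[int, int]]") -> str:
    """Emit one DOT edge per observed transition, labeled with its count."""
    lines = ["digraph dyn_cfg {"]
    for (src, dst), count in sorted(edges.items()):
        lines.append(f'  "{src:#x}" -> "{dst:#x}" [label="{count}"];')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical trace: a loop body at 0x1010/0x1020 that runs twice.
TRACE = [0x1000, 0x1010, 0x1020, 0x1010, 0x1020, 0x1030]

if __name__ == "__main__":
    print(cfg_to_dot(dynamic_cfg(TRACE)))
```

    Because the graph only contains transitions that actually executed, it covers a single run; paths the sample did not take under those conditions will be missing.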

    Project 6: Create a PIN/Ghidra dynamic control dependence tool. Track the dependence of functions during execution.

    Pros: The professor and TAs are very engaged and obviously well versed in the subject. It made it easy to learn a lot from them. I feel very confident with assembly after working through this course.

    Cons: I was hoping the class would involve more use of debuggers, dumping objects from memory, unpacking binaries etc. It ended up basically just being software development for Ghidra and PIN.

    Summary: The class was very challenging and enjoyable. I probably invested at least 20 hours into each project but you have plenty of time to complete them. I would take it again.

    Rating: 4 / 5 | Difficulty: 5 / 5 | Workload: 25 hours / week

  • Georgia Tech Student | 2020-12-22T15:59:05Z | fall 2020

    Excellent course covering reverse engineering/binary analysis techniques.

    Topics include understanding assembly, reversing tools such as Ghidra, and other approaches to extracting structured info from a binary executable. This course focuses heavily on reverse engineering techniques (control flow graph extraction, data tracking analysis, static vs. dynamic analysis, etc.) and does NOT focus on pre-built/automated tools that abstract away the analysis details, such as an automated sandbox like Cuckoo, nor does it focus much on OS-specific attack points.

    Lecture materials are well organized and things flow very nicely. Projects are aligned with the lecture materials. Lectures are interesting and present complex materials in a clear manner. Projects are very technical in nature, but clearly stated. Python/Java, and coding in C was required. I had just a tiny bit of experience with writing C, and did fine; YMMV.

    Overall this has been my favorite class in the OCY program thus far :). If you believe you may be interested in binary analysis or reverse engineering, then I highly recommend this class. I rank this as hard because extracting data from assembly is a challenging task.

    Rating: 5 / 5 | Difficulty: 4 / 5 | Workload: 13 hours / week