This is indicative of two things.
1. While I can't stand the guy, ya'll need to watch Peter Thiel's talk from 10-15 years ago at Stanford about not building the same thing everyone else is, a la, the obvious thing.
2. People are really attracted to using LLMs on deep thinking tasks, off shoring their thinking, to a "Think for me SaaS". This won't end well for you, there's no shortcuts in life that don't come with a (huge) cost.
The person who showed their work and scored A's on math tests instead of just learning how to use a calculator, is better off in their career/endevours than the 80% of others who did the latter. If Laurie Wired makes an MCP for Ghirda and uses it that's one thing, you using it without ever reverse engineering extensively is completely different. I'd bet my bottom dollar that Laurie Wired doesn't prefer the MCP over her own mental processes 8/10 times.
Are you just trying to argue for the sake of arguing?
Being first and the winner requires a lot to line up, so it shouldn't be the only, default, or best setting. Pursuing this is optimizing.
Also a message from 10-15 years ago might not reflect the same context as today.
In other words, what's good for Peter Theil might not be goid for you.
It's a relocatable object file exporter that supports x86/MIPS and ELF/COFF. In other words, it can delink any program selection and you can reuse the bits for various use-cases, including making new programs Mad Max-style.
It carved itself a niche in the Windows decompilation community, used alongside objdiff or decomp.me.
I've done a case study where I've ported a Linux a.out program into a native Windows PE program without source code: https://boricj.net/atari-jaguar-sdk/2023/11/27/introduction....
Another case study was ripping the archive code from a PlayStation game and stuffing it into a Linux MIPS program to create an asset extractor: https://boricj.net/tenchu1/2024/03/18/part-6.html
Being able to hook Claude code up to this has made reversing way more productive. Highly recommend!
I guess one issue I have is that I don't have good ideas of fun projects, and that's probably something I need to actually get the motivation to learn. I can find a "hello world", that's easy, but it won't help me get an idea of what I could reverse engineer in my life.
For instance I have a smartspeaker that I would like to hack (being able to run my own software on it, for fun), but I don't know if it is a good candidate for reverse engineering... I guess I would first need to find a security flaw in order to access the OS? Or flash my own OS (hoping that it's a Linux running there), but then I would probably want to extract binary blobs that work with the buttons and the actual speaker?
The reverse engineering I've learned has generally been to fix something that has annoyed me - for example I reverse engineered part of RCT3 to fix mouse input with high poll rates and allow for resizable windows [0]. Certainly easier to approach than trying to get into a closed device since you can attach a debugger.
One thing which really helped me (and I wholeheartedly recommend) is to write simple programs, run them through the compiler and then in the disassembler. It really helps build a correspondence between program structure and its object code.
Eventually, you can make it even more fun and challenging by stripping debug symbols and turning on compiler optimisations.
Happy reversing!
The good news is that there has never been MORE resources out there. If you want to use this learning expedition as an excuse to also build up a small electronics lab then $100 on ali express to buy whatever looks cheap and interesting and then tear it apart and start poking around to find where the firmware lives. Pull the firmware, examine it, modify it and put it back :)
This guy has a discord server with a specific "book club" section where they all choose a cheap $thing and reverse engineer it: https://www.youtube.com/@mattbrwn/about
I can't help much with "traditional" app/software RE work, sorry.
Thanks a lot!
Turns out that frontier grade llms are absolutely fantastic for extremely advanced static analysis. If you go one step further and manage to get your firmware running inside of an emulator or other place where you can attach GDB... Then putting an mCP server on that as well unlocks so much insane potential.
The nightmare course explicitly talks about how to use Ghidra.
1: https://guyinatuxedo.github.io 2: https://www.roppers.org
from then you can use things like Ghidra (which supports a lot of those old CPU arches) for more advanced analysis and make the game do almost whatever the hell you want if you have the patience.
I think a lot of the skills will transfer quite well (obviously not 1:1, you will need to learn some things) to the more employable side of RE if that's what you're interested in
I guess I'm struggling to transfer that to "real-life" scenarios. Like getting something useful out of reverse engineering (getting infinite lives is interesting to see that I can tamper with the game, but it's not exactly useful).
(Thinking more of license-checking, and serial-number generation rather than infinite lives.)
The book is designed for beginner and advance users.
So for the second thing, pulling the data off chips like that typically involves some specialized hardware, and you have to potentially deal with a bunch of cryptographic safeguards to read from the chip’s memory. Not impossible though, and there are not always good safeguards, but might be worth checking out some simpler programs and working up to it, or learning some basic hardware hacking to get an idea of how that process works.
Well that may explain it, then, thanks for letting me know.
I realise that my question was not super clear because... well I didn't really know what to ask :-). I was just trying to engage in a human interaction. Say I am at a party with friends and strangers, and when I get introduced to a stranger, they say "I am a professional reverse engineer". Because I find that interesting, I will start asking questions. And I may well start trying to explain what I find interesting, giving the expert an angle to start talking about it.
Of course I could just go home and read about reverse engineering. But at that moment, in that party, I want to enjoy a discussion about it with a human being. Part of the experience is that I get to hear what some other human thinks about it.
I am not there for a formal course, I am there to listen to what a human being has to say about it. And obviously an LLM cannot do that job :-).
I think we should conclude people want to maximize learning while minimizing wasted time, hence they ask for the "best resources". Even though the question seems tiring at times (when I was on reddit I heard this constantly, and cynically projected that very few people actually used the resources they requested. But I solved this problem by quitting/getting banned from Reddit and never looked back).
I can explain my intent, since I asked the question :-).
"Signal interest in something in the hope of starting a discussion with people who share that interest and may have interesting stories to share".
I loved IRC for that. I could join a channel, ask a question and sometimes someone knowledgeable would engage in a discussion with me. Often nobody answered, but because IRC was "ephemeral", I could ask again another time, and another one, hoping to eventually find someone interested.
> I think we should conclude people want to maximize learning while minimizing wasted time
In my case (and I want to believe that in many other cases), it's really just that people (me, here) would like to have some human interaction about a topic.
I know how to learn, I was not asking about that. I was trying to start a conversation with humans, that's all.
Totally fair, and I'm sorry you got a hostile response.
My (very low-value) opinion is don't waste your time learning how exploits work. Yeah it's kinda neat seeing clever misuse of components. But there is very little upside to investing in that knowledge.
0. You look at old exploits and marvel at them for a while, but they are long ago patched and technically useless.
1. You waste a bunch of time looking for a sploit but don't find one.
2. You find one but nobody cares, you don't get street cred. The sploit is patched in the next release, and you don't get back your time spent finding it.
3. You find a sploit but all you get is a thanks from the billion dollar company, followed by a patch.
4. You create an exploit and use it maliciously or sell it to a criminal syndicate. you are a criminal. Or you get sued because it's a civil/copyright issue.
5. You find a sploit and other people treat you as a criminal even though you didn't do anything with it. You even intended to help.
6. You find sploits but still can't get a job as a white hat because other people who found more sploits got the job.
The only good outcomes are:
7. You found a very clever sploit and got a bounty for it.
8. You got hired in cyber security and get paid for sploits or countering them.
9. You seriously just love decoding machine instructions and find joy from making it do unintended things.
Overall, I think the risk/reward ratio is suboptimal for this field unless you go black-hat which is obviously fraught with moral and legal hazards.
Oh wait... Right.
Asking for resources or asking "does anyone know where I can start?" Followed by a description of "here's where I'm at" has been table stakes for the uninitiated since time immemorial.
When I see "ask the LLM", all I hear is "prop up my investment portfolio".
To this OP in particular: try playing around with different binaries you already have source to, and using the RE tools to get a feel for their post compilation structure and flow; start by compiling with no compiler optimization. You'll want an understanding of what the structural primitives of "nothing up my sleeve" code reads and looks like post-compilation to build off of. Then start enabling different layers of optimization, again, to continue familiarizing yourself with output of modern compilers when dealing with fundamentally "honest" code.
Once you can eyeball things and get an intuitive sense for that sort of thing is where you jump off into dealing with dishonest code. Stuff put through obfuscators. Stuff designed to work in ways that hide what the actual intent of the code is, or things designed in ways that make it clear that the author had something up their sleeve.
It'll be a lot of work and memorization and pattern recognition building, and you'll have to put in the effort to get to know the hardware and memory architecture, and opcodes and ISA's, and virtual machines you're reversing for, but it will click eventually.
Just remember; odds are it won't make you money, and it will set time on fire. I cut my teeth on reversing some security firm's snake oil, and just trying to figure out why the code I wrote was acting weird after the compiler got done with it. (I have cursed at more compiler writers than about anyone but myself).
Then just remember that if someone got it to run, then it's gotta eventually make sense. The rest is all persistence on your part of laying bare their true, usually perverted motivations (generally boiling down to greed, job security, or wasting your goddamn time).
Would the world be nicer if that wasn't the case? Absolutely. I lived through a period where a lot of code wasn't "something up my sleeve" code. Now is not so much that time anymore. We've made programming too accessible to business types that now the interests of organization's at securing their power has a non-trivial distortion on how code gets written; which generally means user hostile in one way or another.
Even pre llm, there was a clear indicator of someone who was skilled at coding versus someone who was not. The big thing that differentiated people was curiosity. When someone is curious, they would go look stuff up, experiment, figure out how to build things by failing over and over again, and eventually they would figure it out, but consequently, they have learned quite a lot more along the way.
And then there were people that were just following instructions, who in interviews though that them following instructions was virtue worthy.
Nowdays, this is even easier to tell who is who, because LLMs essentially shortcut that curiosity for you. You don't have to dig through the internet and play around with sandbox code, you can just ask an LLM and it will give you answers.
This is why I specifically said if you are hesitant of starting with LLMs, you should learn how to learn first, which usually starts with learning how to ask questions.
In my opinion, it is extremely important for the interviewer to realise that they are in a dominant position. Here, I can tell you what I think about how you judged me. If I was an interviewee, I may not be in a position to lose the job just because I told you that you are being rude.
Anyway, I would recommend YouTube. Find a series you can follow along. Best of luck!
It works surprisingly nicely with AI agents (I mean, like Cursor or Claude Code, I don't let it run autonomously!).
Here on detecting malware in binaries (https://quesma.com/blog/introducing-binaryaudit/). I am now in process of recompiling and old game Chromatron, from PowerPC binary to Apple Silicon and WASM (https://p.migdal.pl/chromatron-recompiled/, ready to play, might be still rough edges).
It's difficult to be an AI doomer when you see stuff like this.
As well as the research history (slated to be updated in a few days): https://mahaloz.re/dec-progress-2024
I've used IDA, Ghidra, and Binary Ninja a lot over the years. At this point I much prefer Binary Ninja for the task of building up an understanding of large binaries with many thousands of types and functions. It also doesn't hurt that its UI/UX feel like something out of this century, and it's very easy to automate using Python scripts.
Binary Ninja – an interactive decompiler, disassembler, debugger - https://news.ycombinator.com/item?id=41297124 - Aug 2024 (1 comment)
Binary Ninja – 4.0: Dorsai - https://news.ycombinator.com/item?id=39546731 - Feb 2024 (1 comment)
Binary Ninja 3.0: The Next Chapter - https://news.ycombinator.com/item?id=30109122 - Jan 2022 (1 comment)
Binary Ninja – A new kind of reversing platform - https://news.ycombinator.com/item?id=12240209 - Aug 2016 (56 comments)
Can't speak to this as I don't RE for security purposes, but:
> no plugin support and rather limited IR.
this I'm profoundly confused by. BN has multiple IRs that are easily accessible both in the UI and to scripts. And it certainly has a plugin system too.
https://github.com/Vector35/binaryninja-api/releases/downloa...
I once tried learning how to RE with radare2 but got very frustrated by frequent project file corruption (meaning radare2 could no longer open it). The way these project files work(ed?) in radare2 at the time was that it just saved all the commands you executed, instead of the state. This was brittle, in my experience.
I don't have a lot of free time, so I have to leave projects for long periods of time, not being able to restart from a previous checkpoints meant I never actually got further.
IIUC, one of the first things Rizin did was focus on saving the actual state, and backwards/forwards-compatibility. This fact alone made me switch to Rizin. To its credit, my 3-year old project file still works!
Now for the downside: there is apparently a gap in Windows (32-bit) PE support, causing stack variables to be poorly discovered: https://github.com/rizinorg/rizin/issues/4608. I tested this on radare2, which does not have this bug. I'm hoping this gets fixed in Rizin at some point, at which point I'll continue my RE adventure. Or maybe I should give an AI reverse engineer a try... (https://news.ycombinator.com/item?id=46846101).
(Funny expression, that. I'll wait, of course. It'll be a happy day when this works again and I can slowly make progress RE'ing again.)
Your corruption frequency anecdote matches mine. I don't have the mental werewithal to deal with that. I won't go back to radare2 until they change their project file stability somehow.
For embedded IDA is very ergonomic still, but since it’s not abstract in the way Ghidra is, the decompiler only works on select platforms.
Ghidra’s architecture lends itself to really powerful automation tricks since you can basically step through the program from your plugin without having an actual debug target, no matter the architecture. With the rise of LLMs, this is a big edge for Ghidra as it’s more flexible and easier to hook into to build tools.
The overall Ghidra plugin programming story has been catching up; it’s always been more modular than IDA but in the past it was too Java oriented to be fun for most people, but the Python bindings are a lot better now. IDA scripting has been quite good for a long time so there’s a good corpus of plugins out there too.
(not if you're only doing x86/ARM stuff, though)
I was recently trying to analyse a 600mb exe (denuvo/similar). I wasted a week after ghidra crashed 30h+ in multiple times. A seperate project with a 300mb exe took about 5h, so there's some horrible scaling going on. So I tried out Ida for the first time, and it finished in less than an hour. Faced with having decomp vs not, I started learning how to use it.
So first difference, given the above, Ida is far far better at interrupting tasks/crash recovery. Every time ghidra crashed I was left with nothing, when Ida crashes you get a prompt to recover from autosave. Even if you don't crash, in general it feels like Ida will let you interrupt a task and still get partial results which you might even be able to pick back up from later, while ghidra just leaves you with nothing.
In terms of pure decomp quality, I don't really think either wins, decomp is always awkward, it's awkward in different ways for each. I prefer ghidra's, but that might just be because I've used it much longer. Ida does do better at suggesting function/variable names - if a variable is passed to a bunch of functions taking a GameManager*, it might automatically call it game_manager.
When defining types, I far prefer ida's approach of just letting me write C/C++. Ghidra's struct editor is awkward, and I've never worked out a good way of dealing with inheritance. For defining functions/args on the other hand, while Ida gives you a raw text box it just doesn't let you change some things? There I prefer the way ghidra does it, I especially like it showing what registers each arg is assigned to.
Another big difference I've noticed between the two is ghidra seems to operate on more of a push model, while Ida is more of a pull model - i.e. when you make a change, ghidra tends to hang for a second propagating it to everything referencing it, while Ida tries pulling the latest version when you look at the reference? I have no idea if this is how they actually work internally, it's just what it feels like. Ida's pull model is a lot more responsive on a large exe, however multiple times I've had some decomp not update after editing one of the functions it called.
Overall, I find Ida's probably slightly better. I'm not about to pay for Ida pro though, and I'm really uneasy about how it uploads all my executables to do decomp. While at the same time, ghidra is proper FOSS, and gives comparable results (for small executables). So I'll probably stick with ghidra where I can.
During the startup auto analysis? For large binaries it makes sense to dial back the number of analysis passes and only trigger them if you really need them, manually, one by one. You also get to save in between different passes.
I figured I probably could remove some passes, but being a lite user I don't really know/didn't want to spend the time learning how important each one is and how long they take. Ida's defaults were just better.
Ghidra is the better tool if you're dealing with exotic architectures, even ones that you need to implement support for yourself. That's because any architecture that you have a full SLEIGH definition for will get decompilation output for free. It might not be the best decompiler out there, sure, but for some architectures it's the only decompiler available.
Both are generally shit UX wise and take time to learn. I've mostly switched from IDA to Ghidra a while back which felt like pulling teeth. Now when I sometimes go back to IDA it feels like pulling teeth.
- AVR
- Z80
- HC08
- 8051
- Tricore
- Xtensa
- WebAssembly
- Apple/Samsung S5L87xx NAND controller command sequencer VLIW (custom SLEIGH)
And probably more that I've forgotten.It's also not about lack of support, but the fact that you have to pay extra for every single decompiler. This sucks if you're analyzing a wide variety of targets because of the kind of work you do.
IDA also struggles with disasm for Harvard architectures which tend to make up a bulk of what I analyze - it's all faked around synthetic relocations. Ghidra has native support for multiple address spaces.
Maybe we need to get some good cracked^Wcommunity releases of Binja so that we can all test it as thoroughly as IDA. The limited free version doesn't cut it unfortunately - if I can't test it on what I actually want to use it for, it's not a good test.
(also it doesn't have collaborative analysis in anything but the 'call us' enterprise plan)
https://www.youtube.com/watch?v=d7qVlf81fKA&list=PL4X0K6ZbXh...
(#3 forward uses Ghidra)
It worked fine in Ubuntu and Windows. The interface takes some getting used to, but paired with Bless Unofficial (using snap to install), it makes reverse engineering smooth.
I was a special agent with an org involved in similar work. They put me through 7 SANS courses, including paying for 5 certs, in 18 months.
(Btw, these links are just for anyone curious to read more - reposts are fine after a year or so - https://news.ycombinator.com/newsfaq.html)
NSA Ghidra open-source reverse engineering framework - https://news.ycombinator.com/item?id=40508777 - May 2024 (61 comments)
Ghidra 11.0 Released - https://news.ycombinator.com/item?id=38740793 - Dec 2023 (11 comments)
Ghidra 10.3 has been released - https://news.ycombinator.com/item?id=35908418 - May 2023 (6 comments)
NSA Ghidra software reverse engineering framework - https://news.ycombinator.com/item?id=35324380 - March 2023 (103 comments)
Ghidra: Software reverse engineering suite developed by NSA - https://news.ycombinator.com/item?id=33226050 - Oct 2022 (42 comments)
Ghidra: A software reverse engineering suite of tools developed by the NSA - https://news.ycombinator.com/item?id=27818492 - July 2021 (142 comments)
Ghidra 9.2 - https://news.ycombinator.com/item?id=25086519 - Nov 2020 (78 comments)
The Ghidra Book - https://news.ycombinator.com/item?id=24879314 - Oct 2020 (5 comments)
Ghidra Decompiler Analysis Engine - https://news.ycombinator.com/item?id=19599314 - April 2019 (30 comments)
Ghidra source code officially released - https://news.ycombinator.com/item?id=19572994 - April 2019 (7 comments)
Ghidra Capabilities – Get Your Free NSA Reverse Engineering Tool [pdf] - https://news.ycombinator.com/item?id=19319385 - March 2019 (17 comments)
Ghidra, NSA's reverse-engineering tool - https://news.ycombinator.com/item?id=19315273 - March 2019 (405 comments)
Ghidra - https://news.ycombinator.com/item?id=19239727 - Feb 2019 (59 comments)
NSA to Release Their Reverse Engineering Framework GHIDRA to Public at RSA - https://news.ycombinator.com/item?id=18828083 - Jan 2019 (90 comments)
It's certainly not the first thing they've released (selinux, for one, and then all the other repos in the account), but this repo showing up on HN, with a prominent call-to-action to look at a career with them, is a great way to target the applicants you want ("those who would find this project interesting, because it's just the sort of thing we need them to work on")
Atlassian used to do (maybe still does) this in bitbucket if you open dev tools - a link to their careers page shows up
They create executables, which contain encrypted binary data. Then, when the executable runs, it decodes the encrypted data and pipes it into "sh".
The security is delusional here - the password is hard coded in the executable. It was something like "VIVOTEK Inc.".
Ghidra was able to create the C code and I was able to extract also the binary data to a file (which is essentially the bash script).
amazing tool
when i try to expand their faq, it seem to try an open a (presumabl) malicious link , i wont paste the link here just in case it is really malicious
I wonder what is the purpose of ghidralite dot com. SEO spam? Are they building trust and then will swap out the Download button with a poisoned binary.
Ghidra excels because it is extremely abstract, so new processors can be added at will and automatically have a decompiler, control flow tracing, mostly working assembler, and emulation.
IDA excels because it has been developed for a gazillion years against patterns found in common binaries and has an extremely fast, ergonomic UI and an awesome debugger.
For UI driven reversing against anything that runs on an OS I generally prefer IDA, for anything below that I’m 50/50 on Ghidra, and for anything where IDA doesn’t have a decompiler, Ghidra wins by default.
For plugin development or automated reversing (even pre LLMs, stuff like pattern matching scripts or little evaluators) Ghidra offers a ton of power since you can basically execute the underlying program using PCode, but the APIs are clunky and until recently you really needed to be using Java.
I think what NSA is likely to keep confidential are in-house plugins that are so specialized and/or underengineered that their publication would give away confidential information: stolen and illegitimate secrets (e.g. cryptographic private keys from a game console SDK), or exploits that they intend to deny knowledge of and continue milking, or general strategies and methods (e.g. a tool to "customize" UEFI images, with the implication that they have means to install them on a victim's computer).
Ghidra takes a program and unravels the machine code back into assembly and thus, something resembling C code. Allowing you to change behavior.
Cheat Engine doesn’t modify the binary. Ghidra can.
To clarify for other people who may not be familiar, (though I'm far from an expert on it myself) you can inject/modify asm of a running binary with CE. I'm not sure if there's a way to bake the changes to the exe permanently.
Edit: Wikipedia has a table with 1.0 being 2003 https://en.wikipedia.org/wiki/Ghidra