5 Key Lessons From The Crowdstrike Outage
Will the world’s biggest IT outage change the tech landscape forever?
WOW
That is all I can say after reading the news and interacting with my tech network these last couple of days.
The Crowdstrike outage, already dubbed “the biggest IT outage in history,” has shaken the world, and rightfully so.
A simple software update brought airports, banks, governments, and even emergency services screeching to a halt with the dreaded Blue Screen Of Death flashing everywhere.
Unlike most movies, where this turns out to be the work of a cybercriminal mastermind, it turned out NOT to be a cyberattack.
It was a screw-up in the latest update from Crowdstrike, a popular security solution.
Crowdstrike is a big name within the Cybersecurity space and is trusted by some of the biggest corporations and governments across the globe.
To say this company has damaged their standing in the industry would be a bit of an understatement!
But that aside, it is important to extract the key lessons from this incident to make sure we do not make the same mistakes.
While this is an evolving situation, here are a few key takeaways you can benefit from right now.
1 — Learn Not Mock
I am guilty of this myself!
While it is fun to post memes and laugh at how stupid some companies can be, remember that your company could be in the news next time.
The risk of human error is always present, even in our new AI-driven world.
Have a helpful attitude and wish Crowdstrike well, as they are facing the worst kind of PR crisis a company can face right now.
It will take a long time for them to rebuild their reputation within the Cybersecurity industry and regain the status they had before.
2 — Cyberattack or NOT?
This is where I disagree strongly with Crowdstrike.
Their official statement as of today still reads, “This issue is not the result of or related to a cyberattack.”
I beg to differ: it may not have been an attack, but availability is one of the key tenets of Cybersecurity, and this incident very much relates to it.
This will most definitely get flagged as a Cybersecurity incident, with some CIOs jittery about which endpoint agent is going to collapse next.
The industry that is going to be impacted most is Cybersecurity, with CISOs having to justify every solution they have running on Servers and Endpoints.
Expect difficult conversations between CISOs and Vendors in the following weeks and months!
3 — Threat Model This ASAP
It still blows my mind when I hear some people saying:
“We use MacBooks, so we dodged this bullet.”
“We do not use Crowdstrike, so thank God!”
Argh...
Today it was Crowdstrike; tomorrow it will be something else.
Today’s IT environments are a mishmash of software agents and vendors, which are slowly becoming single points of failure.
It is crucial to assess this scenario and ask yourself:
What happens if all my Windows servers and endpoints get taken out?
Are we able to move to a cloud-based service?
Do we have other endpoint agents we can rely on?
Rest assured that cyber criminals are already seeing the damage this outage has caused and thinking about how they can profit from it!
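One way to start answering the questions above is to work out how much of your estate each operating system or agent could take down on its own. Here is a minimal Python sketch of that idea. The host names, agent names, and the 50% threshold are all made up for illustration; in practice the inventory would come from your own asset database.

```python
from collections import defaultdict

# Hypothetical inventory: host -> (operating system, installed agents).
# Every name here is invented for illustration.
INVENTORY = {
    "web-01": ("windows", {"edr-agent-a", "backup-agent"}),
    "web-02": ("windows", {"edr-agent-a"}),
    "db-01": ("linux", {"edr-agent-a", "monitoring-agent"}),
    "laptop-17": ("macos", {"edr-agent-b"}),
}


def blast_radius(inventory):
    """Return the fraction of hosts that depend on each OS and each agent."""
    hosts_by_component = defaultdict(set)
    for host, (os_name, agents) in inventory.items():
        hosts_by_component[f"os:{os_name}"].add(host)
        for agent in agents:
            hosts_by_component[f"agent:{agent}"].add(host)
    total = len(inventory)
    return {name: len(hosts) / total for name, hosts in hosts_by_component.items()}


def single_points_of_failure(inventory, threshold=0.5):
    """List components whose failure would hit at least `threshold` of all hosts."""
    radius = blast_radius(inventory)
    return sorted(name for name, share in radius.items() if share >= threshold)


if __name__ == "__main__":
    radius = blast_radius(INVENTORY)
    for name in single_points_of_failure(INVENTORY):
        print(f"{name}: {radius[name]:.0%} of hosts")
```

Anything that shows up in this list deserves its own "what if this dies tomorrow?" plan, whether or not it is Crowdstrike.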
4 — Check your patch management processes
Never patch on or just before a weekend.
When you patch, do it in phases rather than as one bulk update.
You can imagine the millions of IT support people cursing Crowdstrike for ignoring these simple lessons.
Even after the fix was released, its initially manual nature meant that scaling it to thousands of servers and endpoints was going to be a nightmare for IT support teams.
Add to that the fact that most companies using Crowdstrike have encrypted their servers, so putting them into safe mode is not straightforward.
Not to mention, cloud-based servers cannot just be put into safe mode for fixing; you have to detach the storage, fix it, and then attach it again.
This will be a massive test for companies and their disaster recovery processes.
Thankfully, a lot of scripts have been released that automate these processes, but the initial support was a nightmare.
If your company releases patches, this will be a good time to re-assess your patching practices!
Never patch on weekends, and whenever you do patch, use a phased approach that allows you to roll back to a safe state.
(I cannot believe I have to mention this explicitly in 2024!)
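To make "phased, with rollback" concrete, here is a rough Python sketch of a ring-based rollout. It is only an illustration under assumptions: `apply_update`, `is_healthy`, and `rollback` are hypothetical callbacks you would wire up to your own tooling, and the ring sizes are arbitrary. It does not describe how Crowdstrike or any other vendor actually ships updates.

```python
def phased_rollout(hosts, apply_update, is_healthy, rollback,
                   phases=(0.01, 0.1, 1.0)):
    """Roll an update out in growing rings; stop and roll back on any failure.

    `phases` is the cumulative fraction of the fleet covered after each ring.
    Returns (succeeded, hosts_left_on_the_new_version).
    """
    updated = []
    for fraction in phases:
        # Always update at least one host so tiny fleets still get a canary.
        target = max(1, int(len(hosts) * fraction))
        ring = hosts[len(updated):target]
        for host in ring:
            apply_update(host)
            updated.append(host)
        # A host that cannot report in should count as unhealthy here.
        if not all(is_healthy(host) for host in ring):
            for host in reversed(updated):
                rollback(host)
            return False, []
    return True, updated


if __name__ == "__main__":
    fleet = [f"host-{i}" for i in range(20)]
    versions = {host: "v1" for host in fleet}

    ok, on_new_version = phased_rollout(
        fleet,
        apply_update=lambda h: versions.__setitem__(h, "v2"),
        is_healthy=lambda h: False,  # simulate an update that breaks every host
        rollback=lambda h: versions.__setitem__(h, "v1"),
        phases=(0.1, 0.5, 1.0),
    )
    print("rollout succeeded:", ok)
    print("hosts still on v1:", sum(v == "v1" for v in versions.values()))
```

The point of the design is that a bad update only ever reaches the first ring before someone (or something) notices, instead of the whole fleet at once. A real pipeline would also wait between rings so slow-burning failures have time to surface.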
5 — Re-look at your software supply chain
The Software Supply Chain is one of the biggest blind spots in tech and Cybersecurity.
No software is made 100% from scratch; it is a mishmash of software libraries, companies, and dependencies.
Even if you cannot remove these dependencies, you can at least be aware of what you have and its risk posture.
Assume that things will fail or get compromised, and then prepare based on that assumption.
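Being "aware of what you have" mostly means knowing your indirect dependencies as well as the direct ones. Here is a small Python sketch that walks a dependency graph and reports the risk rating you have on file for each item. The graph, the component names, and the risk labels are all hypothetical placeholders; a real inventory would come from your SBOM or package manager.

```python
from collections import deque

# Hypothetical dependency graph: component -> its direct dependencies.
DEPENDS_ON = {
    "payments-app": ["web-framework", "edr-agent"],
    "web-framework": ["json-lib", "tls-lib"],
    "edr-agent": ["kernel-driver"],
    "json-lib": [],
    "tls-lib": [],
    "kernel-driver": [],
}

# Hypothetical risk notes you maintain yourself; missing entries are unknowns.
RISK_POSTURE = {
    "web-framework": "medium",
    "edr-agent": "high",
    "kernel-driver": "high",
    "tls-lib": "medium",
}


def transitive_dependencies(root, graph):
    """Breadth-first walk returning everything `root` depends on, directly or not."""
    seen = set()
    queue = deque(graph.get(root, []))
    while queue:
        dep = queue.popleft()
        if dep in seen:
            continue  # already visited; also protects against cycles
        seen.add(dep)
        queue.extend(graph.get(dep, []))
    return seen


def risk_report(root, graph, posture):
    """Map each dependency of `root` to its recorded risk, or 'unknown'."""
    return {dep: posture.get(dep, "unknown")
            for dep in sorted(transitive_dependencies(root, graph))}


if __name__ == "__main__":
    for dep, risk in risk_report("payments-app", DEPENDS_ON, RISK_POSTURE).items():
        print(f"{dep}: {risk}")
```

The "unknown" entries are the interesting ones: those are the dependencies you are relying on without ever having assessed them.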
I hope you enjoyed reading this.
Good luck to you and your IT team if you have been impacted by this outage!
I hope you are back to 100% soon.






Great set of lessons. It wasn’t a patch that caused this problem, but poor coding practices. An empty meta content update file led to a null pointer exception in Crowdstrike’s buggy kernel driver code (unchanged during this update). This led to the Windows OS halting to prevent further damage. Had Crowdstrike developers written the kernel driver code in a defensive manner, none of this would’ve happened.