Hey everybody! Thanks so much for all the kind words here and on Twitter, it has been an awesome experience. I just wanted to pop out a quick note with an update on what I'm up to.
As soon as the dust settled, I sat down and started trying to rework the thread into longform and write up some thoughts - on what I wrote, what has happened since, and what I think will happen next, along with answers to a lot of questions and adjustments for valid critiques that I got on the thread. I've been writing more or less nonstop since.
So right now, my target is a 4-part series of posts about Twitter:
Part 1: Twitter's Terrifying Everyday Threats
Part 2: Twitter Really Can Die
Part 3: Predicting Twitter's Future
Part 4: You Can't Just Make A Twitter
Part 1 is a giant list, indexed by organization and team, of threats that teams at a social network of Twitter's size have to deal with every single day. And very importantly, these are all threats which are a hell of a lot harder to deal with when you're running a skeleton crew which had insufficient time for a proper tribal knowledge transfer.
Part 2 is a list of actually existential threats to Twitter the company, Twitter the infrastructure, and Twitter the product. Some are touched on in Part 1, but their existential nature is the subject of Part 2 - these are the things that you *cannot* get wrong, the things that you *cannot* allow to happen, and in a very real sense these are the things that *can* kill your company or cripple it so badly you might as well put it out of its misery. Some mean instant death, and some are the kind of slow death by papercut that an inexperienced managerial squad may never even recognize as threats in the first place.
Part 3 is where I say what I actually think is going to happen to Twitter over the coming weeks, months, and years.
Part 4 is where I go over what we've learned so far about what Twitter is and what's likely to happen to it, and then talk about Twitter's place in our lives and in the tech ecosystem: why the existing alternatives are poor alternatives, why the existing replacements are poor replacements, and why I personally think we're really unlikely to see Twitter reborn as anything else any time soon.
There's a lot left to do, but I've finished writing out about 70% of Part 1 and half of Part 4, and I have the basic contents sketched out for Parts 2 and 3. No guarantees, but I'm aiming to have Part 1 out later this week if I can.
On a semi-related note - I started writing up something about my own history and qualifications in the field, because I wanted to establish that I do, in fact, know what I'm talking about. But I decided not to include it, because I think in this instance it distracts from the point too much and doesn't really accomplish anything. In the future I'll keep writing vignettes like the first post here, and I have some deep dives planned into projects I worked on in the past - but if you're interested in a writeup of just who I am and how I got here, just let me know and I'll be happy to write it up. My people love any excuse to tell a good story.
Anyway, it's late here so I'd better call it. I'm feeling excited to be working on the project, and I'm learning a lot from trying. Thank you so much to everyone who encouraged me to give this writing thing a shot - it means more than I can say.
I got interviewed! By several outlets, but especially… Bloomberg! Let me tell you - a country boy's constitution isn't really cut out for this kind of thing. I managed to not vomit, but I also didn't really sleep for a couple nights there.
Funnily enough, I've been on vacation visiting my brother in Taiwan this entire time (including that oh-so-fateful afternoon when I decided to drink an iced coffee and bang out a thread about SRE stuff on my fintwit alt...). Thankfully he doesn't mind me being completely immersed in writing things. Usually it's code, but either way I think by now he's accustomed to it. We've still had plenty of time for stinky tofu, train trips, taking the dogs to the beach, and getting big smiles from neighbors when we go out to practice our Taiwanese.
The comments on the original thread were 99.9% pure positivity and fun, but man that 0.1% were mad. I had people call me a fraud because I thought Twitter runs on-prem (they do), that a tree could take out an aerial fiber line to a DC and cause a SEV (I was at FB when it happened), and that a full drive/partition can still cause failures in modern infra (I know it's far more rare than the olden days, but trust me - if, anywhere in your sprawling infrastructure, someone is going to be relying on some oddball enterprise program set up by vendors and interns in a hurry: you will someday find yourself staring at output from `df` and cursing my name).
My two cents: If you get, like, viscerally angry at the idea that there are things in a large company's infrastructure *not* set up correctly by flawless actors who are taking 100% perfect advantage of the redundancy & resilience offered by well-designed modern systems at all times, using those systems to their full potential in all aspects of design, implementation, monitoring, and remediation... I suggest that you don't take up SRE as a career, because you're going to be mad every day for the next 20 years of your life.
I'm very much looking forward to your coming posts. I'm Old School, but enjoy opportunities to become slightly less so.
Back in the late 80's/early 90's one of my most humbling experiences was having the NSA explain to me how I was about 5 levels back in "security awareness" from what they thought the *absolute minimum* was, and they proceeded to educate me. It was the first time I had encountered "air gap" as a security level. :-) Needless to say, you can't do "air gap" and access the Internet at the same time, so you have to do all of the work to defend against Barbarians at the Gate. sigh.