I’m no longer active on Twitter. Find me on Bluesky or on Mastodon.
If I follow you here or know you personally, you can DM me for a Bluesky invitation code. (Sorry, don’t have enough invitation codes for people I don't know.)
In 1665, the University of Cambridge temporarily closed due to the bubonic plague. Isaac Newton had to work from home, and he used this time to develop calculus and the theory of gravity.
✨New educational materials!✨
Today I'm announcing two new, free resources I've written: an 8-lecture course on fundamentals of distributed systems, and a 30-page tutorial on elliptic curve cryptography.
My partner and I had a baby! I’m on parental leave for the next 3 months (plus planning to work part-time for a while later). Therefore I probably won’t be tweeting much. Feeling fortunate to be in a place with good maternity care and generous parental leave ❤️
Pro tip: instead of paying $399/year for @safari books online (frankly outrageous), you can get ACM professional membership for $99/year, which includes Safari access.
My partner is learning programming. When she struggles to figure something out, but I can solve it in a few seconds, I remind her: she can sit at the piano and easily sight-read a piece she’s never seen, while I have to struggle through note by note. Skills require practice.
Estimated number of clock cycles it takes to do various things on a modern CPU, e.g. atomic CAS or context switch: — useful for a concurrent systems course I'm currently teaching
I am concerned that a lot of computer systems research is solving problems that only huge tech companies have (eg. fast datacenter networking, big data analytics systems), rather than working on technologies that empower individual users and the underprivileged.
By the same token, it should be a sobering moment for computer science academia. With few exceptions, work that tries to bring accountability to big tech companies is relegated to the fringes of our discipline. CS these days cozies up to power far more than speaking truth to it.
Very excited to announce that I've been awarded a €1.3m grant to start a research group on local-first software at TU Munich @TU_Muenchen! This will enable us to build the foundation for collaboration software of the future.
I give away lots of work for free: conference talks, educational videos, research papers, open source code, blog posts. If you want me to continue doing this, please consider supporting me on Patreon! Thanks ✨
➡️
Blog post:
Netflix have made their own Change Data Capture (CDC) tool for MySQL and PostgreSQL… though maybe it would have been better to contribute to @debezium?
My book @intensivedata is apparently the second-best-selling book across @OReillyMedia's entire book catalog in 2019. Delighted that so many people are finding it useful!
How amazing is it that GPS actually works? Satellites are at least 20,000 km away; measuring the distance to ~2m accuracy means 0.00001% error. Also the satellite is moving at 3 km/sec and the measurement equipment fits in your phone. What other consumer device has such accuracy?
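As a rough check of the figures above, the relative error can be worked out directly. This is a small sketch using only the numbers quoted in the text (20,000 km, ~2 m); the speed of light is the standard defined value, and the variable names are illustrative.

```python
# Relative error of a ~2 m range measurement over ~20,000 km.
distance_m = 20_000 * 1_000   # 20,000 km expressed in metres (figure from the text)
accuracy_m = 2.0              # ~2 m accuracy (figure from the text)

relative_error = accuracy_m / distance_m
relative_error_percent = relative_error * 100

# GPS ranging works by timing a radio signal, so 2 m of distance
# corresponds to a very small slice of time.
SPEED_OF_LIGHT_M_PER_S = 299_792_458
timing_ns = accuracy_m / SPEED_OF_LIGHT_M_PER_S * 1e9

print(f"relative error: {relative_error_percent:.5f}%")
print(f"equivalent timing resolution: {timing_ns:.2f} ns")
```

In other words, 2 m of range corresponds to a few nanoseconds of signal travel time.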
I was asked to give a keynote at the ACM Conference on Distributed and Event-based Systems (@ACM_DEBS). So I tried to summarise everything I know about events in 50 minutes!
🎥
📄
Hope you enjoy it ✨
Google's Service Weaver claims that it can hide the difference between RPC and local method calls. Just like CORBA claimed in 1991. Is it different this time? The docs say barely a word about handling failures, such as RPCs timing out
Apparently I’ve now crossed 30k followers... hello everybody and welcome! 👋 What are you hoping to hear about? Is miscellaneous distributed systems geekery good? 😊
I finally got round to reading @ifesdjeen's Database Internals. It's a nice book! If you enjoyed @intensivedata and want more detail on certain topics (especially storage engines and consensus), it's a good follow-on read.
I have officially started at @TU_Muenchen as of this week! Over the coming months and years I will be building up a research group to focus on local-first software ✨🧑🔬
Today in “distributed systems are hard”: I wrote down a simple CRDT algorithm that I thought was “obviously correct” for a course I’m teaching. Only 10 lines or so long. Found a fatal bug only after spending hours trying to prove the algorithm correct. 😭
New video! “CRDTs: The Hard Parts” 🎥
This talk explains several new research results in CRDTs, including problems in some published algorithms, weird edge cases, and work we’re doing to improve performance.
Our article “Online Event Processing: Achieving consistency where distributed transactions have failed”, with @arberesford and @bsvingen:
Now in print in the latest edition of @CACMmag: (paywall) and online in @ACMQueue: (free)
The distributed systems course covers many of the fundamental algorithms behind today's apps. It consists of 87 pages of detailed notes (including exercises) and 7 hours of lecture videos
To all interested in CRDTs: Marc Shapiro, @anne_biene and I have set up a little CRDT community website. Lots of links to papers, blog posts, talks, and implementations. Contributions welcome!
✨ New article: Peritext, a CRDT for rich-text collaboration 📄
This one was hard work. We went through at least half a dozen algorithm designs until we found one that worked!
Joint work with @geoffreylitt @sliminality and @pvh at @inkandswitch
Such a powerful idea; keeps getting reinvented under different names (event sourcing; lambda/kappa architecture; database inside-out/unbundled; state machine replication; etc).
Reflecting on the fact that I've built some flavor of the "immutable, append-only log + materialized views" pattern into every non-trivial software project I've built since 2006.
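A minimal sketch of the "immutable, append-only log + materialized views" pattern, for readers who have not seen it spelled out. Everything here (the `Event`, `Log`, and `BalanceView` names, and the bank-balance example) is hypothetical and chosen for illustration; it is not taken from any of the projects mentioned.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """An immutable fact; events are only ever appended, never edited."""
    account: str
    amount: int  # positive = deposit, negative = withdrawal


class Log:
    """Append-only log: the single source of truth."""

    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(event)
        return len(self._events) - 1  # offset of the new event

    def read_from(self, offset):
        return self._events[offset:]


class BalanceView:
    """Materialized view: state derived purely by folding over the log."""

    def __init__(self):
        self.balances = {}
        self.next_offset = 0  # how far into the log this view has read

    def catch_up(self, log):
        for event in log.read_from(self.next_offset):
            current = self.balances.get(event.account, 0)
            self.balances[event.account] = current + event.amount
            self.next_offset += 1


log = Log()
log.append(Event("alice", 100))
log.append(Event("bob", 50))
log.append(Event("alice", -30))

view = BalanceView()
view.catch_up(log)

# A view can be thrown away and rebuilt from the log at any time.
rebuilt = BalanceView()
rebuilt.catch_up(log)
```

The useful property is that the view holds no information of its own: `rebuilt` ends up identical to `view`, so a new kind of view can be added later by replaying the same log.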
This is mind-blowingly clever work by @justinetunney. A single binary that runs natively on Windows/Linux/Mac/*BSD without any VM, implementing a web server that serves the content of a zip archive stored in the binary itself! 🤯🤯👏👏
I see parallels between musical instrument skills and programming skills. Many people learn them in their youth, but both are totally learnable as adults if you’re willing to put in the many hours of practice they require. It’s not natural talent, it’s persistence that matters.
Folks moving to Mastodon are swapping a service run by a capricious egomaniac for one where the admin of your home instance controls everything about your account. And you probably don't know what your server admin is like when you sign up. Is this really much better?
My student's experimental new database beats MySQL by approximately 700% on the TPC-C benchmark in initial tests. This research is going somewhere interesting!
I took a look at a GCSE Computing exam paper (an exam that 16-year-olds sit in the UK). It made me sad how dull, arbitrary, and pointless many of the questions are. Not appropriate for beginners. Surely this sort of material will put many kids off going into computing.
When users enter sensitive data into an app (eg. period trackers) and it’s stored unencrypted in the cloud, the app developer looks increasingly irresponsible. Local-only or local-first software with p2p data sync and/or end-to-end encryption needs to become the default.
The central server design, on the other hand, puts EVERYONE's data in one place. And someone might obtain it by hacking into the app's servers, by buying it from them, or by serving them with a legal order to turn it over. Any of these could happen without users' knowledge. 7/
I set myself a goal of not working weekends (which I used to do regularly). Looking at my GitHub activity, it seems like I've been doing okay. Hooray for avoiding burnout!
This thoughtful article by @martinfowler on the pros and cons of remote working is made even better by featuring a tortoise on a rocket-powered skateboard 🐢🔥
Every technology contributes to determining who has what power. Therefore, every technology is political. If a technology seems apolitical, that probably just means that it supports the incumbents and doesn’t challenge the status quo.
Likewise, every organisation is political.
I have never waited more than 2 minutes to vote in an election in UK/Germany. With queues like this, voter suppression campaigns, gerrymandering, a two-party duopoly, extreme partisanship, etc. the US doesn’t really look like the shining model democracy that it was once seen as.
Did you know that the breadth-first-search algorithm in most textbooks is needlessly inefficient? A simple tweak can sometimes turn an O(n^2) runtime into O(n)
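The tweak itself is not spelled out in the text above, so the following is only one candidate, shown as an assumption: stop the search as soon as every vertex has been discovered, instead of draining the queue. On a dense graph the textbook version keeps scanning adjacency lists that can no longer yield anything new. The function name and the `stop_when_all_discovered` flag are hypothetical.

```python
from collections import deque


def bfs_order(adj, start, stop_when_all_discovered):
    """Return (discovery order, number of adjacency entries inspected)."""
    discovered = {start}
    order = [start]
    queue = deque([start])
    inspected = 0
    while queue:
        # Assumed tweak: once every vertex is known, no further scan
        # of any adjacency list can discover anything new.
        if stop_when_all_discovered and len(discovered) == len(adj):
            break
        node = queue.popleft()
        for neighbour in adj[node]:
            inspected += 1
            if neighbour not in discovered:
                discovered.add(neighbour)
                order.append(neighbour)
                queue.append(neighbour)
    return order, inspected


# Complete graph on n vertices: n * (n - 1) adjacency entries in total.
n = 200
complete = {v: [u for u in range(n) if u != v] for v in range(n)}

textbook_order, textbook_work = bfs_order(complete, 0, False)
tweaked_order, tweaked_work = bfs_order(complete, 0, True)
print(textbook_work, tweaked_work)
```

On the complete graph the textbook variant inspects every adjacency entry (quadratic in n), while the early-exit variant inspects only the start vertex's neighbours (linear in n) and produces the same discovery order. On sparse graphs the two do roughly the same work, which matches the "sometimes" in the claim.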
New paper alert! 📄 Most CRDT algorithms assume that all participants correctly follow the protocol. I show that we can make CRDTs robust against Byzantine behaviour, and it's not even terribly difficult.
Our group’s work on CRDTs and collaboration software is built explicitly upon a philosophy of trying to shift power away from cloud providers, and back towards end users. More on this in our paper-cum-manifesto on “local-first software”
Modern business jargon uses the word “talent” to refer to employees. Is it just me, or is this almost as offensive as calling people “resources”? I find “talent” dehumanising, and it focuses the attention on innate characteristics rather than developing learnable skills.
Bitcoin energy consumption hits a new all time high of nearly 9GW, comparable to Chile, a country with 18M people.
Carbon footprint is ~37 Mt CO2 annually, about that of New Zealand.
And yet it still does ~4 transactions per second...
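A back-of-the-envelope sketch of what those figures imply per transaction. It uses only the numbers quoted above (~9 GW, ~4 transactions per second) plus standard unit conversions; the variable names are illustrative and the results are as rough as the inputs.

```python
power_w = 9e9        # ~9 GW continuous draw (figure from the text)
tx_per_second = 4    # ~4 transactions per second (figure from the text)

# Energy spent per transaction: watts are joules per second.
joules_per_tx = power_w / tx_per_second
kwh_per_tx = joules_per_tx / 3.6e6   # 1 kWh = 3.6 million joules

# Annual energy at that constant power level.
hours_per_year = 365 * 24
twh_per_year = power_w * hours_per_year / 1e12   # watt-hours -> TWh

print(f"{kwh_per_tx:.0f} kWh per transaction")
print(f"{twh_per_year:.1f} TWh per year")
```

At these rates each transaction accounts for several hundred kWh of electricity.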
It’s so heartwarming when strangers come up to me and tell me how my talks/book have helped them (in designing their systems, getting a job, …) ❤️ Thank you!
I just re-recorded my lecture on the Raft consensus algorithm, since the previous version had a couple of mistakes. New version here:
Accompanying lecture notes:
I am so delighted that “The Art of Doing Science and Engineering” by Richard Hamming (inventor of error-correcting codes and many other things) is now available in print again. Thank you @stripepress @_TamaraWinter for sending me a copy!
Wow. Certain SSDs have a firmware bug causing them to irrecoverably fail after exactly 32,768 hours of operation. SSDs that were put into service at the same time will fail simultaneously, so RAID won't help. 😱
adding "@NotionHQ going offline, apparently due to problems renewing their Somalian (.so) domain name" to my list of reasons why local-first software is a good idea
Tried using AWS Lambda/API Gateway for a little side project. The getting started wizard first asks you to choose between a “REST API” and a “HTTP API” (???). Then it pre-populates the new function with broken example code that crashes when you try to run it.
“Is Kafka a Database?”
That’s the (slightly provocative) title of my keynote at #kafkaSummit today. It will be live-streamed, starting at 9am Pacific Time / 5pm UK time
Excited to announce my latest fun project: I wrote an illustrated children's book on cryptography together with Mitch Seymour @_round_robin! We're launching it this Thursday, and to celebrate we're doing an AMA on Reddit (4pm UK / 8am Pacific). Come join us!
Happy that our paper on a move operation for CRDT trees has finally been accepted by IEEE TPDS! We've spent over two years trying to get this thing published, eventually successful on the sixth submission…
New paper, which @pvh and I wrote together: “PushPin: Towards Production-Quality Peer-to-Peer Collaboration”.
It's about our hard-won experience trying to build peer-to-peer local-first software using CRDTs. Very practical, lots of insights. Take a look!
My partner complained that she has to report her covid test result twice a week on a government website, requiring ~20 clicks plus login.
So we wrote a Selenium script together to automate the whole thing, simulating clicks in a web browser. Much faster, much happier 😆
What’s current best practice for GDPR compliance (in particular, right to deletion) in systems with append-only logs/event sourcing/bl*ckchains, which are supposed to keep history forever?
Fascinating Google paper on CPU cores that occasionally (and nondeterministically) compute the wrong result. Detecting and mitigating this is going to be an interesting challenge.
New paper! 😎 In which @heidiann360 and I explore Git-like hash graphs, Bloom filters, and peer-to-peer systems that are immune to Sybil attacks.
📄 Paper:
📎 Blog post:
ACM Europe calls for an EU-wide carbon tax, with no exemptions for computing, and strongly recommends against “any proof-of-work based distributed ledger technology” due to its “exorbitant power consumption”. Hooray! 🇪🇺🌍
Seph Gentle @josephgentle, author of ShareDB, explains how he came to realise that CRDTs are the future. And hardly anybody knows collaboration algorithms better than Seph!
I used to think the NATO target of spending 2% of GDP on defence was excessive, and that the US’s 3.7% was far too militaristic. But in the last week, any pacifist tendencies I once had have been shattered. Bring on those defence budgets.
Decentralised web technologies will need to figure out when, why, and how to ban abusive users. Without central control, how do systems determine what behaviour is acceptable? Some thoughts in my latest blog post