Scaling memcached at Facebook

iseff · on Dec 12, 2008

I must say, Facebook does a lot of interesting work. Things that hardcore engineers would definitely enjoy doing.

But also things I would hate to support.

For instance, I would NEVER want to have to support making a one-off change to MySQL as Facebook has done, to enable them to open a new data center (http://www.facebook.com/note.php?note_id=23844338919&id=...). What happens when MySQL is updated? They can no longer just pull the latest version. They have to pull the source, modify it appropriately, test, and then deploy. And if there are bugs, things could get very tricky, very fast.

And this memcached scaling seems like another example. Really cool concept, changing many low-level Linux things. But, really, do they want to be in the business of supporting those things? I wouldn't.

I realize they have massive scale. And I realize that at massive scale, standard solutions may not work. But perhaps they should find ways to do things at a slightly higher level, in a layer they fully control. This may also allow them to focus more on their core competencies and not waste precious developer time.

UPDATE: To make matters worse, from a mail on the memcached mailing list about this post:

<quote>

I think the results speak for themselves, but I don't know that a merge can actually occur.

The tree they published is entirely unrelated from the trees the rest of us are working on. There's no common ancestry or even similar directory layout. As published, it sort of puts us in a position to either reimplement everyone else's work, or reimplement the facebook work.

If anyone at facebook is listening, is it possible at all to add this work onto the codebase where everyone else has been working? We've got a lot of bug fixes and features we'd really like to not throw away here:

http://github.com/dustin/memcached/tree/rewritten-bin

</quote>

staunch · on Dec 13, 2008

At Facebook or Google scale it makes total sense. They're using 800 servers instead of 3,200. That's probably $5-$10 million in savings, easily enough to pay the salaries of a few developers to maintain this stuff even if that was their only job.

It does seem like it's just flimsy excuses not to merge up the changes, but hopefully they'll put more emphasis on it after a while.
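
A quick sanity check on that estimate, as a rough sketch: the server counts and the $5-$10 million range come from the comment above, while the per-server figure is only what those numbers imply, not a known hardware price.

```python
def implied_cost_per_server(total_savings: float, servers_before: int, servers_after: int) -> float:
    """Savings per avoided server implied by a total savings estimate."""
    servers_saved = servers_before - servers_after
    return total_savings / servers_saved

# Figures taken from the comment: 3,200 servers reduced to 800.
SERVERS_BEFORE = 3_200
SERVERS_AFTER = 800

for total in (5_000_000, 10_000_000):
    per_server = implied_cost_per_server(total, SERVERS_BEFORE, SERVERS_AFTER)
    print(f"${total:,} total -> about ${per_server:,.0f} per avoided server")
```

So the estimate works out to roughly $2,000-$4,000 per avoided server, which presumably has to cover hardware plus some share of power and hosting.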

retyred · on Dec 13, 2008

you can't put fb and google in the same category. fb is still surviving on investment capital and frankly could end up in a serious cash crunch. google has a blank check for hardware. both companies have reason to worry about hardware expenditures, but facebook's is more imminent

jbert · on Dec 12, 2008

We enhance open source software for use on our platform.

We'd love to get our changes into upstream, but it isn't always easy to make the time for that. As you say, all kinds of issues are raised by having our own variant, but fundamentally, if we need to fix (or add) something, we need to do it.

Merging from upstream is just merging. It's a pain and a cost, but we can choose when to do it (largely) and if it's too painful then we can spend the time to merge up.

gsmaverick · on Dec 12, 2008

Google does most of these things; in fact, they do even more. They have their own web server package, custom *nix builds as far as I know, a custom file system, MapReduce, a programming language, BigTable, etc. So nothing new!

easp · on Feb 12, 2009

You really think a site the scale of Facebook just pulls the latest version of MySQL and deploys it? I think even smaller sites like SmugMug put new releases through extensive testing before deployment. It seems to me that pulling the source, applying their patch and building is probably a pretty small part of the effort of rolling out a new version.

jeffesp · on Dec 13, 2008

I find the introduction of a hybrid polling/interrupt-driven network interface really interesting, mainly because we did the same work in 2003 on NetBSD for a wireless router that never made it to market. We were at the opposite end of the spectrum: we were running on a 100 MHz 486 and spent too much time in interrupts with all the network traffic. It's funny that they face the same situation on the latest systems out there (partly for the same reasons, partly for different ones).

lallysingh · on Dec 13, 2008

Also, for all that complaining about Linux, anyone look at other Unices?

retyred · on Dec 13, 2008

used to work with paul saab (ps) at yahoo. super smart as this post shows. could never get them to use freebsd paul? :)

wmf · on Dec 12, 2008

"Because we have thousands and thousands of computers, each running a hundred or more Apache processes, we end up with hundreds of thousands of TCP connections open to our memcached processes."

I guess using a single-process Web server would be too easy; it's practically cheating.
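
A back-of-the-envelope sketch of the connection count described above. The host and process counts are hypothetical, picked only to match the orders of magnitude mentioned, and it assumes every web-server process keeps its own persistent connection to a given memcached server:

```python
def connections_per_memcached_server(web_hosts: int, processes_per_host: int) -> int:
    """Open TCP connections seen by one memcached server, assuming each
    web-server process holds its own persistent connection to it."""
    return web_hosts * processes_per_host

# Hypothetical numbers, not taken from the post.
WEB_HOSTS = 5_000
APACHE_PROCESSES_PER_HOST = 100

# One connection per Apache process vs. one per host (single-process server).
per_process_model = connections_per_memcached_server(WEB_HOSTS, APACHE_PROCESSES_PER_HOST)
single_process_model = connections_per_memcached_server(WEB_HOSTS, 1)

print(per_process_model)     # 500000
print(single_process_model)  # 5000
```

Under these assumptions, the process-per-request model multiplies the connection count by the number of processes per host, which is the point being made here.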

look_lookatme · on Dec 12, 2008

Wow, you're a genius. Facebook should hire you.