Cloud = x86 and open source

From a high-level perspective, the basic architecture of Facebook is not that different from other high performance web services.

However, Facebook is the poster child of the new generation of Cloud applications. It's hugely popular and very interactive, and as such it requires much more scalability and availability than your average website that mostly serves up information.

The "Cloud Application" generation did not turn to the classic high-end redundant platforms with heavy Relational Database Management Systems. A combination of x86 scale-out clusters, open source web software, and "NoSQL" is the foundation that Facebook, Twitter, Google and others build upon.

Facebook has, however, improved several pieces of the open source software puzzle to make them better suited to extreme scalability. Facebook chose PHP for its presentation layer because it is simple to learn, write, and read. The drawback is that PHP is very CPU and memory intensive.

According to Facebook’s own numbers, PHP is about 39 times slower than C++ code. Thus it was clear that Facebook had to solve this problem first. The traditional approach is to rewrite the most performance critical parts in C++ as PHP extensions, but Facebook tried a different solution: the engineers developed HipHop, a source code transformer. HipHop transforms the PHP source code into faster C++ code and compiles it with g++.

The next piece in the Facebook puzzle is Memcached. Memcached is an in-RAM object caching system with some very cool features. It is a distributed caching system, which means a memcached cache can span many servers. The "cache" is thus in fact a collection of smaller caches. It basically reclaims unused RAM that your operating system would probably waste on less efficient file system caching. These “cache nodes” do not sync or broadcast to each other, and as a result the memory cache is very scalable.
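To make the "collection of smaller caches" idea concrete, here is a minimal Python sketch of how a client might spread keys over independent cache nodes. It is an illustration under stated assumptions, not memcached's or Facebook's actual client code: the class `ShardedCache`, the node names, and the simple hash-modulo placement are all hypothetical, and plain dictionaries stand in for real memcached servers. Production clients often use consistent hashing instead, so that adding a node moves fewer keys.

```python
import hashlib


class ShardedCache:
    """Toy stand-in for a distributed cache client (hypothetical, for illustration)."""

    def __init__(self, node_names):
        self.node_names = list(node_names)
        # Each node is an independent in-memory store; nodes never talk to each other.
        self.nodes = {name: {} for name in self.node_names}

    def node_for(self, key):
        # The client alone decides where a key lives by hashing it.
        digest = hashlib.md5(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % len(self.node_names)
        return self.node_names[index]

    def set(self, key, value):
        self.nodes[self.node_for(key)][key] = value

    def get(self, key):
        # Returns None on a cache miss.
        return self.nodes[self.node_for(key)].get(key)


cache = ShardedCache(["cache-a", "cache-b", "cache-c"])
cache.set("user:42", {"name": "example"})
print(cache.node_for("user:42"), cache.get("user:42"))
```

Because every key maps to exactly one node and the nodes never synchronize, adding servers adds capacity without adding coordination traffic, which is the scalability property described above.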

Facebook quickly became the world's largest user of memcached and improved it considerably. They ported it to 64-bit, lowered TCP memory usage, distributed network processing over multiple cores (instead of one), and so on. Facebook mostly uses memcached to alleviate database load.
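One common way a cache takes load off a database is the "cache-aside" pattern: read from the cache first, fall back to the database on a miss, and invalidate the cached entry on writes. The article does not say that Facebook uses exactly this pattern, so treat the following Python sketch as an assumption. The class `CacheAsideStore` and the `db_reads` counter are hypothetical names, and dictionaries stand in for both memcached and the database.

```python
class CacheAsideStore:
    """Cache-aside sketch: dicts stand in for memcached and the database."""

    def __init__(self, database):
        self.database = database  # slow, authoritative store
        self.cache = {}           # fast in-RAM cache
        self.db_reads = 0         # counts how often the database is actually hit

    def get(self, key):
        if key in self.cache:
            return self.cache[key]        # cache hit: no database work
        self.db_reads += 1                # cache miss: query the database
        value = self.database.get(key)
        if value is not None:
            self.cache[key] = value       # populate the cache for later reads
        return value

    def update(self, key, value):
        self.database[key] = value
        self.cache.pop(key, None)         # invalidate so the next read is fresh


store = CacheAsideStore({"user:1": "Alice"})
store.get("user:1")   # miss: goes to the database
store.get("user:1")   # hit: served from the cache
print("database reads:", store.db_reads)
```

Repeated reads of popular objects are then served from RAM, and the database only sees the first read after each change.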

62 Comments

  • fpsvash - Thursday, November 03, 2011 - link

    In the middle of the paragraph below the image caption, the sentence reads "...and offers better slightly better performance..."

    Other than that, nice post!
  • InternetGeek - Thursday, November 03, 2011 - link

    It's interesting that not many players have taken a look at Open Compute.
  • alent1234 - Thursday, November 03, 2011 - link

    it's a solution for a specific workload. there are still a lot of workloads that require the traditional model of big database servers

    unlike your bank, facebook's noSQL is not ACID
  • FunBunny2 - Saturday, November 05, 2011 - link

    Well, yes, a voice of reason. OTOH, the Facebook et al. folks are convinced that their back-to-the-COBOL-era approach is the future. As if a toy application, albeit pervasive, is "innovation".
  • Sivar - Saturday, November 05, 2011 - link

    It's a little difficult to look at a comment about Facebook being a toy application and take it seriously. Yes, Facebook is not directly processing bank transactions on a Tandem, but their site is used to conduct business -- and is even the basis for many businesses, all over the world.

    Zynga, the company that makes a few annoying games for Facebook, is worth $15 billion -- more than Electronic Arts.

    Nearly every major online publisher, including Anandtech, uses their API for content distribution and often as the entire forum system for discussion of publications.

    The founder is the youngest billionaire in history.

    Calling theirs a toy application sounds like a Blockbuster customer calling Redbox a toy. It's denial of an obviously successful, large, powerful, innovative company because they don't do things "the old way."

    I suspect what matters more is that the business is executing flawlessly, the actual problems with data loss or other traditional non-ACID-compliance issues are minimal, and that they are making enough money that Google and Microsoft feel seriously threatened.

    One last thing -- if you really look into what ACID compliance means (and I know you did not specifically mention the acronym, but replied to someone that did) none of the current major DBMSs are truly ACID compliant. It's too slow. Not Oracle. Not MSSQL. Not Greenplum. Not Teradata. None of them. They may be closer than NoSQL or the like, but then it's all about the right tool for the job, right?
  • Ceencee - Wednesday, November 09, 2011 - link

    This is true but ACID can be over-rated for many workloads. How many pieces of data HAVE to be consistent across the entire cluster to be valid? What about NoSQL with configurable consistency like Cassandra?

    NoSQL databases provide the holy grail of system growth which is horizontal scaling and this is no small thing for anyone who has worked with a very large RDBMS like ORACLE and implemented RAC to find it doesn't scale all that linearly for most workloads.
  • ac2 - Thursday, November 03, 2011 - link

    Wouldn't the presence of the graphics on the HP server account for the 32W idle load savings?
  • JohanAnandtech - Thursday, November 03, 2011 - link

    It is an ATI ES 1000, which is a server/thin client chip. That chip is 2D only. I cannot find the power specs, but considering that the chip does not even need a heatsink, I think it consumes maybe 1W at idle.
  • mczak - Thursday, November 03, 2011 - link

    ES 1000 is almost the same as radeon 7000/ve (no that's not HD 7000...) (some time in the past you could even force 3d in linux with the open-source driver though it usually did not work). The chip also has dedicated ram chip (though only 16bit wide memory interface) and I'm not sure how good the powersaving methods of it are (probably not downclocking but supporting clock gating) - not sure if it fits into 1W at idle (but certainly shouldn't be much more).
  • JohanAnandtech - Thursday, November 03, 2011 - link

    I cannot find any good tech resources on the chip, but I can imagine that AMD/ATI have shrunk the chip since its appearance in 2005. If not, and the chip does consume quite a bit, it is a bit disappointing that server vendors still use it, as the video chip is used very rarely. You don't need a video chip for RDP, for example.
