How does Facebook store all that information?
Hey guys!
I'm quite interested in scalability and was wondering: how does Facebook manage such a huge database?! Surely it's spread over multiple databases, so how does it know which page/user is stored in which database(s)/server(s)? Might seem like a silly question, but I've always wondered how they do it and what the most efficient way would be. :-) |
There are things called relational databases.
So, for example, they might have a table called members, with a field (column) that holds an auto-incrementing number, plus your first name, last name, e-mail and so on. From then on, you are referred to by that number. Then they have a table called status updates, for example; its fields are probably another auto-incrementing number, a field that refers to the number in the members table (so they know who posted it), and the text of your status update. To go further, they probably have a table called status update comments, which has another auto-number, a field referencing the auto-number of the status update it belongs to, a field referencing the members table to know who posted the comment, and then the comment text itself. Also, some database products cap how big a single database or table can get (a few GB in some older or free editions), so the data may be spread across many databases. A search yields steps to get around the limit, if needed... Hope this helps. |
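To make the table layout above concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table and column names (`members`, `status_updates`, `status_update_comments`, `member_id`, and so on) are assumptions taken from the description in the post, not Facebook's actual schema.

```python
import sqlite3

# In-memory database, just for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces references when asked

# Hypothetical schema mirroring the three tables described in the post.
conn.executescript("""
CREATE TABLE members (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    first_name TEXT NOT NULL,
    last_name  TEXT NOT NULL,
    email      TEXT NOT NULL UNIQUE
);
CREATE TABLE status_updates (
    id        INTEGER PRIMARY KEY AUTOINCREMENT,
    member_id INTEGER NOT NULL REFERENCES members(id),
    body      TEXT NOT NULL
);
CREATE TABLE status_update_comments (
    id               INTEGER PRIMARY KEY AUTOINCREMENT,
    status_update_id INTEGER NOT NULL REFERENCES status_updates(id),
    member_id        INTEGER NOT NULL REFERENCES members(id),
    body             TEXT NOT NULL
);
""")

def add_member(first, last, email):
    """Insert a member and return the auto-generated number."""
    cur = conn.execute(
        "INSERT INTO members (first_name, last_name, email) VALUES (?, ?, ?)",
        (first, last, email),
    )
    return cur.lastrowid

alice = add_member("Alice", "Example", "alice@example.com")
bob = add_member("Bob", "Example", "bob@example.com")

# Alice posts a status update; Bob comments on it.
status_id = conn.execute(
    "INSERT INTO status_updates (member_id, body) VALUES (?, ?)",
    (alice, "Hello, world"),
).lastrowid
conn.execute(
    "INSERT INTO status_update_comments (status_update_id, member_id, body) "
    "VALUES (?, ?, ?)",
    (status_id, bob, "Nice!"),
)

# Follow the reference numbers back to find out who commented.
rows = conn.execute(
    """
    SELECT m.first_name, c.body
    FROM status_update_comments AS c
    JOIN members AS m ON m.id = c.member_id
    WHERE c.status_update_id = ?
    ORDER BY c.id
    """,
    (status_id,),
).fetchall()
print(rows)
```

The join at the end is the "you are referred to by that number" part: the comment row only stores Bob's number, and the members table turns it back into a name.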
I think what he wanted to know was how they know which database a certain member's wall posts are stored in (for example).
Good information, though. Even though I knew all of what you wrote, I'm still interested in the question I asked above. |
Short answer: Very carefully.
Long answer: They have networks of carefully crafted databases on many, many servers. No one server could ever hope to serve that much traffic. They have experts doing this; Facebook as it is could only have been built by advanced programmers. Like most non-trivial systems, they would use relational databases. But that is just the start of it. How you relate the keys is important, and more complications come up when you have hundreds of servers trying to act as one database from the outside. I've personally never worked with databases that took up more than a single server, so I don't know too much about how they do it. Books have been written on it; I would suggest searching amazon.com for some. |
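For the "which server holds which member" part of the question, one common general technique is to derive the server from the member's number. The sketch below is only an illustration of that idea, not a description of what Facebook actually does; the host names and the `shard_index` / `shard_host` helpers are made up.

```python
# Hypothetical list of database servers ("shards"); the host names are invented.
SHARDS = [
    "db-shard-0.example.internal",
    "db-shard-1.example.internal",
    "db-shard-2.example.internal",
    "db-shard-3.example.internal",
]

def shard_index(member_id: int) -> int:
    """Map a member number to a shard position using simple modulo."""
    return member_id % len(SHARDS)

def shard_host(member_id: int) -> str:
    """Return the server assumed to hold all rows for this member."""
    return SHARDS[shard_index(member_id)]

# Every lookup for member 42 goes to the same server, with no central lookup.
print(shard_host(42))  # db-shard-2.example.internal
```

Plain modulo is easy to reason about, but adding a server changes where most members map to. Keeping a small lookup table of member number to shard, or using consistent hashing, are the usual ways around that.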
Ah ok, Tanax. I would have to agree with VI. I know that with Access you can split the database into front and back ends. A system this big seems to call for that. I don't know how to do that with a MySQL db.
|
I've heard that they have used clusters of hundreds of servers that keep a large chunk of the most recent data in RAM. That would be a huge accomplishment, but I don't think it's feasible from a business standpoint.
I would say that they separate all of the data into small clusters and sub-clusters, which could then be organized by date, predicted/actual traffic, type of data, etc. Each sub-cluster would then have a reasonable amount of data to sift through (maybe only a few million rows), which would speed up response and retrieval. Again, I don't know, I'm just guessing. |
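The "keep the most recent data in RAM" idea can be sketched as a small cache that sits in front of the database. This is a toy, single-process illustration under my own assumptions: `RecentCache`, `fake_db` and the `db_reads` counter are hypothetical names, and a real deployment would use dedicated cache servers rather than a Python dict.

```python
from collections import OrderedDict

class RecentCache:
    """Keep the most recently used items in memory; fall back to the database."""

    def __init__(self, capacity, load_from_db):
        self.capacity = capacity          # max number of items held in RAM
        self.load_from_db = load_from_db  # function called on a cache miss
        self._items = OrderedDict()       # oldest entry first, newest last
        self.db_reads = 0                 # how often we had to hit the database

    def get(self, key):
        if key in self._items:
            # Cache hit: mark as most recently used and skip the database.
            self._items.move_to_end(key)
            return self._items[key]
        # Cache miss: read from the database and remember the result.
        self.db_reads += 1
        value = self.load_from_db(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            # Evict the least recently used entry to stay within capacity.
            self._items.popitem(last=False)
        return value

# Stand-in for a slow database table of status updates.
fake_db = {1: "post one", 2: "post two", 3: "post three"}
cache = RecentCache(capacity=2, load_from_db=fake_db.__getitem__)

cache.get(1)  # miss, loaded from the database
cache.get(1)  # hit, served from RAM
cache.get(2)  # miss
cache.get(3)  # miss, evicts key 1 (least recently used)
cache.get(1)  # miss again, because key 1 was evicted
print(cache.db_reads)  # 4
```

Only the items people keep asking for stay in memory, so the database only sees the misses.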