Google Analytics and Bigtable

Scary, on many levels

Google Analytics and Bigtable

Another tidbit I found curious in the Google Bigtable paper was the massive size of the Google Analytics data set stored in Bigtable.

The paper says that 250 terabytes of Google Analytics data are stored in Bigtable. That’s more than all the images for Google Earth (71T). It is the second largest data set in Bigtable, behind only the 850T of the Google crawl.

Why is it so big? The way I had assumed Google Analytics worked is that it maintained only the summary data for each website. That would be a very small amount of data, nowhere near 250T.

Instead, it appears Google Analytics keeps all the information about user behavior on all sites using Google Analytics permanently, online, and available for various analyses. That would explain 250T of data.

(Via Geeking with Greg.)

Advertisements

No comments yet

Leave a Reply

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s