Monday, May 7, 2012

Caffeine: a new web indexing system from Google



Today we are happy to announce the completion of our new web indexing system, Caffeine. Caffeine at 50% ... Whatever you are looking for, whether news, blog posts, or forum threads, the pages you need now enter our index faster, so you can find them sooner.

A few words for those who are not familiar with search technology. When you look for information on Google, you're not working with ... The search is performed on an index of the web that Google has built. This index is much like the index at the back of a book, which helps you find the information you need.
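The book-index analogy can be sketched in a few lines of Python. This is only a minimal illustration, not Google's implementation: the sample pages, the `build_index` and `search` names, and the whitespace tokenizer are all assumptions made for the example.

```python
from collections import defaultdict


def build_index(pages):
    """Map each word to the set of page ids that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in text.lower().split():
            index[word].add(page_id)
    return index


def search(index, query):
    """Return the sorted ids of pages containing every word of the query."""
    words = query.lower().split()
    if not words:
        return []
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return sorted(result)


# Hypothetical sample pages, invented for illustration.
pages = {
    "page1": "fresh news about search",
    "page2": "forum thread about coffee",
    "page3": "blog post about search and coffee",
}
index = build_index(pages)
print(search(index, "about search"))  # ['page1', 'page3']
```

The lookup never reads the page texts themselves, only the word-to-pages map, which is the same reason a book index is faster than leafing through every page.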

Why did we need to create a new indexing system? First, the amount of information on the web keeps growing, and the formats it comes in are becoming more diverse. On a modern site you will find videos, images, news, and real-time updates. Pages are richer and more complex in structure. Second, people expect much more from search. Users want the latest information on a topic, and publishers want their audience to find material as soon as it is published.

To keep pace with the rapid development of the web and meet the growing expectations of users, we created Caffeine. The picture shows how the old indexing system worked and how the new one works.



Our old index had several layers, some of which were updated more often than others. Most of the index was updated every two weeks. To update a layer of the index, we had to analyze the entire web, which created a delay between the moment we found a page and the moment it became available to users.
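To make that delay concrete, here is a rough sketch of how long a newly found page might wait under a periodic full refresh. It assumes a single layer refreshed exactly every 14 days (the two-week figure above), with each refresh taking effect instantly; the function name and the day numbering are hypothetical simplifications, not a description of the real system.

```python
def days_until_searchable(found_day, refresh_period_days=14):
    """Days a page found on `found_day` waits for the next full refresh.

    Assumes refreshes land on days 0, 14, 28, ... and that a page found
    exactly on a refresh day is picked up by that same refresh.
    """
    if found_day < 0 or refresh_period_days <= 0:
        raise ValueError("found_day must be >= 0 and period must be > 0")
    # Distance from found_day up to the next multiple of the period.
    return (-found_day) % refresh_period_days


for day in (0, 1, 13, 15):
    print(day, days_until_searchable(day))
```

Under these assumptions a page found just after a refresh waits almost the full two weeks before it can appear in results.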

Caffeine, by contrast, lets us analyze information on the Internet, ... This means that search results will show the most recent information, regardless of when or where it was published.

Every second, the system processes hundreds of thousands of pages. If these were pages of paper, the stack would grow by about 5 kilometers every second. The Caffeine database holds about 100 million gigabytes of data, and new information is added at a rate of several hundred thousand gigabytes per day. You would need 625,000 of the largest-capacity iPods to store that much information.
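As a quick sanity check on these figures, dividing the total storage by the number of iPods gives the capacity each one would need. The variable names are made up for this sketch; the inputs are the rounded numbers quoted above.

```python
total_gigabytes = 100_000_000  # "about 100 million gigabytes"
ipod_count = 625_000           # number of iPods quoted above

gigabytes_per_ipod = total_gigabytes / ipod_count
print(gigabytes_per_ipod)  # 160.0
```

That works out to 160 GB per device.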

When we built Caffeine, we focused on the future of the Internet. The system is an excellent foundation for building an even faster and larger index, since it can better adapt to the growth of the web. More improvements are coming this summer. Stay tuned!
