Are you curious about how Web search engines return a list of URLs after a user enters just a few keywords? This article gives an overview of the core engine that makes this possible.
The authors start by discussing the challenges posed by the Internet and the architecture of Web search engines. The major challenges (the size, rapid change, lack of coherence, and interlinked nature of the Internet) frame the rest of the article. Chapter Two discusses the discovery of information on the Internet, with a detailed explanation of the challenges that crawler models face with respect to page selection and page refresh. Chapter Three introduces storage and the distributed Web repository. In Chapters Four and Five, the authors present the most popular indexing architectures used in Web search services, followed by ranking and link analysis.
The article covers in detail the fundamental components of Web search engines and their most common designs and implementations. For each component discussed, a concluding section summarizes the concepts and flags challenges to be addressed in the future. Theoretical analysis and arguments are supported by the authors’ own experiments, for which data and statistics are provided.
The article is most suitable for readers who are interested in the design of Web search engines. Readers who want to submit their Web pages to search engines can also benefit from it, since it may help them improve the match rate and ranking of their pages. For readers who simply search the Internet for information, the article is a good way to understand the technology involved, though it does not contain direct guidelines for the most popular Web search engines on the market.