Syntactic Classification based Web Page Ranking Algorithm

Existing search engines sometimes return unsatisfactory results because the results are not categorized. If the user’s preference about the search results were known and pages were ranked according to that preference, the results would be more useful and accurate for the user. This paper proposes a web page ranking algorithm based on syntactic classification of web pages. Syntactic classification does not consider the meaning of a page's content. The proposed approach consists of three main steps: select some properties of web pages based on the user’s demand, measure them, and give each property a different weight during ranking for different types of pages. The existence of syntactic classes is supported by running the fuzzy c-means algorithm and neural network classification on a set of web pages. The change in ranking for different types of pages under the same query string is also demonstrated.

The World Wide Web is an architectural framework for accessing linked documents spread out over millions of machines all over the Internet. It began in 1989 at CERN, the European center for nuclear research. At that time, FTP data transfers accounted for approximately one third of Internet traffic, more than any other application. After its introduction, however, the Web grew at a much higher rate. By 1995, Web traffic had overtaken FTP data transfer, and by 2000 it overshadowed all other applications. The popularity of the WWW depends largely on search engines, which are the gateways to the huge information repository on the Internet. With search engines, anyone can now quickly search for helpful cleaning tips, music lyrics, recipes, pictures, celebrity websites and more.
Search engines consist of four discrete software components: the crawler or spider, a robotic browser-like program that downloads web pages; the indexer, a blender-like program that dissects the web pages downloaded by spiders; the database, a warehouse of the pages downloaded and processed; and the search engine results engine, which digs search results out of the database.
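
The three-step approach outlined in the abstract (select properties, measure them, weight them differently per page type) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the property names (link_density, text_length, image_count), the page types ("index", "article"), the weight values, and the use of a simple weighted sum as the scoring function. Property values are assumed to be already measured and normalized to the range [0, 1].

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    # Steps 1 and 2: selected syntactic properties and their measured values.
    # Names and values are hypothetical, normalized to [0, 1].
    properties: dict[str, float]


# Step 3: a different weight for each property, per page type.
# Page types and weights are hypothetical placeholders.
WEIGHTS = {
    "index": {"link_density": 0.7, "text_length": 0.1, "image_count": 0.2},
    "article": {"link_density": 0.1, "text_length": 0.8, "image_count": 0.1},
}


def score(page: Page, page_type: str) -> float:
    """Weighted sum of the page's properties for the requested page type."""
    weights = WEIGHTS[page_type]
    return sum(w * page.properties.get(name, 0.0) for name, w in weights.items())


def rank(pages: list[Page], page_type: str) -> list[Page]:
    """Order pages by descending score for the requested page type."""
    return sorted(pages, key=lambda p: score(p, page_type), reverse=True)


# Two pages assumed to match the same query string.
pages = [
    Page("a.example/links", {"link_density": 0.9, "text_length": 0.2, "image_count": 0.1}),
    Page("b.example/essay", {"link_density": 0.1, "text_length": 0.9, "image_count": 0.2}),
]

for page_type in ("index", "article"):
    print(page_type, [p.url for p in rank(pages, page_type)])
```

With these assumed weights, the link-heavy page ranks first when the user asks for index-type pages and the text-heavy page ranks first for article-type pages, which mirrors the change in ranking for the same query string described above.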
