News Articles' Analysis with Metasearch Engines

Query Processing in Database Systems - CS 580

A news metasearch engine aggregates results from several news sources and presents them in one view. Ideally, a news source reports events and information without passing judgment or imposing the writer's views on the reader. In practice, journalists are human beings, so every news source carries some bias. When a news search engine pulls information from liberal sources, the retrieved articles will seem one-sided. Similarly, if a user queries a conservative search engine, the retrieved articles will seem conventional and resistant to change.

Purpose

For any given query, we wish to find the critical differences between liberal and conservative coverage. By repeating this exercise for many queries, we will be able to identify the liberal and conservative standpoints on various issues, and thereby the issues that differentiate liberals from conservatives.

Approach

a. Identify a set of liberal news search engines and a set of conservative news search engines, say 10 to 15 of each.

b. Build two metasearch engines: L, which connects to the liberal search engines, and C, which connects to the conservative search engines.

c. For any submitted query, retrieve a set of links to articles from C and L.

d. Retrieve the top x articles from C and the top x articles from L.

e. From these 2x articles, identify the top z words that are prevalent in L but not in C, and vice versa.

f. Retrieve the sentences containing these words and separate them into two columns, one for L and one for C.
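
Steps e and f can be sketched in Python as shown below. This is only an illustration under stated assumptions, not the project's implementation: the article texts are assumed to have been fetched already, "prevalent in L but not C" is read as "appears in the most L articles and in no C article", and the stop-word list and all function names (tokenize, distinctive_words, sentences_containing, compare) are hypothetical.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real run would need a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "on", "is",
              "are", "for", "that", "this", "with", "as", "by", "it", "be"}


def tokenize(text):
    """Lowercase the text and return its alphabetic words, minus stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def document_frequency(articles):
    """Count, for each word, how many articles contain it at least once."""
    df = Counter()
    for article in articles:
        df.update(set(tokenize(article)))
    return df


def distinctive_words(own_articles, other_articles, z):
    """Step e (assumed reading): top z words found in own_articles but in no other_articles."""
    own_df = document_frequency(own_articles)
    other_df = document_frequency(other_articles)
    candidates = [(w, n) for w, n in own_df.items() if w not in other_df]
    # Most widespread words first; ties broken alphabetically for a stable order.
    candidates.sort(key=lambda item: (-item[1], item[0]))
    return [w for w, _ in candidates[:z]]


def sentences_containing(articles, words):
    """Step f: collect the sentences that mention any of the given words."""
    wanted = set(words)
    hits = []
    for article in articles:
        # Naive sentence split on ., ! or ? followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", article.strip()):
            if wanted & set(tokenize(sentence)):
                hits.append(sentence)
    return hits


def compare(liberal_articles, conservative_articles, z):
    """Return the two columns of sentences, keyed by metasearch engine name."""
    l_words = distinctive_words(liberal_articles, conservative_articles, z)
    c_words = distinctive_words(conservative_articles, liberal_articles, z)
    return {
        "L": sentences_containing(liberal_articles, l_words),
        "C": sentences_containing(conservative_articles, c_words),
    }
```

Calling compare(liberal_articles, conservative_articles, z) with two lists of article strings returns the two columns as lists of sentences. Ranking by the number of articles a word appears in, rather than by raw word count, keeps one long article from dominating the list; other weightings, such as TF-IDF, would be equally reasonable choices here.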

Report

Click here for the detailed report.