Computer Science 221: Information Retrieval

Assignment 03

Fall 2008

Department of Informatics

Donald Bren School of Information and Computer Sciences

University of California, Irvine


Due 10/31/2008

  1. Java Program (100%)
    1. This assignment is to be done on your own.
    2. Set up an environment for the system developed in Assignment 02.
    3. After 10am on 10/31, log into EEE and view the "quiz" that has the parameters for the following:
      1. Using your system
        1. Crawl a set of web pages
          1. Starting from the seed set that is provided in the quiz.
          2. Using the regular expression filter that we give you in the quiz.
        2. Find the longest palindrome in those pages.
        3. Find the longest rhopalic in those pages (using a new letter).
        4. Find the longest lipogram in those pages.
      2. Build a webgraph from the crawled pages.
      3. Calculate the shortest path between two pages provided in the quiz.
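
The last two steps (building a webgraph and finding the shortest path between two pages) can be sketched as follows. This is only a minimal illustration under stated assumptions, not a required design: it assumes links are directed, that every link counts as one hop (so breadth-first search is enough), and that pages are identified by their URL strings. The class name `WebGraphSketch` and the methods `addLink` and `shortestPath` are hypothetical names chosen for this example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: names here are illustrative, not part of the assignment.
class WebGraphSketch {
    // Adjacency sets: page URL -> URLs that the page links to.
    private final Map<String, Set<String>> outLinks = new HashMap<>();

    // Record a directed edge from a crawled page to a page it links to.
    // Both endpoints become nodes, even if the target has no outgoing links.
    void addLink(String from, String to) {
        outLinks.computeIfAbsent(from, k -> new LinkedHashSet<>()).add(to);
        outLinks.computeIfAbsent(to, k -> new LinkedHashSet<>());
    }

    // Breadth-first search. Returns the pages on a path with the fewest links
    // from source to target, or an empty list if target cannot be reached.
    List<String> shortestPath(String source, String target) {
        if (!outLinks.containsKey(source) || !outLinks.containsKey(target)) {
            return Collections.emptyList();
        }
        // parent maps each discovered page to the page it was reached from.
        Map<String, String> parent = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        parent.put(source, null);
        queue.add(source);
        while (!queue.isEmpty()) {
            String page = queue.remove();
            if (page.equals(target)) {
                // Walk the parent links back to the source, then reverse.
                List<String> path = new ArrayList<>();
                for (String p = target; p != null; p = parent.get(p)) {
                    path.add(p);
                }
                Collections.reverse(path);
                return path;
            }
            for (String next : outLinks.get(page)) {
                if (!parent.containsKey(next)) {
                    parent.put(next, page);
                    queue.add(next);
                }
            }
        }
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        // Placeholder page names; a real run would use crawled URLs.
        WebGraphSketch graph = new WebGraphSketch();
        graph.addLink("a", "b");
        graph.addLink("a", "c");
        graph.addLink("b", "d");
        graph.addLink("c", "d");
        graph.addLink("d", "e");
        System.out.println(graph.shortestPath("a", "e"));
    }
}
```

In this sketch, the crawler would call `addLink` once for each link it extracts from a page. If several paths tie for the fewest links, the one returned depends on the order in which links were added.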