A web crawler application created for COM S 311 (Analysis and Design of Algorithms). It uses breadth-first search to build a graph of web pages: starting from a seed URL, it downloads each page with jsoup and adds an edge to every page that page links to. The graph-generation method takes two parameters, "maxPages" and "maxDepth", that limit the size of the graph. An inverted index containing the URLs, their content, and their indegrees is then built from the web graph and used to answer search queries over the collected pages efficiently.
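
The crawl described above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the class name `CrawlSketch`, the methods `buildGraph` and `indegrees`, and the `fetchLinks` function are hypothetical names chosen for this example. `fetchLinks` stands in for the jsoup download step so that the sketch runs without network access. The sketch also assumes that links to pages outside the "maxPages" budget are dropped and that pages at "maxDepth" are not expanded; the real project may handle both limits differently.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.function.Function;

class CrawlSketch {

    /**
     * Breadth-first crawl from a seed URL.
     * fetchLinks is a hypothetical stand-in for "download the page and extract its links".
     * Returns an adjacency list whose keys are in discovery order.
     */
    static Map<String, List<String>> buildGraph(String seed, int maxPages, int maxDepth,
                                                Function<String, List<String>> fetchLinks) {
        Map<String, List<String>> graph = new LinkedHashMap<>();
        Map<String, Integer> depth = new HashMap<>();
        Queue<String> queue = new ArrayDeque<>();
        if (maxPages < 1) {
            return graph;
        }
        graph.put(seed, new ArrayList<>());
        depth.put(seed, 0);
        queue.add(seed);

        while (!queue.isEmpty()) {
            String page = queue.remove();
            int d = depth.get(page);
            if (d >= maxDepth) {
                continue; // assumption: pages at the depth limit are not expanded
            }
            for (String target : fetchLinks.apply(page)) {
                if (!graph.containsKey(target)) {
                    if (graph.size() >= maxPages) {
                        continue; // assumption: links to pages beyond the budget are dropped
                    }
                    graph.put(target, new ArrayList<>());
                    depth.put(target, d + 1);
                    queue.add(target);
                }
                List<String> edges = graph.get(page);
                if (!edges.contains(target)) {
                    edges.add(target); // record each edge once
                }
            }
        }
        return graph;
    }

    /** Counts incoming edges for every page in the graph. */
    static Map<String, Integer> indegrees(Map<String, List<String>> graph) {
        Map<String, Integer> in = new LinkedHashMap<>();
        for (String page : graph.keySet()) {
            in.put(page, 0);
        }
        for (List<String> targets : graph.values()) {
            for (String target : targets) {
                in.merge(target, 1, Integer::sum);
            }
        }
        return in;
    }

    public static void main(String[] args) {
        // A tiny in-memory "web" used in place of real downloads.
        Map<String, List<String>> web = new HashMap<>();
        web.put("a", List.of("b", "c"));
        web.put("b", List.of("c", "d"));
        web.put("c", List.of("a"));
        web.put("d", List.of("e"));
        Function<String, List<String>> fetch = url -> web.getOrDefault(url, List.of());

        Map<String, List<String>> graph = buildGraph("a", 4, 2, fetch);
        System.out.println(graph);            // {a=[b, c], b=[c, d], c=[a], d=[]}
        System.out.println(indegrees(graph)); // {a=1, b=1, c=2, d=1}
    }
}
```

In a real crawler, `fetchLinks` could be backed by jsoup, for example by calling `Jsoup.connect(url).get()`, selecting `a[href]` elements, and reading each element's `absUrl("href")`. The indegree counts computed here are the kind of per-page value the inverted index stores alongside each URL.
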
Project link: https://git.ece.iastate.edu/ztj1/google-2