Efficient Web Archive Searching

ORGANIZATION

Multimedia/Hypertext Group 9, Virginia Tech

EXHIBITORS


Ming Cheng, College of Engineering, Computer Science
Yijing Wu, College of Engineering, Computer Science Multimedia/Hypertext Group 9
Xiaolin Zhou, College of Engineering, Computer Science Multimedia/Hypertext Group 9
Lin Zhang, College of Engineering, Computer Science Multimedia/Hypertext Group 9
Jinyang Li, College of Engineering, Computer Science Multimedia/Hypertext Group 9

WHAT HAPPENS HERE?

This project aims to find a method for converting URLs locally into a sortable, shortened format in order to make web archive access more efficient. Visitors can compare the efficiency of the new algorithms.
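To make "sortable and shortened" concrete, here is a minimal Python sketch of one possible conversion. It is not the project's algorithm. It assumes a SURT-like layout (host labels reversed, then path and query) so that plain string sorting groups pages from the same site, and it uses a truncated BLAKE2b digest as a stand-in shortening step. The function names `sortable_key` and `short_key`, the `www.` stripping, and the 8-byte digest size are all illustrative assumptions.

```python
import hashlib
from urllib.parse import urlsplit


def sortable_key(url: str) -> str:
    """Build a SURT-like key: reversed host labels, then path and query.

    Assumes the URL carries a scheme (e.g. "http://"); without one,
    urlsplit cannot identify the host.
    """
    parts = urlsplit(url.strip())
    host = (parts.hostname or "").lower()
    # Assumption: a leading "www." is treated as insignificant.
    if host.startswith("www."):
        host = host[4:]
    reversed_host = ",".join(reversed(host.split("."))) if host else ""
    path = parts.path or "/"
    key = f"{reversed_host}){path}"
    if parts.query:
        key += "?" + parts.query
    return key


def short_key(url: str, digest_bytes: int = 8) -> str:
    """Hypothetical shortening step: fixed-width hex digest of the sortable key."""
    data = sortable_key(url).encode("utf-8")
    return hashlib.blake2b(data, digest_size=digest_bytes).hexdigest()


if __name__ == "__main__":
    urls = [
        "http://www.example.com/a/b?x=1",
        "https://blog.example.com/post",
        "http://archive.org/web/",
    ]
    for key in sorted(sortable_key(u) for u in urls):
        print(key)
```

The two keys trade off differently: the reversed-host key keeps URLs from one domain adjacent after sorting but can be long, while the digest has a fixed width but no longer places related URLs next to each other.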

WHAT WAS THE PROCESS?

First, we need to understand the composition of a URL and decide which of its parts should be used. Next, we search for suitable algorithms and compare their expected efficiency. We then implement the selected algorithms and measure their actual efficiency. Finally, we pick the best-performing algorithm as the research result. The required technologies are PyArrow, Python, and Parquet.
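The "implement and test" step can be sketched as a small comparison harness. The Python example below uses only the standard library and does not touch PyArrow or Parquet. The two candidate functions, the sample URLs, and the metrics (wall-clock time and mean key length) are assumptions chosen for illustration, not the algorithms or measurements used in the project.

```python
import hashlib
import time
from urllib.parse import urlsplit


def reversed_host_key(url: str) -> str:
    """Candidate 1 (illustrative): reversed host labels plus path."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    return ",".join(reversed(host.split("."))) + ")" + (parts.path or "/")


def digest_key(url: str) -> str:
    """Candidate 2 (illustrative): first 16 hex characters of a SHA-256 digest."""
    return hashlib.sha256(url.encode("utf-8")).hexdigest()[:16]


CANDIDATES = {"reversed_host": reversed_host_key, "digest": digest_key}

# Hypothetical sample; a real evaluation would read URLs from the archive index.
SAMPLE = [
    "http://example.com/a/b",
    "https://blog.example.com/2020/01/post.html",
    "http://archive.org/web/",
]


def evaluate(candidates, urls, repeats=1000):
    """Time each candidate over the sample and record its mean key length."""
    results = {}
    for name, fn in candidates.items():
        keys = [fn(u) for u in urls]
        start = time.perf_counter()
        for _ in range(repeats):
            for u in urls:
                fn(u)
        elapsed = time.perf_counter() - start
        results[name] = {
            "seconds": elapsed,
            "mean_key_len": sum(len(k) for k in keys) / len(keys),
        }
    return results


if __name__ == "__main__":
    for name, stats in evaluate(CANDIDATES, SAMPLE).items():
        print(f"{name}: {stats['seconds']:.4f} s, mean length {stats['mean_key_len']:.1f}")
```

Timings depend on the machine and sample, so results from a harness like this are only meaningful as a relative comparison on the same data.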

PROJECT

* EWAS_Final_Presentation.pdf
* cs4624_ewas.pdf