Computers handle far more work than we realize. Take, for example, a coworker sending a zip file by email. To read the documents inside, the recipient has to decompress the file, which consumes memory and processing time. As more and more data gets processed and stored, computers begin to show signs of strain and slow down.
Fahad Saeed, associate professor in the School of Computing & Information Sciences within the College of Engineering & Computing, and Muhammad Haseeb, a computer science doctoral student, recognize this universal challenge and specialize in enhancing the compression and computing of big data.
Recently, the team of computer scientists was awarded a patent titled “Methods and Systems for Compressing Data,” covering techniques that reduce the size of proteomics data in a desktop computer’s memory while ensuring the data sets can be processed without the need for decompression. Proteomics refers to the large-scale study of proteins.
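To illustrate the general idea of answering a query on compressed data without first decompressing it, here is a toy sketch using run-length encoding. This is purely illustrative and is not the patented proteomics method; the data and function names are hypothetical.

```python
# Illustrative only: a toy run-length encoding (RLE), not the patented
# method. It shows the general idea of answering a query (here, counting
# occurrences of a value) directly on compressed data.

def rle_compress(data):
    """Compress a sequence into a list of [value, run_length] pairs."""
    runs = []
    for x in data:
        if runs and runs[-1][0] == x:
            runs[-1][1] += 1  # extend the current run
        else:
            runs.append([x, 1])  # start a new run
    return runs

def count_in_compressed(runs, value):
    """Count occurrences of `value` by scanning runs, no decompression."""
    return sum(length for v, length in runs if v == value)

data = [5, 5, 5, 2, 2, 9, 5, 5]
runs = rle_compress(data)  # [[5, 3], [2, 2], [9, 1], [5, 2]]
print(count_in_compressed(runs, 5))  # prints 5
```

The compressed form here is smaller than the original whenever values repeat, yet the count query never touches the uncompressed sequence.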
To make sense of these data sets, Saeed and Haseeb are using specific techniques to handle big data.
“Since mass spectrometry data, which measures proteomics, is large, the index that holds that data is also becoming very large,” said Saeed. “It’s getting to a point where our everyday laptops and desktops cannot handle the size of the index.”
Indexing allows computer algorithms to identify the particular data sets users are looking for – the location of folders, files, or records. Indexing identifies the location of data based on file names, text
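The indexing idea described above can be sketched with a minimal example: build a mapping from a search key to a record's location once, so later lookups avoid scanning every record. The file names and structure below are hypothetical, chosen only to illustrate the concept.

```python
# Hypothetical sketch of an index: a mapping from a search key
# (here, a file name) to the position of its record, so a lookup
# is a single dictionary access instead of a full scan.

records = [
    {"name": "report.docx", "folder": "/docs"},
    {"name": "data.csv", "folder": "/datasets"},
    {"name": "notes.txt", "folder": "/docs"},
]

# Build the index once: file name -> position in the record list.
index = {rec["name"]: i for i, rec in enumerate(records)}

# Look up a record by name without scanning all records.
print(records[index["data.csv"]]["folder"])  # prints /datasets
```

The trade-off Saeed describes follows directly from this picture: the index itself must be held in memory, so as the underlying data grows, the index can grow beyond what an ordinary laptop or desktop can hold.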