Blogs

Pervasive Parallelism in Data Mining: Co-clustering of Netflix Data

Authored by: Srivatsava Daruru (1), Matt Walker (2), Nena Marín (2), Joydeep Ghosh (1)

Abstract

Cache sucks

I attended a talk on Intel's experimental 80-core processor given by Tim Mattson at SC'08 this week. Dr. Russell Winder captures one aspect of the talk nicely in his blog entry. It was fun to see Dr. Mattson's energy and enthusiasm about some of the concepts brought out by the chip, one of the main ones being the absence of a cache. Another is the small instruction set, which allows a certain freedom when programming the cores (but also imposes many constraints, such as the inability to code inner loops).

 

Data Intensive HPC

Pervasive Software CTO Mike Hoskins on the benefits of building parallelized multicore-powered Data-Intensive HPC

Can you describe your Data-Intensive HPC initiatives?

Come see us at SuperComputing '08

Multicore chips offer the potential to get more done not through faster processing, but by offering more processing on a single chip. It's only a "potential" to get more done, though, because applications must be written to do more than one thing at a time to see an actual performance boost.

Petabytes of data spilling on the floor.

A million seconds is about 12 days.
A billion seconds is about 31.7 years.
A trillion seconds is about 31,688 years.
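
The figures above are easy to check by hand. Here is a minimal Python sketch of the arithmetic; the function names are made up for illustration, and the 365.25-day year is an assumption (it is the year length that reproduces the 31,688 figure).

```python
# Sanity-check the "seconds" comparisons.
# Assumption: one year = 365.25 days (a Julian year).
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365.25


def seconds_to_days(seconds):
    """Convert a number of seconds to days."""
    return seconds / SECONDS_PER_DAY


def seconds_to_years(seconds):
    """Convert a number of seconds to years of 365.25 days."""
    return seconds / (SECONDS_PER_DAY * DAYS_PER_YEAR)


print(f"million:  {seconds_to_days(10**6):.1f} days")
print(f"billion:  {seconds_to_years(10**9):.1f} years")
print(f"trillion: {seconds_to_years(10**12):,.0f} years")
```

With a 365-day year the trillion-second figure comes out slightly higher, so the exact number depends on that choice.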

Last year IDC released the results of a study that found the world generated 161 exabytes of digital data the year before. How much data is that? A lot. It's 161,000 petabytes. It's 161 million terabytes. It's 161 billion gigabytes. All still really big numbers.
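
The conversions are just powers of a thousand. A small Python sketch, assuming decimal (SI) units where each step up is a factor of 1,000; the `UNITS` table and `convert` helper are hypothetical names used only for this illustration.

```python
# Unit sizes in bytes.
# Assumption: decimal (SI) prefixes, not binary ones (1 GB = 10**9 bytes).
UNITS = {
    "gigabyte": 10**9,
    "terabyte": 10**12,
    "petabyte": 10**15,
    "exabyte": 10**18,
}


def convert(value, from_unit, to_unit):
    """Convert a whole number of one unit into another, larger to smaller."""
    return value * UNITS[from_unit] // UNITS[to_unit]


for unit in ("petabyte", "terabyte", "gigabyte"):
    print(f"161 exabytes = {convert(161, 'exabyte', unit):,} {unit}s")
```

Integer arithmetic is used so the results stay exact at these magnitudes.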

Java and HPC

Found this report (*) on James Gosling's blog. The report compares Java with Fortran on HPC benchmarks. Overall, the results look encouraging.