Seminar on "Scalable Platforms for 'Big Data' Applications"
Friday, 10 August 2012, 10:00
by Prof. Yogesh Simmhan

ABSTRACT: The pervasiveness of technology is providing an unprecedented ability to observe the physical, social and cyber worlds. This offers access to massive, complex and diverse datasets, called "Big Data", which can be used to manage and optimize these systems. At the same time, advances in computing are making large-scale, distributed resources on Cloud and HPC platforms more accessible. The successful fusion of this data deluge with computational capability has the potential to drive transformative advances in science and society.

In this talk, Prof. Simmhan will review his research on distributed data and computing systems along two dimensions: (1) Scalable platforms for massive data management and analytics, and (2) Adaptive programming frameworks for dynamic data on Clouds. The talk will discuss architectures for scalable and secure scientific data repositories using distributed databases and Clouds, and their efficient mining using the MapReduce platform. It will also examine the unique challenges of reliably coordinating, ingesting and analyzing continuous data and event streams using adaptive workflow frameworks on clusters and Clouds. Applications from the Pan-STARRS Astronomy Survey and the Los Angeles Smart Power Grid projects will serve as exemplars. The talk will close with some thoughts on data-driven science and engineering, and on the role of Clouds and future computing platforms.

BIO: Prof. Yogesh Simmhan is a Research Assistant Professor in the Electrical Engineering Department and Associate Director of the Center for Energy Informatics at the University of Southern California (USC). His research is on distributed data and computing systems, spanning Cloud Computing, HPC, distributed data and metadata management, and scalable software architectures for eScience and eEngineering applications. His interests include workflow and dataflow programming abstractions, robust execution frameworks, and provenance and semantic information management.
Simmhan's current research is on distributed and adaptive software infrastructure for the energy informatics domain, and he serves as a project manager in DOE's Los Angeles Smart Grid Project. Simmhan has a Ph.D. in Computer Science from Indiana University and was a postdoc at Microsoft Research. He is a member of IEEE and ACM.

Location: Seminar hall, C-MMACS Network building


