May 5, Tuesday

Live Data Moves
Speaker: Ian Foster, Argonne National Laboratory & The University of Chicago

Abstract: Data has to move to be truly alive: sitting untouched in a storage system, it only decays. To be useful, data must be in constant motion, from source to destination, disk to memory, and memory to processor, and then on to other destinations. Only thus can data be transformed from raw observations to more refined inferences, and ultimately, perhaps, to knowledge. The entire research enterprise is therefore a vast distributed system. Who better than those working in parallel and distributed computing to understand and improve its functioning? I discuss these perspectives and suggest some research challenges for us to consider.

Short biography: Ian Foster is Director of the Computation Institute, a joint institute of the University of Chicago and Argonne National Laboratory. He is also an Argonne Senior Scientist and Distinguished Fellow and the Arthur Holly Compton Distinguished Service Professor of Computer Science. Ian received a BSc (Hons I) degree from the University of Canterbury, New Zealand, and a PhD from Imperial College, United Kingdom, both in computer science. His research deals with distributed, parallel, and data-intensive computing technologies, and innovative applications of those technologies to scientific problems in such domains as climate change and biomedicine. Methods and software developed under his leadership underpin many large national and international cyberinfrastructures. Dr. Foster is a fellow of the American Association for the Advancement of Science, the Association for Computing Machinery, and the British Computer Society. His awards include the Global Information Infrastructure (GII) Next Generation award, the British Computer Society's Lovelace Medal, R&D Magazine's Innovator of the Year, and an honorary doctorate from the University of Canterbury, New Zealand. He was a co-founder of Univa UD, Inc., a company established to deliver grid and cloud computing solutions.

May 6, Wednesday

Challenges of Big Data in Scientific Discovery
Speaker: Benjamin Wah, The Chinese University of Hong Kong

Abstract: Big Data has emerged as one of the hottest multi-disciplinary research fields in recent years. Big data innovations are transforming science, engineering, medicine, healthcare, finance, business, and ultimately society itself. In this presentation, we examine the key properties of big data (volume, velocity, variety, veracity, and value) and their relation to some applications in science and engineering. To truly handle big data, new paradigm shifts (as advocated by the late Dr. Jim Gray) will be necessary. Successful applications of big data will require in situ methods for automatically extracting new knowledge from big data, without requiring the data to be centrally collected and maintained. Traditional theory on algorithmic complexity may no longer hold, since the scale of the data may be too large for it to be stored or accessed in full. To realize the potential of big data in scientific discovery, challenges in data complexity, computational complexity, and system complexity will need to be solved. In particular, cloud computing will be a platform for supporting big data applications. We illustrate these challenges by drawing on examples from various applications in science and engineering.

Short biography: Benjamin W. Wah is currently the Provost and Wei Lun Professor of Computer Science and Engineering of the Chinese University of Hong Kong. He also serves as the Chair of the Research Grants Council of Hong Kong. Before that, he served as the Director of the Advanced Digital Sciences Center in Singapore, as well as the Franklin W. Woeltge Endowed Professor of Electrical and Computer Engineering and Professor in the Coordinated Science Laboratory at the University of Illinois at Urbana-Champaign, IL. He received his Ph.D. degree in computer science from the University of California, Berkeley, CA, in 1979. He has received a number of awards for his research contributions, including the IEEE CS Technical Achievement Award (1998), the IEEE Millennium Medal (2000), the IEEE-CS W. Wallace-McDowell Award (2006), the Pan Wen-Yuan Outstanding Research Award (2006), the IEEE-CS Richard E. Merwin Award (2007), the IEEE-CS Tsutomu Kanai Award (2009), and the Distinguished Alumni Award in Computer Science of the University of California, Berkeley (2011). Wah's current research interests are in the areas of big data applications and multimedia signal processing.

Wah co-founded the IEEE Transactions on Knowledge and Data Engineering in 1988, served as its Editor-in-Chief from 1993 to 1996, and is the Honorary Editor-in-Chief of Knowledge and Information Systems. He currently serves on the editorial boards of Information Sciences, the International Journal on Artificial Intelligence Tools, the Journal of VLSI Signal Processing, and World Wide Web. He has served the IEEE Computer Society in various capacities, including Vice President for Publications (1998 and 1999) and President (2001). He is a Fellow of the AAAS, ACM, and IEEE.

May 7, Thursday

Leveraging Python in Parallel Platforms: A Task-Based Approach and Its Integration with a New Generation of Object-Based Storage Layers
Speaker: Rosa Badia, Barcelona Supercomputing Center

Abstract: Computing platforms have evolved rapidly in recent years. Multi- and manycore processors, accelerators such as GPUs, and new persistent storage devices such as SSDs are examples of disruptions that have demanded the attention of application developers. The importance of programming models that enable applications to run efficiently on such platforms has been widely recognized in recent years by the computer science community and the scientific community in general. Ideally, a programming model should provide an interface through which the application developer expresses algorithms and ideas in a platform-agnostic way. Aspects such as programmability, portability, and performance optimization have been taken into account when proposing such models for parallel platforms. What is more, the new persistent storage devices may represent a revolution in the way data is stored.

The talk will review these recent changes and how they have been addressed at BSC with PyCOMPSs, a task-based programming model that offers a platform-unaware interface. The same PyCOMPSs code can run on a cluster (even on Intel Xeon Phi processors) or on a heterogeneous cloud. Additionally, PyCOMPSs is currently being integrated with new methodologies for storing and accessing data beyond traditional file systems.
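To give a flavor of the task-based idea that PyCOMPSs embodies, the sketch below expresses a computation as independent task functions whose asynchronous execution is delegated to a runtime. Note this is written with Python's standard concurrent.futures rather than the actual PyCOMPSs API (in real PyCOMPSs, functions marked with a task decorator are scheduled transparently by the runtime on a cluster or cloud); the function names (square, total, run_demo) are illustrative only.

```python
# Illustrative sketch of task-based programming in the spirit of PyCOMPSs.
# Uses only the standard library, NOT the real PyCOMPSs API: here a thread
# pool plays the role of the distributed runtime that schedules tasks.
from concurrent.futures import ThreadPoolExecutor

def square(x):
    """A 'task': a side-effect-free function the runtime may run anywhere."""
    return x * x

def total(parts):
    """A reduction 'task' that combines partial results."""
    return sum(parts)

def run_demo():
    with ThreadPoolExecutor(max_workers=4) as pool:
        # Submitting a task returns a future immediately; the code stays
        # platform-unaware, and the data dependencies (each square must
        # finish before the reduction) dictate the actual schedule.
        futures = [pool.submit(square, i) for i in range(10)]
        return total(f.result() for f in futures)

result = run_demo()
print(result)  # 285 (sum of squares 0..9)
```

The key design point the abstract alludes to is that the application code above never names a machine, a file system, or a device: swapping the executor (or, in PyCOMPSs, the runtime configuration) retargets the same program from a laptop to a cluster or cloud.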

Short biography: Rosa M. Badia holds a PhD in Computer Science (1994) from the Technical University of Catalonia (UPC). She is a Scientific Researcher at the Consejo Superior de Investigaciones Científicas (CSIC) and leader of the Grid Computing and Cluster research group at the Barcelona Supercomputing Center (BSC). She was involved in teaching and research activities at UPC from 1989 to 2008, where she had been an Associate Professor since 1997. From 1999 to 2005 she was involved in research and development activities at the European Center of Parallelism of Barcelona (CEPBA). Her current research interests are programming models for complex platforms (from multicore and GPUs to Grid/Cloud). The group led by Dr. Badia has been developing the StarSs programming model for more than 10 years, with high adoption success among application developers. The group currently focuses its efforts on two instances of StarSs: OmpSs for heterogeneous platforms and COMPSs/PyCOMPSs for distributed computing (i.e., the Cloud). Dr. Badia has published more than 150 papers in international conferences and journals on the topics of her research. She has participated in several European projects, for example BEinGRID, Brein, CoreGRID, OGF-Europe, SIENA, TEXT, and VENUS-C. She currently participates in the Spanish Severo Ochoa project as well as ASCETIC, the Human Brain Project, EU-Brazil CloudConnect, and TransPlant, and is a member of the HiPEAC2 NoE.