CLOUD COMPUTING 2012, The Third International Conference on Cloud Computing, GRIDs, and Virtualization


Analysis and Optimization of Massive Data Processing on High Performance Computing Architecture

Authors:
He Huang
Shanshan Li
Xiaodong Yi
Feng Zhang
Xiangke Liao
Pan Dong

Keywords: high-performance computer; massive data processing; MapReduce paradigm.

Abstract:
MapReduce has emerged as a popular, easy-to-use programming model that many organizations rely on for massive data processing. Existing work on improving MapReduce has mostly been carried out on commercial clusters, while little has been done on HPC architectures. With their high-capability computing nodes, networking, and storage systems, HPCs are a promising platform for massive data processing. Instead of a distributed file system (DFS), HPCs use a dedicated storage subsystem. We first analyze the performance of MapReduce on a dedicated storage subsystem. The results show that DFS performance scales better as the number of nodes increases; however, when the scale is fixed and the I/O capability is equal, the centralized storage subsystem handles large amounts of data better. Based on this analysis, we present two strategies, one reducing the data transmitted over the network and one distributing the storage I/O, to address the limited data I/O capability of HPCs. In an HPC environment, the storage localization and network levitation optimizations improve MapReduce performance by 32.5% and 16.9%, respectively.

Pages: 186 to 191

Copyright: Copyright (c) IARIA, 2012

Publication date: July 22, 2012

Published in: conference

ISSN: 2308-4294

ISBN: 978-1-61208-216-5

Location: Nice, France

Dates: from July 22, 2012 to July 27, 2012
