Publication Details

Framework for Planning, Running and Monitoring Cooperating Computations

JAROŠ, M. Framework for Planning, Running and Monitoring Cooperating Computations. Počítačové architektúry & diagnostika PAD 2017. Bratislava: Slovak University of Technology in Bratislava, 2017. p. 20-23. ISBN: 978-80-972784-0-3.
Czech title
Systém pro plánování, spouštění a monitorování kooperujících výpočtů
Type
conference paper
Language
English
Authors
Marta Jaroš
Keywords

Automation, distributed computing, execution planning, job submission, monitoring, HPC, multiscale modelling, model coupling, service

Abstract

Realistic simulations require very powerful computers. Computing infrastructures are growing in parallelism and becoming more diverse, which calls for more sophisticated computational techniques to take full advantage of the available machine power. Describing a scientific problem typically involves a number of different, cooperating models, and users are left to construct, execute, validate and analyse these models themselves. The situation is even more complicated if the user is not an IT specialist: considerable human effort goes into tasks that may lie outside a scientist's expertise or that could be performed automatically. This work presents a tool for the automated planning, execution and monitoring of extensive cooperating computations. The approach offers HPC as a service. A modular design enables extensions and unifies access to different HPC systems through a simple client-server interface based on standard web services. The dispatch server detects tasks that can run concurrently, enables their concurrent execution, and offers a level of fault tolerance.

Published
2017
Pages
20–23
Proceedings
Počítačové architektúry & diagnostika PAD 2017
ISBN
978-80-972784-0-3
Publisher
Slovak University of Technology in Bratislava
Place
Bratislava
BibTeX
@inproceedings{BUT144455,
  author="Marta {Jaroš}",
  title="Framework for Planning, Running and Monitoring Cooperating Computations",
  booktitle="Počítačové architektúry \& diagnostika PAD 2017",
  year="2017",
  pages="20--23",
  publisher="Slovak University of Technology in Bratislava",
  address="Bratislava",
  isbn="978-80-972784-0-3",
  url="https://www.fit.vut.cz/research/publication/11475/"
}