JabRef references

Schneider J, Fuhr D, Korte E and Thim C (2017), "Security-Self-Assessment in kritischen Infrastrukturen", In D·A·CH Security. München, 9, 2017.

Abstract: The threat situation regarding cyber incidents has further intensified for critical infrastructures in recent years.
Regulation attempts to address this with requirements for large operators, such as the IT-Sicherheitsgesetz (the German IT Security Act). Small and medium-sized operators of critical infrastructures therefore face
the double challenge of, first, having to find their own solution in the absence of legal requirements and, second, having to manage this with their limited budget of money and staff. In the
BMBF-funded research project Aqua-IT-Lab, a methodology was developed that allows small and
medium-sized operators of water supply and wastewater disposal plants to assess the IT security of their automation technology themselves, with limited effort and without deep security expertise,
so that they can plan the most important implementation steps in a risk-based and resource-efficient manner. The methodology can also be transferred to other sectors such as energy.

BibTeX:

@inproceedings{DACH2017,
  author = {Jörg Schneider and David Fuhr and Edgar Korte and Christof Thim},
  title = {Security-Self-Assessment in kritischen Infrastrukturen},
  booktitle = {D·A·CH Security},
  year = {2017}
}

Linnert B, Schneider J and Burchard L-O (2014), "Mapping Algorithms Optimizing the Overall Manhattan Distance for pre-occupied Cluster Computers in SLA-based Grid environments", In 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2014), 5, 2014. IEEE CS Press.

Abstract: Grid applications are more and more widely used nowadays. One of the
major challenges is to provide a reliable and predictable platform
for computations of various kinds. In order to overcome this challenge,
Grid management systems such as the virtual resource manager (VRM)
implement scheduling and mapping algorithms at the level of the local
management systems with support for resource reservation in advance.
In this paper, we examine three different mapping algorithms for
supercomputers and cluster systems with respect to execution time
and the achievable performance regarding important metrics such
as overall Manhattan distance and achievable utilization. The results
show the importance of carefully implementing scheduling and mapping
algorithms in Grid environments.

BibTeX:

@inproceedings{Linnert2014a,
  author = {Barry Linnert and Jörg Schneider and Lars-Olof Burchard},
  title = {Mapping Algorithms Optimizing the Overall Manhattan Distance for pre-occupied Cluster Computers in SLA-based Grid environments},
  booktitle = {14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2014)},
  publisher = {IEEE CS Press},
  year = {2014},
  note = {accepted}
}

Pfeffer T, Herber P and Schneider J (2014), "Reverse Engineering of ARM Binaries Using Formal Transformations", In The 7th International Conference on Security of Information and Networks. Glasgow, UK, 9, 2014.

Abstract: Understanding the behavior of a program when no source code is available tends to be a complicated and time-expensive task. In this paper, we present a novel approach for reverse engineering of ARM binaries. The main idea is to translate the original assembler representation into a formal intermediate representation language, namely WSL, and then to apply rephrasing transformations to the code. To achieve a highly modular translation, we define a rule set to translate each assembler instruction individually. Furthermore, new rephrasing rules were developed to recover high level control flow aspects and to eliminate assembler specific program fragments in the intermediate code. We demonstrate the applicability of our approach through the successful recovery of high level control flow statements in the Debian coreutils binaries. Using these example binaries, we studied the performance and the quality of our transformation.

BibTeX:

@inproceedings{PHS2014,
  author = {Tobias Pfeffer and Paula Herber and Jörg Schneider},
  title = {Reverse Engineering of ARM Binaries Using Formal Transformations},
  booktitle = {The 7th International Conference on Security of Information and Networks},
  year = {2014}
}

Schneider J and Linnert B (2014), "List-based Data Structures for Efficient Management of Advance Reservations", International Journal of Parallel Programming, accepted, 2, 2014. Vol. 42(1), pp. 77-93. Springer.

Abstract: Complex eScience and other sophisticated applications in the field
of HPC imply new demands that queuing based resource management systems
cannot meet. To guarantee Quality of Service and co-allocation in
the Grid, planning based resource management systems implementing
advance reservation are needed. These systems face new challenges
as a planning based management system has to keep track of the jobs
and reservations in the future. Additionally, during the negotiation
process of incoming reservations, a good overview of the remaining,
not-yet reserved capacity is needed---not only for the current allocation,
but also for the whole book-ahead time. Therefore, the resource management
problem becomes a two dimensional problem for advance reservations
in this field.

In this paper different data structures are investigated and discussed
in order to fit to planning based resource management. As a result
the benefits of using lists of resource allocation or free blocks
are exposed. This general idea widely used to manage continuous resources
is extended to cover not only the resource dimension but also the
time dimension. The list of blocks approach is evaluated in a Grid
level and a local resource management system for a computing cluster.
The extensive simulations showed a better runtime and higher reservation
success rate compared with the currently favored approach of a slotted
time and the more sophisticated approach based on AVL trees.

BibTeX:

@article{Schneider2012,
  author = {Jörg Schneider and Barry Linnert},
  editor = {Utpal Banerjee and Nicholas Carriero and Alexandru Nicolau},
  title = {List-based Data Structures for Efficient Management of Advance Reservations},
  journal = {International Journal of Parallel Programming},
  publisher = {Springer},
  year = {2014},
  volume = {42},
  number = {1},
  pages = {77-93},
  url = {http://www.user.tu-berlin.de/komm/paper/2012-Schneider-Linnert-data-structures-for-adv.-reservation.pdf}
}

Lell J, Koch S and Schneider J (2013), "StackIDS - Catching Binary Exploits before they Execute a System Call", In Herbsttreffen der GI-Fachgruppe Betriebssysteme.

BibTeX:

@inproceedings{Lell2013,
  author = {Jakob Lell and Sebastian Koch and Jörg Schneider},
  title = {StackIDS - Catching Binary Exploits before they Execute a System Call},
  booktitle = {Herbsttreffen der GI-Fachgruppe Betriebssysteme},
  year = {2013},
  url = {http://www.betriebssysteme.org/Aktivitaeten/Treffen/2013-Berlin/Programm/docs/lell_koch_schneider-stackids.pdf}
}

Schepke C, Maillard N, Schneider J and Heiß H-U (2013), "Online Mesh Refinement for Parallel Atmospheric Models", International Journal of Parallel Programming, 8, 2013. Vol. 41(4), pp. 552-569. Springer.

Abstract: Forecast precisions of climatological models are limited by computing
power and time available for the executions. As more and faster processors
are used in the computation, the resolution of the mesh adopted to
represent the Earth's atmosphere can be increased, and consequently
the numerical forecast is more accurate. However, a finer mesh resolution,
able to include local phenomena in a global atmosphere integration,
is still not possible due to the large number of data elements to
compute in this case. To overcome this situation, different mesh
refinement levels can be used at the same time for different areas
of the domain. Thus, our paper evaluates how mesh refinement at run
time (online) can improve performance for climatological models. The
online mesh refinement (OMR) increases dynamically mesh resolution
in parts of a domain, when special atmosphere conditions are registered
during the execution. Experimental results show that the execution
of a model improved by OMR provides better resolution for the meshes,
without any significant increase of execution time. The parallel
performance of the simulations is also increased through the creation
of threads in order to explore different levels of parallelism.

BibTeX:

@article{Schepke2013,
  author = {Claudio Schepke and Nicolas Maillard and Jörg Schneider and Hans-Ulrich Heiß},
  title = {Online Mesh Refinement for Parallel Atmospheric Models},
  journal = {International Journal of Parallel Programming},
  publisher = {Springer},
  year = {2013},
  volume = {41},
  number = {4},
  pages = {552-569},
  url = {http://link.springer.com/article/10.1007/s10766-012-0235-4}
}

Koch S, Schneider J and Nordholz J (2012), "Disturbed playing: Another kind of educational security games", In 5th Workshop on Cyber Security Experimentation and Test at Usenix Security 2012. Seattle, US, 8, 2012. USENIX Association.

Abstract: Games have a long tradition in teaching IT security: Ranging from
international capture-the-flag competitions played by multiple teams
to educational simulation games where individual students can get
a feeling for the effects of security decisions. All these games
have in common, that the game's main goal is keeping up the security.
In this paper, we propose another kind of educational security games
which feature a game goal unrelated to IT security. However, during
the game session gradually more and more attacks on the underlying
infrastructure disturb the game play. Such a scenario is very close
to the reality of an IT security expert, where establishing security
is just a necessary requirement to reach the company's goals. By
preparing and analyzing the game sessions, the students learn how
to develop a security policy for a simplified scenario. Additionally,
the students learn to decide when to apply technical security measures,
when to establish emergency plans, and which risks cannot be covered
economically.

As an example for such a disturbed playing game, we present our distributed
air traffic control scenario. The game play is disturbed by attacking
the integrity and availability of the underlying network in a coordinated
manner, i.e., all student teams experience the same failures at the
same state of the game. Besides presenting the technical aspects of
the setup, we also discuss the didactic approach and the experiences
made in the last years.

BibTeX:

@inproceedings{Koch2012,
  author = {Sebastian Koch and Jörg Schneider and Jan Nordholz},
  title = {Disturbed playing: Another kind of educational security games},
  booktitle = {5th Workshop on Cyber Security Experimentation and Test at Usenix Security 2012},
  publisher = {USENIX Association},
  year = {2012},
  url = {http://www.user.tu-berlin.de/komm/paper/2012-Schneider-Koch-Nordholz-Disturbed-Playing.pdf}
}

Schneider J (2012), "How do you know that your cloud operator does not cheat?", In Workshop on Service Science and Engineering, accepted. Shanghai, CN. Springer.

Abstract: The security of a system is usually based on the physical security
of the hardware. In a Cloud setup, this basic assumption cannot be
assured as the system runs as a virtual machine (VM) on the operator's
hardware. The operator has access to all files, has access to the
main memory, can interfere with the communication, and can manipulate
the control flow. The Cloud operator can even hide manipulations
by creating a virtual view for the user. In the talk, I will show
how the security goals confidentiality, integrity, and availability
can be violated by the Cloud provider. The user may not be able to
prevent such manipulations, but can sign a service level agreement
(SLA) and negotiate fines to be paid. For the Cloud operator, the
manipulations are no longer lucrative if the risk to be discovered
and the fine is high enough. However, a mechanism is needed to detect
an attack reliably to enforce the SLA. I will present such detection
mechanisms for various attack types and analyze how a bogus Cloud
operator may still avoid the detection.

BibTeX:

@inproceedings{Schneider2012a,
  author = {Jörg Schneider},
  title = {How do you know that your cloud operator does not cheat?},
  booktitle = {Workshop on Service Science and Engineering},
  publisher = {Springer},
  year = {2012}
}

Schepke C, Maillard N, Schneider J and Heiß H-U (2011), "Why Online Dynamic Mesh Refinement is Better for Parallel Climatological Models?", In 23rd International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD). Vitoria, BR, 10, 2011, pp. 168-175. IEEE Computer Society.

Abstract: Forecast precisions of climatological models are limited by computing
power and time available for the executions. As more and faster processors
are used in the computation, the resolution of the mesh adopted to
represent the Earth's atmosphere can be increased, and consequently
the numerical forecast is more accurate and shows local phenomena.
However, a finer mesh resolution, able to include local phenomena
in a global atmosphere integration, is still not possible. To overcome
this situation, different mesh refinement levels can be used at the
same time for different areas. In this context, this paper evaluates
how mesh refinement at run time can improve performance for climatological
models. In order to contribute with this analysis, an online dynamic
mesh refinement was developed. It increases mesh resolution in parts
of a parallel distributed model, when special atmosphere conditions
are registered during the execution. The results show that the parallel
execution of this improvement provides better resolution for the
meshes, without a significant increase of execution time.

BibTeX:

@inproceedings{schepke2011a,
  author = {Claudio Schepke and Nicolas Maillard and Jörg Schneider and Hans-Ulrich Heiß},
  title = {Why Online Dynamic Mesh Refinement is Better for Parallel Climatological Models?},
  booktitle = {23rd International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD)},
  publisher = {IEEE Computer Society},
  year = {2011},
  pages = {168-175},
  url = {http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6106019},
  doi = {10.1109/SBAC-PAD.2011.14}
}

Schepke C, Maillard N, Schneider J and Heiß H-U (2011), "Online Mesh Refinement in Parallel Meteorological Applications", In Proceedings of Latin American Conference on High Performance Computing (CLCAR).

BibTeX:

@inproceedings{schepke2011,
  author = {Claudio Schepke and Nicolas Maillard and Jörg Schneider and Hans-Ulrich Heiß},
  title = {Online Mesh Refinement in Parallel Meteorological Applications},
  booktitle = {Proceedings of Latin American Conference on High Performance Computing (CLCAR)},
  year = {2011}
}

Schneider J and Linnert B (2011), "Efficiently Managing Advance Reservations Using Lists of Free Blocks", In 23rd International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), pp. 183-190.

Abstract: Advance reservation was identified as a key technology to enable guaranteed
Quality of Service and co-allocation in the Grid. Nonetheless, most
Grid and local resource management systems still use the queuing
approach because of the additional complexity introduced by advance
reservation. A planning based resource management system has to keep
track of the reservations in the future and needs a good overview
on the available capacity during the negotiation of incoming reservations.
For advance reservation, the resource management problem becomes
a two dimensional problem. In this paper different data structures
are investigated and discussed in order to fit to planning based
resource management. As a result the benefits of using lists of resource
allocation or free blocks are exposed. This general idea widely used
to manage continuous resources is extended to cover not only the
resource dimension but also the time dimension. The list of blocks
approach is evaluated in a Grid level and a resource level resource
management system. The extensive simulations showed a better runtime
and higher reservation success rate compared with the currently favored
approach of a slotted time.

BibTeX:

@inproceedings{schneider2011,
  author = {Jörg Schneider and Barry Linnert},
  title = {Efficiently Managing Advance Reservations Using Lists of Free Blocks},
  booktitle = {23rd International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD)},
  year = {2011},
  pages = {183-190},
  url = {http://www.user.tu-berlin.de/komm/paper/2011-Schneider-Linnert-Managing-Adv.-Reservations.pdf},
  doi = {10.1109/SBAC-PAD.2011.25}
}

(2011), "INFORMATIK 2011 - Informatik schafft Communities" Bonn, 10, 2011. (192) Köllen Verlag.

Abstract: INFORMATIK 2011 is the 41st annual conference of the Gesellschaft
für Informatik e.V. (GI). The topic of this year's conference is
"Computer Science creates Communities" and it's not only about virtual
communities, but also real ones like the scientific community: How
can we improve the networking within the computer science community
and the connections to politics, industry, and society? How do we
use new social media within the scientific communities? But the conference
is also about research regarding the new technologies helping communities:
From online social networks to software support for huge events or
traffic guidance systems.

BibTeX:

@proceedings{Informatik2011,
  editor = {Hans-Ulrich Heiß and Peter Pepper and Holger Schlingloff and Jörg Schneider},
  title = {INFORMATIK 2011 - Informatik schafft Communities},
  publisher = {Köllen Verlag},
  year = {2011},
  number = {192},
  url = {http://www.user.tu-berlin.de/komm/CD/html/index.html}
}

Diener M, Madruga FL, Rodrigues ER, Alves MAZ, Schneider J, Navaux POA and Heiß H-U (2010), "Evaluating Thread Placement Based on Memory Access Patterns for Multi-core Processors", In Proceedings of 12th IEEE International Conference on High Performance Computing and Communications (HPCC-2010), pp. 491-496.

Abstract: Process placement is a technique widely used on parallel machines
with heterogeneous interconnections to reduce the overall communication
time. For instance, two processes which communicate frequently are
mapped close to each other. Finding the optimal mapping between threads
and cores in a shared-memory environment (for example, OpenMP and
Pthreads) is an even more complex task due to implicit communication.
In this work, we examine data sharing patterns between threads in
different workloads and use those patterns in a similar way as messages
are used to map processes in cluster computers. We evaluated our
technique on two state-of-the-art multi-core processors and achieved
moderate improvements in the common case and considerable improvements
in some cases, reducing execution time by up to 45%.

BibTeX:

@inproceedings{Diener2010,
  author = {Matthias Diener and Felipe L. Madruga and Eduardo R. Rodrigues and Marco A. Z. Alves and Jörg Schneider and Philippe O. A. Navaux and Hans-Ulrich Heiß},
  editor = {Guerrero, Juan E},
  title = {Evaluating Thread Placement Based on Memory Access Patterns for Multi-core Processors},
  booktitle = {Proceedings of 12th IEEE International Conference on High Performance Computing and Communications (HPCC-2010)},
  year = {2010},
  pages = {491--496}
}

Dragiev S and Schneider J (2010), "Grid Workflow Recovery as Dynamic Constraint Satisfaction Problem", In Proceedings of 2010 IEEE Conference on Open Systems (ICOS2010). Kuala Lumpur, 10, 2010, pp. 74-79.

Abstract: With service level agreements (SLAs) the Grid broker guarantees to
finish the Grid jobs by a given deadline. There are a number of approaches,
to plan reservations to fulfil these deadline requirements and to
handle currently running jobs in the case of a resource failure.
However, there is a lack of strategies to handle the already planned
but not yet started jobs. These jobs will be most likely also affected
by the resource failure and can be remapped to other resources well
in advance. Complex Grid jobs (Grid workflows) consisting of multiple
sub-jobs introduce a higher complexity to determine a remapping saving
as much Grid jobs as possible. In this paper a recovery scheme for
Grid workflows using a dynamic constraint solver is presented and
the gain in the number of saved Grid jobs is evaluated using extensive
simulations.

BibTeX:

@inproceedings{Dragiev2010,
  author = {Stanimir Dragiev and Jörg Schneider},
  title = {Grid Workflow Recovery as Dynamic Constraint Satisfaction Problem},
  booktitle = {Proceedings of 2010 IEEE Conference on Open Systems (ICOS2010)},
  year = {2010},
  pages = {74-79},
  doi = {10.1109/ICOS.2010.5720067}
}

Gasmi Y and Schneider J (2010), "E-Mail Security as Cooperation Problem", In Proceedings of Workshop on Systems Communication and Engineering in Computer Science.

BibTeX:

@inproceedings{gasmi2010,
  author = {Yacine Gasmi and Jörg Schneider},
  title = {E-Mail Security as Cooperation Problem},
  booktitle = {Proceedings of Workshop on Systems Communication and Engineering in Computer Science},
  year = {2010}
}

Köppe F and Schneider J (2010), "Do you get what you pay for? Using Proof-of-Work Functions to Verify Performance Assertions in the Cloud", In Proceedings of International Workshop on Cloud Privacy, Security, Risk & Trust (CPSRT 2010). Indianapolis, pp. 687. IEEE CS Press.

Abstract: In the Cloud, the operators usually offer resources on a pay per use
price model. The client gets access to a newly created virtual machine
and has no direct access to the underlying hardware. Therefore, the
client cannot verify whether the Cloud operator provides the negotiated
amount of resources or only a fraction thereof. Especially, the assigned
share of CPU time can be easily forged by the operator. The client
could use a normal benchmark to verify the performance of his virtual
machine. However, as the Cloud operator owns the underlying infrastructure,
the operator could also tamper with the benchmark execution. We identified
four attack vectors to modify the results of the benchmark. Based
on these attack vectors, we showed that using proof-of-work functions
can disable three of them. Proof-of-work functions are challenge
response systems, where it is simple to generate a challenge and
verify the result while solving the challenge is compute intensive.
We implemented three proof-of-work functions in a prototype benchmark.
Experiments showed that the runtime of the proof-of-work functions
sufficiently relates to the results of the reference benchmark suite
SPEC CPU2006.

BibTeX:

@inproceedings{Koeppe2010,
  author = {Falk Köppe and Jörg Schneider},
  title = {Do you get what you pay for? Using Proof-of-Work Functions to Verify Performance Assertions in the Cloud},
  booktitle = {Proceedings of International Workshop on Cloud Privacy, Security, Risk & Trust (CPSRT 2010)},
  publisher = {IEEE CS Press},
  year = {2010},
  pages = {687},
  url = {http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5708518},
  doi = {10.1109/CloudCom.2010.100}
}

Schneider J and Koch S (2010), "HTTPreject: Handling Overload Situations without Losing the Contact to the User", In Proceedings of European Conference on Computer Network Defense (EC2ND 2010), pp. 29-34.

Abstract: The web is a crucial source of information nowadays. At the same time,
web applications become more and more complex. Therefore, a spontaneous
increase in the number of visitors, e.g., based on news reports or
events, easily brings a web server in an overload situation. In contrast
to the classical model of distributed denial of service (DDoS) attacks,
such a so-called flash effect situation is not triggered by a bulk
of bots just aiming at hurting the system but by humans with a high
interest in the content of the web site itself. While the bots do
not stop their attack until told so by their operator, the users try
repeatedly to access the site without knowing that the repeated reloads
effectively increase the web server's overload. Classical approaches
try to distinguish between real user and harmful requests, which
is not applicable in this scenario. Simply restricting the number
of connections leads to very technical error messages displayed by
the users' client software if at all. Therefore, we propose a means
to efficiently block connection attempts and to keep the user informed
at the same time. A small subset of HTTP and TCP is statelessly implemented
to display simple busy messages or relevant news updates to the end
user with only few resources. In this paper we present the protocol
subset used and discuss the compatibility problems on the protocol
and client software level. Furthermore, we show the results of performance
experiments using a prototype implementation.

BibTeX:

@inproceedings{Schneider2010,
  author = {Jörg Schneider and Sebastian Koch},
  title = {HTTPreject: Handling Overload Situations without Losing the Contact to the User},
  booktitle = {Proceedings of European Conference on Computer Network Defense (EC2ND 2010)},
  year = {2010},
  pages = {29-34},
  url = {http://www.user.tu-berlin.de/komm/paper/2010-schneider-koch-HTTPreject.pdf},
  doi = {10.1109/EC2ND.2010.7}
}

Dragiev S and Schneider J (2009), "Grid Workflow Recovery as Dynamic Constraint Satisfaction Problem", In Proceedings of 23. PARS Workshop. Parsberg

Abstract: With service level agreements (SLAs) the Grid broker guarantees to
finish the Grid jobs by a given deadline. There are a number of approaches,
to plan reservations to fulfil these deadline requirements and to
handle currently running jobs in the case of a resource failure.
However, there is a lack of strategies to handle the already planned
but not yet started jobs. These jobs will be most likely also affected
by the resource failure and can be remapped to other resources well
in advance. Complex Grid jobs (Grid workflows) consisting of multiple
sub-jobs introduce a higher complexity to determine a remapping saving
as much Grid jobs as possible. In this paper a recovery scheme for
Grid workflows using a dynamic constraint solver is presented and
the gain in the number of saved Grid jobs is evaluated using extensive
simulations.

BibTeX:

@inproceedings{Dragiev2009,
  author = {Stanimir Dragiev and Jörg Schneider},
  title = {Grid Workflow Recovery as Dynamic Constraint Satisfaction Problem},
  booktitle = {Proceedings of 23. PARS Workshop},
  year = {2009}
}

Gehr J and Schneider J (2009), "Measuring Fragmentation of Two-Dimensional Resources Applied to Advance Reservation Grid Scheduling", In Proceedings of 9th International Symposium on Cluster Computing and the Grid (CCGrid 09). Shanghai, 5, 2009, pp. 276-283.

Abstract: Whenever a resource allocation fails although enough free capacity
being available, fragmentation is easily spotted as cause. But how
the fragmentation in a system requiring continuous allocations like
time schedules or memory can be quantified is hardly analyzed. A
Grid environment using advance reservation even combines two-dimensions:
time and resource dimension. In this paper a new way to measure the
fragmentation of a system in one dimension is proposed. This measure
is then extended to incorporate also the second dimension. Extensive
simulations showed that the proposed fragmentation measure is a good
indicator of the state of the system.

BibTeX:

@inproceedings{Gehr2009,
  author = {Julius Gehr and Jörg Schneider},
  title = {Measuring Fragmentation of Two-Dimensional Resources Applied to Advance Reservation Grid Scheduling},
  booktitle = {Proceedings of 9th International Symposium on Cluster Computing and the Grid (CCGrid 09)},
  year = {2009},
  pages = {276-283},
  url = {http://www.user.tu-berlin.de/komm/paper/2009-measure-2D-fragmentation.pdf},
  doi = {10.1109/CCGRID.2009.81}
}

Schneider J, Gehr J, Heiß H-U, Ferreto T, Rose CD, Righi R, Rodrigues ER, Maillard N and Navaux P (2009), "Design of a Grid workflow for a climate application", In Proceedings of IEEE Symposium on Computers and Communications (ISCC'09), pp. 793.

Abstract: Grid applications can be modeled as a composition of rather independent
tasks. There are two approaches to define such a workflow: either by combining
multiple applications to build a more complex functionality or by
splitting up an existing application. In this paper we analyze the latter
process. We present a compute intensive application for climatology simulation
and the options available to split it up. Using the simulation mode of
our Grid broker, we were able to compare the different workflow specifications
before actually executing the workflows. This case study showed that using
finer grained workflows, which usually need more adjustments to the software,
allows better performance in the Grid.

BibTeX:

@inproceedings{Schneider2009,
  author = {Jörg Schneider and Julius Gehr and Hans-Ulrich Heiß and Tiago Ferreto and César De Rose and Rodrigo Righi and Eduardo R. Rodrigues and Nicolas Maillard and Philippe Navaux},
  title = {Design of a Grid workflow for a climate application},
  booktitle = {Proceedings of IEEE Symposium on Computers and Communications (ISCC'09)},
  year = {2009},
  pages = {793},
  doi = {10.1109/ISCC.2009.5202233}
}

Burchard L-O, Heiß H-U, Linnert B, Schneider J and Rose CAD (2008), "VRM: A failure-aware Grid resource management system", International Journal of High Performance Computing and Networking. Vol. 5(4), pp. 215-226.

Abstract: For resource management in Grid environments, advance reservations
turned out to be very useful and hence are supported by a variety
of Grid toolkits. However, failure recovery for such systems has
not yet received the attention it deserves. In this paper, we address
the problem of remapping reservations to other resources, when the
originally selected resource fails. Instead of dealing with jobs
already running, which usually means checkpointing and migration,
our focus is on jobs that are scheduled on the failed resource for
a specific future period of time but not started yet. The most critical
factor when solving this problem is the estimation of the downtime.
We avoid the drawbacks of under- or over-estimating the downtime
by a dynamic load-based approach that is evaluated by extensive simulations
in a Grid environment and shows superior performance compared to
estimation-based approaches.

BibTeX:

@article{Burchard2008,
  author = {Lars-Olof Burchard and Hans-Ulrich Heiß and Barry Linnert and Jörg Schneider and Cesar A.F. De Rose},
  title = {VRM: A failure-aware Grid resource management system},
  journal = {International Journal of High Performance Computing and Networking},
  year = {2008},
  volume = {5},
  number = {4},
  pages = {215-226},
  doi = {10.1504/IJHPCN.2008.022298}
}

Schneider J, Gehr J, Linnert B and Röblitz T (2008), "An Efficient Protocol for Reserving Multiple Grid Resources in Advance", In Grid and Services Evolution (Proceedings of the 3rd CoreGRID Workshop on Grid Middleware), pp. 189-204. Springer.

Abstract: We propose a mechanism for the co-allocation of multiple resources
in Grid environments. By reserving multiple resources in advance, scientific
simulations and large-scale data analyses can efficiently be executed with their
desired quality-of-service level. Co-allocating multiple Grid resources in
advance poses demanding challenges due to the characteristics of Grid environments,
which are (1) incomplete status information, (2) dynamic behavior of resources
and users, and (3) autonomous resources' management systems. Our co-reservation
mechanism addresses these challenges by probing the state of the resources
and by enhancing a two-phase commit protocol with timeouts. We performed
extensive simulations to evaluate communication overhead of the new protocol
and the impact of the timeouts' length on the scheduling of jobs as well
as on the utilization of the Grid resources.

Keywords: Grid resource management, advance

BibTeX:

@inproceedings{Schneider2008,
  author = {Jörg Schneider and Julius Gehr and Barry Linnert and Thomas Röblitz},
  title = {An Efficient Protocol for Reserving Multiple Grid Resources in Advance},
  booktitle = {Grid and Services Evolution (Proceedings of the 3rd CoreGRID Workshop on Grid Middleware)},
  publisher = {Springer},
  year = {2008},
  pages = {189--204},
  url = {http://www.user.tu-berlin.de/komm/paper/2008-efficient-protocol-advance-reservation.pdf},
  doi = {10.1007/978-0-387-85966-8_14}
}

Bergmann A, Schneider J and Heiß H-U (2007), "Behandlung offener Netzwerkverbindungen bei Prozessmigration", In Proceedings of 21. PARS Workshop. Hamburg

BibTeX:

@inproceedings{Bergmann2007,
  author = {Andreas Bergmann and Jörg Schneider and Hans-Ulrich Heiß},
  title = {Behandlung offener Netzwerkverbindungen bei Prozessmigration},
  booktitle = {Proceedings of 21. PARS Workshop},
  year = {2007}
}

Decker J and Schneider J (2007), "Heuristic Scheduling of Grid Workflows Supporting Co-Allocation and Advance Reservation", In 7th IEEE Intl. Symposium on Cluster Computing and the Grid (CCGrid07). Rio de Janeiro, Brazil, 5, 2007. , pp. 335-342. IEEE CS Press.

Abstract: Applications to be executed in Grid computing environments become
more and more complex and usually consist of multiple interdependent
tasks. The coordinated execution of such tightly or loosely coupled
tasks often requires simultaneous access to different Grid resources.
This leads to the problem of resource co-allocation. Efficient and
robust scheduling algorithms have to be developed that can cope with
the Grid's large-scale distribution, a high number of competing and
demanding applications, the inherent resource heterogeneity and the
often limited view on resource availability. In this paper, we present
two heuristic scheduling algorithms that are based on a well-known
list scheduling algorithm and both support co-allocation and advance
resource reservation. Our first algorithm preserves the run-time
efficiency of Greedy list schedulers while the second approach incorporates
more sophisticated search techniques in order to achieve better results
with respect to the performance metrics. Both algorithms have been
implemented within a Grid simulation framework. An extensive simulation
study was conducted to evaluate and compare the performance of both
algorithms. It showed the general suitability of our enhanced list
scheduling heuristics within heterogeneous Grid environments.

BibTeX:

@inproceedings{Decker2007,
  author = {Jörg Decker and Jörg Schneider},
  editor = {Bruno Schulze and Rajkumar Buyya and Philippe Navaux and Walfredo Cirne and Vinod Rebello},
  title = {Heuristic Scheduling of Grid Workflows Supporting Co-Allocation and Advance Reservation},
  booktitle = {7th IEEE Intl. Symposium on Cluster Computing and the Grid (CCGrid07)},
  publisher = {IEEE CS Press},
  year = {2007},
  pages = {335--342},
  url = {http://www.kbs.cs.tu-berlin.de/publications/fulltext/decker-heuristicWorkflow.pdf}
}

Burchard L-O, Heiß H-U, Linnert B, Schneider J, Kao O, Hovestadt M, Heine F and Keller A (2006), "The Virtual Resource Manager: Local Autonomy versus QoS Guarantees for Grid Applications", In Future Generation Grids. Vol. 2

Abstract: In this paper, we describe the architecture of the virtual resource
manager VRM, a management system designed to reside on top of local
resource management systems for cluster computers and other kinds
of resources. The most important feature of the VRM is its capability
to handle quality-of-service (QoS) guarantees and service-level agreements
(SLAs). The particular emphasis of the paper is on the various opportunities
to deal with local autonomy for resource management systems not supporting
SLAs. As local administrators may not want to hand over complete
control to the Grid management, it is necessary to define strategies
that deal with this issue. Local autonomy should be retained as much
as possible while providing reliability and QoS guarantees for Grid
applications, e.g., specified as SLAs.

BibTeX:

@inproceedings{Burchard2006,
  author = {Lars-Olof Burchard and Hans-Ulrich Heiß and Barry Linnert and Jörg Schneider and Odej Kao and Matthias Hovestadt and Felix Heine and Axel Keller},
  editor = {Getov, Vladimir and Laforenza, Domenico and Reinefeld, Alexander},
  title = {The Virtual Resource Manager: Local Autonomy versus QoS Guarantees for Grid Applications},
  booktitle = {Future Generation Grids},
  year = {2006},
  volume = {2},
  url = {http://www.user.tu-berlin.de/komm/paper/FGG-local-autonomy-vs-SLA.pdf},
  doi = {http://www.springerlink.com/content/m50g77l430705x03/}
}

Schneider J, Linnert B and Burchard L-O (2006), "Distributed Workflow Management for Large-Scale Grid Environments", In IEEE/IPSJ International Symposium on Applications and the Internet (SAINT 2006). Phoenix, Arizona, USA, 1, 2006. , pp. 229-235. IEEE Computer Society Press.

Abstract: Workflow management in large-scale Grid environments is a very challenging
task that centralized management systems are not able to cover sufficiently.
Therefore, we present our Workflow On-line Resource Management (WORM)
architecture built on top of active network technology. The approach
integrates a peer-to-peer like organized workflow management system
with existing or newly built management systems for the resources
building the Grid. In our approach, each workflow is represented
by a mobile autonomous entity which uses the active network infrastructure
to move through the Grid, which is represented by an active overlay
network on top of existing network infrastructure. Thus, control
of the workflow execution is handed over to the autonomous code without
requiring a central system to be in charge of the computation and
cope with reservation, failures, etc. The WORM architecture is presented
together with a classification into the taxonomy of workflow management
systems.

BibTeX:

@inproceedings{BurchardEtAl-2006-Large-Scale-Workflow,
  author = {Jörg Schneider and Barry Linnert and Lars-Olof Burchard},
  title = {Distributed Workflow Management for Large-Scale Grid Environments},
  booktitle = {IEEE/IPSJ International Symposium on Applications and the Internet (SAINT 2006)},
  publisher = {IEEE Computer Society Press},
  year = {2006},
  pages = {229--235},
  url = {http://www.user.tu-berlin.de/komm/paper/SAINT06-Distributed-workflow.pdf},
  doi = {10.1109/SAINT.2006.25}
}

Burchard L-O, Linnert B and Schneider J (2005), "A Distributed Load-Based Failure Recovery Mechanism for Advance Reservation Environments", In 5th ACM/IEEE Intl. Symposium on Cluster Computing and the Grid (CCGrid)., 5, 2005. Vol. 2, pp. 1071-1078.

Abstract: Resource reservations in advance are a mature concept for the allocation
of various resources, particularly in Grid environments. Common Grid
tool kits support advance reservations and assign jobs to resources
at admission time. In such a distributed environment, it is necessary
to develop carefully tailored failure recovery mechanisms that provide
seamless transparent migration of jobs from one resource to another.
As the migration of running jobs is difficult, an important issue
in advance reservation, i.e., planning based, management infrastructures
is to determine the duration of a failure in order to remap jobs that
are already allocated to a currently failed resource but not yet active.
As shown in previous work, underestimations of the failure duration
and as a consequence the remapping of too few jobs results in an increased
amount of job terminations. In order to overcome this drawback, in
this paper we propose a load-based computation of the jobs to be remapped.
A centralized and a distributed version of the strategy are presented, showing
it is not necessary to have knowledge beyond the local allocation
on the failed resource. These load-based strategies achieve effective
remapping of jobs while avoiding - inevitably inaccurate - estimations
of the failure duration.

BibTeX:

@inproceedings{Burchard2005b,
  author = {Lars-Olof Burchard and Barry Linnert and Jörg Schneider},
  title = {A Distributed Load-Based Failure Recovery Mechanism for Advance Reservation Environments},
  booktitle = {5th ACM/IEEE Intl. Symposium on Cluster Computing and the Grid (CCGrid)},
  year = {2005},
  volume = {2},
  pages = {1071--1078},
  url = {http://www.user.tu-berlin.de/komm/paper/CCGrid05-load-based-failure-recovery.pdf},
  doi = {10.1109/CCGRID.2005.1558679}
}

Burchard L-O, Rose CAFD, Heiß H-U, Linnert B and Schneider J (2005), "VRM: A Failure-Aware Grid Resource Management System", In Proceedings of the 17th International Symposium on Computer Architecture and High Performance Computing., 10, 2005. , pp. 218-225. IEEE press.

Abstract: For resource management in Grid environments, advance reservations
turned out to be very useful and hence are supported by a variety
of Grid toolkits. However, failure recovery for such systems has
not yet received the attention it deserves. In this paper, we address
the problem of remapping reservations to other resources, when the
originally selected resource fails. Instead of dealing with jobs
already running, which usually means checkpointing and migration,
our focus is on jobs that are scheduled on the failed resource for
a specific future period of time but not started yet. The most critical
factor when solving this problem is the estimation of the downtime.
We avoid the drawbacks of under- or overestimating the downtime by
a dynamic load-based approach that is evaluated by extensive simulations
in a Grid environment and shows superior performance compared to
estimation-based approaches.

BibTeX:

@inproceedings{Burchard2005c,
  author = {Lars-Olof Burchard and Cesar A. F. De Rose and Hans-Ulrich Heiß and Barry Linnert and Jörg Schneider},
  title = {VRM: A Failure-Aware Grid Resource Management System},
  booktitle = {Proceedings of the 17th International Symposium on Computer Architecture and High Performance Computing},
  publisher = {IEEE press},
  year = {2005},
  pages = {218--225},
  url = {http://www.kbs.cs.tu-berlin.de/publications/fulltext/BurchardEtAl-2005-VRM.pdf}
}

Burchard L-O, Schneider J and Linnert B (2005), "Rerouting Strategies for Networks with Advance Reservations", In First IEEE International Conference on e-Science and Grid Computing (e-Science 2005). Melbourne, Australia, 12, 2005. , pp. 446-453. IEEE CS Press.

Abstract: Network transmissions in high performance networking scenarios, e.g.,
used for e-science or Grid applications, require quality-of-service
guarantees concerning bandwidth availability, but also timing constraints,
e.g., deadlines, must be met. Current research efforts concentrate
on supporting such environments with SLA-aware advance reservation
management systems. Hence, the robustness of the management system
against network failures is an important issue, especially since
failures frequently occur in networks. Since accurate knowledge about
the failure duration is unlikely to be available and estimations lead to
considerably degraded performance, in this paper we present a novel
load-based approach for dealing with link failures in advance reservation
environments. The approach does not rely on prediction of the downtime,
but instead reroutes flows only based on available information about
the network.

BibTeX:

@inproceedings{Burchard2005a,
  author = {Lars-Olof Burchard and Jörg Schneider and Barry Linnert},
  title = {Rerouting Strategies for Networks with Advance Reservations},
  booktitle = {First IEEE International Conference on e-Science and Grid Computing (e-Science 2005)},
  publisher = {IEEE CS Press},
  year = {2005},
  pages = {446--453},
  url = {http://www.user.tu-berlin.de/komm/paper/eScience05-rerouting-of-advance-reservations.pdf},
  doi = {10.1109/E-SCIENCE.2005.71}
}

Burchard L-O, Schneider J and Linnert B (2005), "Distributed Workflow Management", In Mitteilungen - Gesellschaft für Informatik e. V., Parallel-Algorithmen und Rechnerstrukturen., 12, 2005. Gesellschaft für Informatik e.V..

BibTeX:

@inproceedings{BurchardEtAl-2005-WORM2,
  author = {Lars-Olof Burchard and Jörg Schneider and Barry Linnert},
  title = {Distributed Workflow Management},
  booktitle = {Mitteilungen - Gesellschaft für Informatik e. V., Parallel-Algorithmen und Rechnerstrukturen},
  publisher = {Gesellschaft für Informatik e.V.},
  year = {2005}
}

Burchard L-O, Linnert B, Heiß H-U and Schneider J (2004), "Resource Co-Allocation in Grid Environments", In Synergies between Information and Automation: 49. Internationales Wissenschaftliches Kolloquium. Shaker.

Abstract: The co-allocation of different resources is an essential functionality
of resource management systems in distributed environments in order
to assure deterministic behaviour of the system, e.g., for quality-of-service
(QoS) guarantees. For example, parallel programs require the allocation
of resources on several processors. In grid computing environments,
the resource management system needs to fulfil more complex tasks.
As grid computing covers a large variety of different resources and
resource types, a job submitted to the Grid may consist of many different
sub-jobs which must be accomplished in a coordinated manner in order to
obtain the desired result. For this purpose, guarantees may be given
in this case for the completion time, e.g., specified as service level
agreements (SLA). Besides other tasks, such as identification and
discovery of suitable resources in the Grid, a critical task for
the resource management in such a case is to allocate all of the different
resources needed to comply with an SLA. In this paper, the concept
of malleable requests for co-allocation is introduced, which allows
a reliable reservation with guaranteed QoS as well as enhanced flexibility
for clients and operators.

BibTeX:

@inproceedings{Burchard2004c,
  author = {Lars-Olof Burchard and Barry Linnert and Hans-Ulrich Heiß and Jörg Schneider},
  title = {Resource Co-Allocation in Grid Environments},
  booktitle = {Synergies between Information and Automation: 49. Internationales Wissenschaftliches Kolloquium},
  publisher = {Shaker},
  year = {2004}
}