Components of the infrastructure need resources in order to function. Resources may consist of processes (or execution contexts), communication hardware, or other shared system functionality. The resource manager coordinates the interaction between components that require resources and the target system that controls and allocates those resources.
Activities that are provided by the resource manager include:
- resource discovery
- allocation and deallocation of resources
- resource usage information
Because the target system may allocate resources through a variety of mechanisms, the resource manager provides an abstraction over these services. All interaction with external job schedulers and runtime systems is undertaken by the resource manager.
The resource manager assumes the following:
- notification of successful/unsuccessful allocation is asynchronous
- resources can be named, and there are a number of predefined resource names
- in addition to the name of the resource, an allocation can also take an arbitrary number of attributes that further specify the allocation request (e.g. a process allocation request might take "number of processes" as an attribute)
- allocation may be partly successful, completely successful, or unsuccessful
- resources should be released when they are no longer needed
- the allocated resources may become unavailable at any point
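The assumptions above can be illustrated with a minimal sketch. All class and method names here (`ResourceManager`, `AllocationRequest`, `AllocationResult`, `allocate`, `release`, `poll`) are hypothetical and chosen only for illustration; the design does not prescribe them. The sketch assumes a fixed pool of processes and uses a callback to model asynchronous notification.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, List, Tuple


class AllocationStatus(Enum):
    COMPLETE = "complete"  # everything requested was granted
    PARTIAL = "partial"    # only part of the request was granted
    FAILED = "failed"      # nothing was granted


@dataclass
class AllocationRequest:
    resource: str  # resource name, e.g. "process"
    attributes: Dict[str, object] = field(default_factory=dict)


@dataclass
class AllocationResult:
    request: AllocationRequest
    status: AllocationStatus
    granted: int = 0


def classify(requested: int, granted: int) -> AllocationStatus:
    """Map requested/granted counts onto the three possible outcomes."""
    if granted <= 0:
        return AllocationStatus.FAILED
    if granted < requested:
        return AllocationStatus.PARTIAL
    return AllocationStatus.COMPLETE


Callback = Callable[[AllocationResult], None]


class ResourceManager:
    """Hypothetical resource manager backed by a fixed pool of processes."""

    def __init__(self, capacity: int) -> None:
        self.free = capacity
        self._pending: List[Tuple[AllocationRequest, Callback]] = []

    def allocate(self, request: AllocationRequest, on_done: Callback) -> None:
        """Queue a request; the outcome is delivered later through on_done."""
        self._pending.append((request, on_done))

    def release(self, result: AllocationResult) -> None:
        """Return granted resources to the pool once they are no longer needed."""
        self.free += result.granted
        result.granted = 0

    def poll(self) -> None:
        """Serve queued requests and notify callers (stands in for an event loop)."""
        pending, self._pending = self._pending, []
        for request, on_done in pending:
            wanted = int(request.attributes.get("number", 1))
            granted = min(wanted, self.free)
            self.free -= granted
            on_done(AllocationResult(request, classify(wanted, granted), granted))
```

In this sketch, a request for six processes against a pool of four would be reported as partially successful once `poll()` runs, and calling `release()` on the result returns the four processes to the pool. Resources becoming unavailable after allocation is not modelled here.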
The following default resources and attributes will always be available:
| Name | Description | Attributes |
|---|---|---|
| process | Allocation of computational resource | number, location, category |
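As an illustration of how a request for the default `process` resource might be written down and checked, the following sketch keeps a registry of predefined resource names and their attributes. The names `DEFAULT_RESOURCES` and `validate_request` are hypothetical, the attribute values (`"node01"`, `"compute"`) are invented examples, and rejecting unlisted attributes is an assumption of this sketch rather than something the design requires.

```python
from typing import Dict, Set

# Predefined resource names and the attributes listed for them in the table.
DEFAULT_RESOURCES: Dict[str, Set[str]] = {
    "process": {"number", "location", "category"},
}


def validate_request(name: str, attributes: Dict[str, object]) -> None:
    """Raise if the resource name or any attribute is not predefined.

    Assumption: only the listed attributes are accepted for a default resource.
    """
    if name not in DEFAULT_RESOURCES:
        raise KeyError(f"unknown resource name: {name!r}")
    unknown = set(attributes) - DEFAULT_RESOURCES[name]
    if unknown:
        raise ValueError(f"unknown attributes for {name!r}: {sorted(unknown)}")


# Example request: four processes on a (made-up) host, in a (made-up) category.
example_request = {"number": 4, "location": "node01", "category": "compute"}
validate_request("process", example_request)
```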
Resource Discovery
Resource Allocation Request
Resource Deallocation
Resource Usage
Specific Resources
Processor allocation
This includes allocating both "compute" nodes and "service" nodes.
- Determine how the allocation is specified, e.g.:
  - a user-supplied list of hosts and processor counts (no scheduler is running), in which case no request is needed and the allocation is complete
  - an allocation size to be requested from a scheduler (e.g. PBS)
  - a user-supplied description of the actual nodes to be requested from a scheduler (e.g. PBS)
- Make the request(s) to the scheduler
- Create handle(s) referencing the request(s)
- Wait for request(s) to be satisfied, or to time out
- Return handle(s) (or failure(s))
Compute nodes and service nodes may need to be obtained in separate requests.
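The processor allocation steps above can be sketched as follows. This is a minimal sketch under stated assumptions: `FakeScheduler`, `Handle`, and `allocate_processors` are hypothetical names, `FakeScheduler` only imitates an external scheduler and does not reflect the interface of PBS or any real system, and only two of the specification forms are covered (a user-supplied host list and an allocation size). Polling with a deadline stands in for whatever notification mechanism the real scheduler offers.

```python
import time
from dataclasses import dataclass
from typing import Dict, Optional, Union


@dataclass
class Handle:
    request_id: str
    hosts: Dict[str, int]  # host name -> processor count


class FakeScheduler:
    """Stand-in for an external scheduler (hypothetical interface)."""

    def __init__(self, ready_after_polls: int = 2) -> None:
        self._ready_after = ready_after_polls
        self._sizes: Dict[str, int] = {}
        self._polls: Dict[str, int] = {}

    def submit(self, size: int) -> str:
        """Register a request for `size` nodes and return its identifier."""
        request_id = f"req-{len(self._sizes) + 1}"
        self._sizes[request_id] = size
        self._polls[request_id] = 0
        return request_id

    def poll(self, request_id: str) -> Optional[Dict[str, int]]:
        """Return the granted hosts, or None while the request is pending."""
        self._polls[request_id] += 1
        if self._polls[request_id] < self._ready_after:
            return None
        return {f"node{i}": 1 for i in range(self._sizes[request_id])}


def allocate_processors(
    spec: Union[Dict[str, int], int],
    scheduler: Optional[FakeScheduler] = None,
    timeout: float = 5.0,
    interval: float = 0.01,
) -> Optional[Handle]:
    """Return a handle for the allocation, or None if the request timed out."""
    # User-supplied host list: no scheduler is involved, so we are done.
    if isinstance(spec, dict):
        return Handle("static", dict(spec))
    if scheduler is None:
        raise ValueError("an allocation size requires a scheduler")
    # Make the request to the scheduler and keep its identifier.
    request_id = scheduler.submit(spec)
    # Wait for the request to be satisfied, or to time out.
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        hosts = scheduler.poll(request_id)
        if hosts is not None:
            return Handle(request_id, hosts)
        time.sleep(interval)
    return None
```

Under this sketch, compute nodes and service nodes would each be obtained by a separate call to `allocate_processors`, yielding one handle per request.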
