gateway_provisioners package#

Subpackages#

Submodules#

Mixin for configuration options on RemoteProvisionerBase.

class gateway_provisioners.config_mixin.RemoteProvisionerConfigMixin(**kwargs)#

Bases: Configurable

authorized_users#

List of user names against which KERNEL_USERNAME will be compared. Any match (case-sensitive) will allow the kernel’s launch, otherwise an HTTP 403 (Forbidden) error will be raised. The set of unauthorized users takes precedence. This option should be used carefully as it can dramatically limit who can launch kernels. To specify multiple names via the CLI, separate options must be provided for each entry. (GP_AUTHORIZED_USERS env var - non-bracketed, just comma-separated)

authorized_users_env = 'GP_AUTHORIZED_USERS'#
launch_timeout#

Time, in seconds, to wait for a kernel launch to complete before the launch is considered to have timed out (GP_LAUNCH_TIMEOUT env var)

launch_timeout_default_value = 30#
launch_timeout_env = 'GP_LAUNCH_TIMEOUT'#
port_range#

Specifies the lower and upper port numbers from which ports are created. The bounded values are separated by ‘..’ (e.g., 33245..34245 specifies a range of 1000 ports to be randomly selected). A range of zero (e.g., 33245..33245 or 0..0) disables port-range enforcement. (GP_PORT_RANGE env var)

port_range_default_value = '0..0'#
port_range_env = 'GP_PORT_RANGE'#
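
The port_range format described above can be illustrated with a small parser. This is a minimal sketch of the stated rules, not the package's own implementation: parse_port_range, range_is_enforced and pick_port are hypothetical names, and treating the upper bound as exclusive (so that 33245..34245 yields exactly 1000 candidates) is an assumption.

```python
import random


def parse_port_range(value):
    """Split a 'lower..upper' string into integer bounds (hypothetical helper)."""
    lower_text, upper_text = value.split("..")
    lower, upper = int(lower_text), int(upper_text)
    if lower > upper:
        raise ValueError(f"lower bound {lower} exceeds upper bound {upper}")
    return lower, upper


def range_is_enforced(value):
    """A zero-sized range such as '0..0' disables port-range enforcement."""
    lower, upper = parse_port_range(value)
    return upper - lower > 0


def pick_port(value):
    """Pick a random port from the range, or 0 when enforcement is disabled."""
    if not range_is_enforced(value):
        return 0  # assumption: 0 means "let the operating system choose"
    lower, upper = parse_port_range(value)
    # Assumption: the upper bound is exclusive, giving upper - lower candidates.
    return random.randrange(lower, upper)
```

With the default of '0..0', pick_port returns 0 in this sketch, which mirrors the documented behavior that a zero-sized range disables enforcement.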
unauthorized_users#

List of user names against which KERNEL_USERNAME will be compared. Any match (case-sensitive) will prevent the kernel’s launch and result in an HTTP 403 (Forbidden) error. To specify multiple names via the CLI, separate options must be provided for each entry. (GP_UNAUTHORIZED_USERS env var - non-bracketed, just comma-separated)

unauthorized_users_default_value = 'root'#
unauthorized_users_env = 'GP_UNAUTHORIZED_USERS'#
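
The interaction between authorized_users and unauthorized_users can be sketched as follows. This is an illustration of the documented rules only: check_user and parse_user_env are hypothetical helpers, the 200 return value is a placeholder for "launch allowed", and treating an empty authorized list as "no restriction" and stripping whitespace around names are both assumptions.

```python
def check_user(username, authorized=(), unauthorized=("root",)):
    """Return an HTTP-style status for a launch request (hypothetical helper)."""
    # The unauthorized set takes precedence over the authorized set.
    if username in unauthorized:
        return 403
    # Assumption: an empty authorized list places no restriction on users.
    if authorized and username not in authorized:
        return 403
    return 200


def parse_user_env(value):
    """Parse the non-bracketed, comma-separated form, e.g. 'alice,bob'."""
    # Assumption: surrounding whitespace and empty entries are ignored.
    return {name.strip() for name in value.split(",") if name.strip()}
```

Because matching is case-sensitive, check_user("Bob", authorized={"bob"}) is rejected in this sketch.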

Code related to managing kernels running in containers.

class gateway_provisioners.container.ContainerProvisionerBase(**kwargs: Any)#

Bases: RemoteProvisionerBase

Kernel provisioner for container-based kernels.

async confirm_remote_startup()#

Confirms the container has started and returned necessary connection information.

executor_image_name#

The image name to use as the Spark executor image when launching container-based kernels within Spark environments. (GP_EXECUTOR_IMAGE_NAME env var)

executor_image_name_env = 'GP_EXECUTOR_IMAGE_NAME'#
abstract get_container_status(iteration)#

Return current container state.

Return type:

str

abstract get_error_states()#

Returns the list of error states (in lowercase).

Return type:

set[str]

abstract get_initial_states()#

Return list of states (in lowercase) indicating container is starting (includes running).

Return type:

set[str]

async get_provisioner_info()#

Captures the base information necessary for kernel persistence relative to containers.

Return type:

dict[str, Any]

property has_process: bool#

Returns true if this provisioner is currently managing a process.

This property is asserted to be True immediately following a call to the provisioner’s launch_kernel() method.

image_name#

The image name to use when launching container-based kernels. (GP_IMAGE_NAME env var)

image_name_env = 'GP_IMAGE_NAME'#
async kill(restart=False)#

Kills a containerized kernel.

Return type:

None

async load_provisioner_info(provisioner_info)#

Loads the base information necessary for kernel persistence relative to containers.

Return type:

None

log_kernel_launch(cmd)#

Logs the kernel launch from the respective remote provisioner

Return type:

None

async poll()#

Determines if container is still active.

A kernel newly submitted to the container manager takes a while to reach the Running state. Thus, the kernel ID will probably not be available immediately for poll. The container is therefore regarded as active when no status is available or when it is in one of the initial phases. Returns None if the container cannot be found or it is in an initial state. Otherwise, returns an exit code of 0.

Return type:

int | None
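
The return convention described for poll() can be sketched as a pure function. This is a simplified illustration rather than the provisioner's actual code: poll_result is a hypothetical name, and the status and initial-state values are supplied by the caller instead of a container manager.

```python
def poll_result(status, initial_states):
    """Map a container status onto poll()'s return convention (hypothetical)."""
    # No status yet: the container manager may not know about the kernel.
    if not status:
        return None
    # Initial states (which include running) also count as active.
    if status.lower() in initial_states:
        return None
    # Any other state is reported as an exit code of 0.
    return 0
```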

async pre_launch(**kwargs)#

Perform any steps in preparation for kernel process launch.

This includes applying additional substitutions to the kernel launch command and environment. It also includes preparation of launch parameters.

NOTE: Subclass implementations are advised to call this method as it applies environment variable substitutions from the local environment and calls the provisioner’s _finalize_env() method to allow each provisioner the ability to cleanup the environment variables that will be used by the kernel.

This method is called from KernelManager.pre_start_kernel() as part of its start kernel sequence.

Returns the (potentially updated) keyword arguments that are passed to launch_kernel().

Return type:

dict[str, Any]

async send_signal(signum)#

Send signal signum to container.

Return type:

None

async shutdown_listener(restart)#

Sends a shutdown request to the kernel launcher listener.

Return type:

None

async terminate(restart=False)#

Terminates a containerized kernel.

This method defers to kill() since there’s no distinction between the two in these environments.

Return type:

None

abstract terminate_container_resources(restart=False)#

Terminate any artifacts created on behalf of the container’s lifetime.

Return type:

bool | None

Code related to managing kernels that run based on a Kubernetes custom resource.

class gateway_provisioners.crd.CustomResourceProvisioner(**kwargs: Any)#

Bases: KubernetesProvisioner

A custom resource provisioner.

delete_managed_object(termination_stati)#

Deletes the object managed by this provisioner

A return value of True indicates the object is considered deleted, otherwise a False or None value is returned.

Note: the caller is responsible for handling exceptions.

Return type:

bool

get_container_status(iteration)#

Determines submitted CRD application status

A newly submitted kernel application CRD takes a while to reach the running state, and the submission can also fail due to malformation or other issues that prevent the application pod from reaching the desired running state.

This function checks the CRD submission state and in case of success it then delegates to parent to check if the application pod is running.

Return type:

str

Returns:

  • An empty string if the container cannot be found.

  • The application pod's status if the submission succeeded on the Spark Operator side.

  • The retrieved Spark Operator submission status in all other cases (e.g. Failed).

get_initial_states()#

Return list of states in lowercase indicating container is starting (includes running).

Return type:

Set[str]

object_kind = 'CustomResourceDefinition'#
async pre_launch(**kwargs)#

Launch the process for a kernel.

Return type:

Dict[str, Any]

Code related to managing kernels distributed across a set of hosts via ssh.

class gateway_provisioners.distributed.DistributedProvisioner(**kwargs: Any)#

Bases: RemoteProvisionerBase

Kernel lifecycle management for clusters via ssh and a set of hosts.

async cleanup(restart=False)#

Cleanup any resources allocated on behalf of the kernel provisioner.

This method is called from KernelManager.cleanup_resources() as part of its shutdown kernel sequence.

restart is True if this operation will precede a subsequent launch_kernel request.

Return type:

None

async confirm_remote_startup()#

Confirms the remote process has started and returned necessary connection information.

property has_process: bool#

Returns true if this provisioner is currently managing a process.

This property is asserted to be True immediately following a call to the provisioner’s launch_kernel() method.

host_index = 0#
kernel_on_host = <gateway_provisioners.distributed.TrackKernelOnHost object>#
async kill(restart=False)#

Kill the kernel process.

This is typically accomplished via a SIGKILL signal, which cannot be caught. This method is called from KernelManager.kill_kernel() when terminating a kernel immediately.

restart is True if this operation will precede a subsequent launch_kernel request.

Return type:

None

async launch_kernel(cmd, **kwargs)#

Launches a kernel process on a selected host.

NOTE: This overrides the superclass launch_kernel method entirely.

Return type:

Dict[str, Union[int, str, bytes]]

load_balancing_algorithm#

Specifies which load balancing algorithm DistributedProvisioner should use. Must be one of “round-robin” or “least-connection”. (GP_LOAD_BALANCING_ALGORITHM env var)

load_balancing_algorithm_default_value = 'round-robin'#
load_balancing_algorithm_env = 'GP_LOAD_BALANCING_ALGORITHM'#
log_kernel_launch(cmd)#

Logs the kernel launch from the respective remote provisioner

Return type:

None

async poll()#

Checks if kernel process is still running.

If running, None is returned, otherwise the process’s integer-valued exit code is returned. This method is called from KernelManager.is_alive().

Return type:

int | None

remote_hosts#

List of host names on which this kernel can be launched. Multiple entries must each be specified via separate options: --remote-hosts host1 --remote-hosts host2

remote_hosts_default_value = 'localhost'#
remote_hosts_env = 'GP_REMOTE_HOSTS'#
async terminate(restart=False)#

Terminates the kernel process.

This is typically accomplished via a SIGTERM signal, which can be caught, allowing the kernel provisioner to perform possible cleanup of resources. This method is called indirectly from KernelManager.finish_shutdown() during a kernel’s graceful termination.

restart is True if this operation will precede a subsequent launch_kernel request.

Return type:

None

class gateway_provisioners.distributed.TrackKernelOnHost#

Bases: object

Class used to track the number of active kernels on the set of hosts so that the least-utilized host can be used for the next distributed request.

add_kernel_id(host, kernel_id)#
Return type:

None

decrement(host)#
Return type:

None

delete_kernel_id(kernel_id)#
Return type:

None

increment(host)#
Return type:

None

init_host_kernels(hosts)#
Return type:

None

min_or_remote_host(remote_host=None)#
Return type:

str
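
The two load-balancing strategies named above ("round-robin" and "least-connection") can be sketched with a small per-host counter. This is a hypothetical stand-in, not the TrackKernelOnHost implementation: HostKernelCounter, least_utilized and round_robin are illustrative names, and tie-breaking by insertion order is an assumption.

```python
class HostKernelCounter:
    """Hypothetical stand-in that counts active kernels per host."""

    def __init__(self, hosts):
        self._counts = {host: 0 for host in hosts}
        self._kernel_host = {}

    def add_kernel(self, host, kernel_id):
        # Remember which host runs the kernel and bump that host's count.
        self._kernel_host[kernel_id] = host
        self._counts[host] += 1

    def delete_kernel(self, kernel_id):
        # Unknown kernel ids are ignored.
        host = self._kernel_host.pop(kernel_id, None)
        if host is not None:
            self._counts[host] -= 1

    def least_utilized(self):
        # min() returns the first host having the lowest count.
        return min(self._counts, key=self._counts.get)


def round_robin(hosts, index):
    """Return the host at index and the index to use for the next request."""
    return hosts[index % len(hosts)], (index + 1) % len(hosts)
```

A least-connection request would call least_utilized() before add_kernel(), while a round-robin request only needs the running index.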

Code related to managing kernels running in docker-based containers.

class gateway_provisioners.docker_swarm.DockerProvisioner(**kwargs: Any)#

Bases: ContainerProvisionerBase

Kernel provisioner for kernels in Docker (non-Swarm).

get_container_status(iteration)#

Return current container state.

Return type:

str

get_error_states()#

Returns the list of error states (in lowercase).

Return type:

set[str]

get_initial_states()#

Return list of states (in lowercase) indicating container is starting (includes running).

Return type:

set[str]

async pre_launch(**kwargs)#

Perform any steps in preparation for kernel process launch.

This includes applying additional substitutions to the kernel launch command and environment. It also includes preparation of launch parameters.

NOTE: Subclass implementations are advised to call this method as it applies environment variable substitutions from the local environment and calls the provisioner’s _finalize_env() method to allow each provisioner the ability to cleanup the environment variables that will be used by the kernel.

This method is called from KernelManager.pre_start_kernel() as part of its start kernel sequence.

Returns the (potentially updated) keyword arguments that are passed to launch_kernel().

Return type:

dict[str, Any]

terminate_container_resources(restart=False)#

Terminate any artifacts created on behalf of the container’s lifetime.

Return type:

bool

class gateway_provisioners.docker_swarm.DockerSwarmProvisioner(**kwargs: Any)#

Bases: ContainerProvisionerBase

Kernel provisioner for kernels in Docker Swarm.

get_container_status(iteration)#

Return current container state.

Return type:

str

get_error_states()#

Returns the list of error states (in lowercase).

Return type:

set[str]

get_initial_states()#

Return list of states (in lowercase) indicating container is starting (includes running).

Return type:

set[str]

async pre_launch(**kwargs)#

Perform any steps in preparation for kernel process launch.

This includes applying additional substitutions to the kernel launch command and environment. It also includes preparation of launch parameters.

NOTE: Subclass implementations are advised to call this method as it applies environment variable substitutions from the local environment and calls the provisioner’s _finalize_env() method to allow each provisioner the ability to cleanup the environment variables that will be used by the kernel.

This method is called from KernelManager.pre_start_kernel() as part of its start kernel sequence.

Returns the (potentially updated) keyword arguments that are passed to launch_kernel().

Return type:

dict[str, Any]

terminate_container_resources(restart=False)#

Terminate any artifacts created on behalf of the container’s lifetime.

Return type:

bool | None

Code related to managing kernels running in Kubernetes clusters.

class gateway_provisioners.k8s.KubernetesProvisioner(**kwargs: Any)#

Bases: ContainerProvisionerBase

Kernel lifecycle management for Kubernetes kernels.

delete_managed_object(termination_stati)#

Deletes the object managed by this provisioner

A return value of True indicates the object is considered deleted, otherwise a False or None value is returned.

Note: the caller is responsible for handling exceptions.

Return type:

bool

get_container_status(iteration)#

Return current container state.

Return type:

str

get_error_states()#

Returns the list of error states (in lowercase).

Return type:

Set[str]

get_initial_states()#

Return list of states (in lowercase) indicating container is starting (includes running).

Return type:

Set[str]

async get_provisioner_info()#

Captures the base information necessary for kernel persistence relative to containers.

Return type:

Dict[str, Any]

async load_provisioner_info(provisioner_info)#

Loads the base information necessary for kernel persistence relative to containers.

Return type:

None

object_kind = 'Pod'#
async pre_launch(**kwargs)#

Perform any steps in preparation for kernel process launch.

This includes applying additional substitutions to the kernel launch command and environment. It also includes preparation of launch parameters.

NOTE: Subclass implementations are advised to call this method as it applies environment variable substitutions from the local environment and calls the provisioner’s _finalize_env() method to allow each provisioner the ability to cleanup the environment variables that will be used by the kernel.

This method is called from KernelManager.pre_start_kernel() as part of its start kernel sequence.

Returns the (potentially updated) keyword arguments that are passed to launch_kernel().

Return type:

Dict[str, Any]

terminate_container_resources(restart=False)#

Terminate any artifacts created on behalf of the container’s lifetime.

Return type:

Optional[bool]

Kernel managers that operate against a remote process.

class gateway_provisioners.remote_provisioner.KernelChannel(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)#

Bases: Enum

Enumeration used to better manage tunneling

COMMUNICATION = 'GP_COMM'#
CONTROL = 'CONTROL'#
HEARTBEAT = 'HB'#
IOPUB = 'IOPUB'#
SHELL = 'SHELL'#
STDIN = 'STDIN'#
class gateway_provisioners.remote_provisioner.RemoteProvisionerBase(**kwargs: Any)#

Bases: RemoteProvisionerConfigMixin, KernelProvisionerBase

Base class for remote provisioners.

async cleanup(restart=False)#

Cleanup any resources allocated on behalf of the kernel provisioner.

This method is called from KernelManager.cleanup_resources() as part of its shutdown kernel sequence.

restart is True if this operation will precede a subsequent launch_kernel request.

Return type:

None

abstract async confirm_remote_startup()#

Confirms the remote process has started and returned necessary connection information.

detect_launch_failure()#

Helper method called from implementations of confirm_remote_startup() that checks if self.local_proc (a popen instance) has terminated prior to the confirmation of startup. This prevents users from having to wait for the kernel timeout duration to know if the launch fails. It also helps distinguish local invocation issues from remote post-launch issues since the failure will be relatively immediate.

Note that this method only applies to those process proxy implementations that launch from the local node. Proxies like DistributedProcessProxy use rsh against a remote node, so there is no local_proc in play to interrogate.

static get_current_time()#

Return the current time (in milliseconds) from epoch.

This method is intended for use in determining timeout values.

Return type:

int

async get_provisioner_info()#

Captures the base information necessary for persistence relative to this instance.

This enables applications that subclass KernelManager to persist a kernel provisioner’s relevant information to accomplish functionality like disaster recovery or high availability by calling this method via the kernel manager’s provisioner attribute.

NOTE: The superclass method must always be called first to ensure proper serialization.

Return type:

Dict[str, Any]

get_shutdown_wait_time(recommended=5.0)#

Returns the time allowed for a complete shutdown. This may vary by provisioner.

This method is called from KernelManager.finish_shutdown() during the graceful phase of its kernel shutdown sequence.

The recommended value will typically be what is configured in the kernel manager.

Return type:

float

static get_time_diff(start_time_ms)#

Return the difference (in seconds) between the given start_time and the current time

Return type:

float
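
The millisecond/second convention of get_current_time() and get_time_diff() can be sketched with the standard library. The function names below are hypothetical stand-ins, and timed_out is an illustrative helper showing how such values might feed a timeout check; none of this is taken from the package source.

```python
import time


def current_time_ms():
    """Current time in milliseconds since the epoch."""
    return int(time.time() * 1000)


def time_diff_seconds(start_time_ms):
    """Seconds elapsed between start_time_ms and now."""
    return (current_time_ms() - start_time_ms) / 1000.0


def timed_out(start_time_ms, timeout_seconds):
    """True once more than timeout_seconds have elapsed (hypothetical helper)."""
    return time_diff_seconds(start_time_ms) > timeout_seconds
```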

async handle_launch_timeout()#

Checks to see if the kernel launch timeout has been exceeded while awaiting connection info.

abstract property has_process: bool#

Returns true if this provisioner is currently managing a process.

This property is asserted to be True immediately following a call to the provisioner’s launch_kernel() method.

static ip_is_local(ip)#

Returns True if ip is considered local to this server, False otherwise.

abstract async kill(restart=False)#

Kill the kernel process.

This is typically accomplished via a SIGKILL signal, which cannot be caught. This method is called from KernelManager.kill_kernel() when terminating a kernel immediately.

restart is True if this operation will precede a subsequent launch_kernel request.

Return type:

None

async launch_kernel(cmd, **kwargs)#

Launch the kernel process and return its connection information.

This method is called from KernelManager.launch_kernel() during the kernel manager’s start kernel sequence.

Return type:

Dict[str, Union[int, str, bytes]]

async load_provisioner_info(provisioner_info)#

Loads the base information necessary for persistence relative to this instance.

The inverse of get_provisioner_info(), this enables applications that subclass KernelManager to re-establish communication with a provisioner that is managing a (presumably) remote kernel from an entirely different process than the original provisioner.

NOTE: The superclass method must always be called first to ensure proper deserialization.

Return type:

None
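
The "superclass first" rule for get_provisioner_info() and load_provisioner_info() can be sketched with stand-in classes. BaseProvisioner and HostProvisioner are hypothetical and do not depend on the package; the kernel_id and assigned_host fields are illustrative assumptions about what might be persisted.

```python
import asyncio


class BaseProvisioner:
    """Hypothetical stand-in for a provisioner base class."""

    def __init__(self):
        self.kernel_id = ""

    async def get_provisioner_info(self):
        return {"kernel_id": self.kernel_id}

    async def load_provisioner_info(self, provisioner_info):
        self.kernel_id = provisioner_info["kernel_id"]


class HostProvisioner(BaseProvisioner):
    """Hypothetical subclass that persists one extra field."""

    def __init__(self):
        super().__init__()
        self.assigned_host = ""

    async def get_provisioner_info(self):
        # Call the superclass first, then add subclass-specific fields.
        info = await super().get_provisioner_info()
        info["assigned_host"] = self.assigned_host
        return info

    async def load_provisioner_info(self, provisioner_info):
        # Same ordering on the way back in.
        await super().load_provisioner_info(provisioner_info)
        self.assigned_host = provisioner_info["assigned_host"]


async def round_trip():
    original = HostProvisioner()
    original.kernel_id, original.assigned_host = "abc-123", "host1"
    saved = await original.get_provisioner_info()
    restored = HostProvisioner()  # e.g. created in a different process
    await restored.load_provisioner_info(saved)
    return restored
```

Running asyncio.run(round_trip()) yields a second instance carrying the same kernel_id and assigned_host as the first.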

log_and_raise(ex, chained=None)#

Helper method that logs the string-ized exception ‘ex’ and raises that exception.

If a chained exception is provided that exception will be in the raised exception’s from clause.

Return type:

None

Parameters:

  • ex (Exception) – The exception to log and raise

  • chained (Exception, optional) – The exception to use in the ‘from’ clause.
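
The log-then-raise behavior, including the optional ‘from’ clause, can be sketched as a free function. This is an illustration under the description above, not the method's source; the logger name is an assumption.

```python
import logging
from typing import Optional

log = logging.getLogger("example")  # assumption: any logger would do here


def log_and_raise(ex: Exception, chained: Optional[Exception] = None) -> None:
    """Log the string form of ex, then raise it (optionally chained)."""
    log.error(str(ex))
    if chained is not None:
        # The chained exception becomes ex.__cause__.
        raise ex from chained
    raise ex
```

A caller that catches the raised exception can inspect its __cause__ attribute to recover the chained exception.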

abstract log_kernel_launch(cmd)#

Logs the kernel launch from the respective remote provisioner

Return type:

None

abstract async poll()#

Checks if kernel process is still running.

If running, None is returned, otherwise the process’s integer-valued exit code is returned. This method is called from KernelManager.is_alive().

Return type:

Optional[int]

async post_launch(**kwargs)#

Perform any steps following the kernel process launch.

This method is called from KernelManager.post_start_kernel() as part of its start kernel sequence.

Return type:

None

async pre_launch(**kwargs)#

Perform any steps in preparation for kernel process launch.

This includes applying additional substitutions to the kernel launch command and environment. It also includes preparation of launch parameters.

NOTE: Subclass implementations are advised to call this method as it applies environment variable substitutions from the local environment and calls the provisioner’s _finalize_env() method to allow each provisioner the ability to cleanup the environment variables that will be used by the kernel.

This method is called from KernelManager.pre_start_kernel() as part of its start kernel sequence.

Returns the (potentially updated) keyword arguments that are passed to launch_kernel().

Return type:

Dict[str, Any]

async receive_connection_info()#

Monitors the response address for connection info sent by the remote kernel launcher.

Return type:

bool

async send_signal(signum)#

Sends signal identified by signum to the kernel process.

This method is called from KernelManager.signal_kernel() to send the kernel process a signal.

Return type:

None

async shutdown_listener(restart)#

Sends a shutdown request to the kernel launcher listener.

Return type:

None

async shutdown_requested(restart=False)#

Allows the provisioner to determine if the kernel’s shutdown has been requested.

This method is called from KernelManager.request_shutdown() as part of its shutdown sequence.

This method is optional and is primarily used in scenarios where the provisioner may need to perform other operations in preparation for a kernel’s shutdown.

Return type:

None

abstract async terminate(restart=False)#

Terminates the kernel process.

This is typically accomplished via a SIGTERM signal, which can be caught, allowing the kernel provisioner to perform possible cleanup of resources. This method is called indirectly from KernelManager.finish_shutdown() during a kernel’s graceful termination.

restart is True if this operation will precede a subsequent launch_kernel request.

Return type:

None

async wait()#

Waits for kernel process to terminate.

This method is called from KernelManager.finish_shutdown() and KernelManager.kill_kernel() when terminating a kernel gracefully or immediately, respectively.

Return type:

Optional[int]

gateway_provisioners.remote_provisioner.gp_launch_kernel(cmd, **kwargs)#

Response manager used by remote provisioners.

class gateway_provisioners.response_manager.Response#

Bases: Event

Combines the event behavior with the kernel launch response.

property response#
class gateway_provisioners.response_manager.ResponseManager(**kwargs)#

Bases: SingletonConfigurable

Singleton that manages the responses from each kernel launcher at startup.

This singleton does the following:

  1. Acquires a public and private RSA key pair at first use to encrypt and decrypt the received responses. The public key is sent to the launcher during startup and is used by the launcher to encrypt the AES key the launcher uses to encrypt the connection information, while the private key remains in the server and is used to decrypt the AES key from the response, which it then uses to decrypt the connection information.

  2. Creates a single socket based on the configuration settings that is listened on via a periodic callback.

  3. On receipt, it decrypts the response (key then connection info) and posts the response payload to a map identified by the kernel_id embedded in the response.

  4. Provides a wait mechanism for callers to poll to get their connection info based on their registration (of kernel_id).

KEY_SIZE = 1024#
async get_connection_info(kernel_id)#

Performs a timeout wait on the event, returning the connection information on completion.

Return type:

dict

property public_key: str#

Provides the string-form of public key PEM with header/footer/newlines stripped.

register_event(kernel_id)#

Register kernel_id so its connection information can be processed.

Return type:

None
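
The register-then-wait pattern behind register_event() and get_connection_info() can be sketched with asyncio. This is a simplified illustration that leaves out sockets and encryption entirely: PendingResponse, ResponseRegistry and post are hypothetical names, and the use of asyncio.Event with a one-second default timeout is an assumption.

```python
import asyncio


class PendingResponse(asyncio.Event):
    """Hypothetical event that also carries a response payload."""

    def __init__(self):
        super().__init__()
        self._response = None

    @property
    def response(self):
        return self._response

    @response.setter
    def response(self, value):
        self._response = value
        self.set()  # wake up whoever is waiting on this event


class ResponseRegistry:
    """Hypothetical map of kernel_id to pending response."""

    def __init__(self):
        self._events = {}

    def register_event(self, kernel_id):
        self._events[kernel_id] = PendingResponse()

    def post(self, kernel_id, payload):
        # Stands in for the listener posting a decrypted payload.
        self._events[kernel_id].response = payload

    async def get_connection_info(self, kernel_id, timeout=1.0):
        event = self._events[kernel_id]
        # Raises asyncio.TimeoutError if nothing arrives in time.
        await asyncio.wait_for(event.wait(), timeout)
        del self._events[kernel_id]
        return event.response


async def demo():
    registry = ResponseRegistry()
    registry.register_event("k1")
    loop = asyncio.get_running_loop()
    loop.call_later(0.01, registry.post, "k1", {"shell_port": 5000})
    return await registry.get_connection_info("k1")
```

In this sketch the kernel_id must be registered before its response is posted, which matches the registration-based wait described above.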

property response_address: str#
stop_response_manager()#

Stops the connection processor.

Return type:

None

A spark operator provisioner.

class gateway_provisioners.spark_operator.SparkOperatorProvisioner(**kwargs: Any)#

Bases: CustomResourceProvisioner

Spark operator provisioner.

object_kind = 'SparkApplication'#

Code related to managing kernels running in YARN clusters.

class gateway_provisioners.yarn.YarnProvisioner(**kwargs: Any)#

Bases: RemoteProvisionerBase

Kernel lifecycle management for YARN clusters.

alt_yarn_endpoint#

The http url specifying the alternate YARN Resource Manager. This value should be set when YARN Resource Managers are configured for high availability. Note: If both YARN endpoints are NOT set, the YARN library will use the files within the local HADOOP_CONFIG_DIR to determine the active resource manager. (GP_ALT_YARN_ENDPOINT env var)

alt_yarn_endpoint_env = 'GP_ALT_YARN_ENDPOINT'#
async cleanup(restart=False)#

Cleanup any resources allocated on behalf of the kernel provisioner.

This method is called from KernelManager.cleanup_resources() as part of its shutdown kernel sequence.

restart is True if this operation will precede a subsequent launch_kernel request.

Return type:

None

async confirm_remote_startup()#

Confirms the remote process has started and returned necessary connection information.

Return type:

None

final_states = {'FAILED', 'FINISHED', 'KILLED'}#
async get_provisioner_info()#

Captures the base information necessary for persistence relative to this instance.

This enables applications that subclass KernelManager to persist a kernel provisioner’s relevant information to accomplish functionality like disaster recovery or high availability by calling this method via the kernel manager’s provisioner attribute.

NOTE: The superclass method must always be called first to ensure proper serialization.

Return type:

dict[str, Any]

get_shutdown_wait_time(recommended=5.0)#

Returns the time allowed for a complete shutdown. This may vary by provisioner.

This method is called from KernelManager.finish_shutdown() during the graceful phase of its kernel shutdown sequence.

The recommended value will typically be what is configured in the kernel manager.

Return type:

float

async handle_launch_timeout()#

Checks to see if the kernel launch timeout has been exceeded while awaiting connection info.

Note: This is a complete override of the superclass method.

Return type:

None

property has_process: bool#

Returns true if this provisioner is currently managing a process.

This property is asserted to be True immediately following a call to the provisioner’s launch_kernel() method.

impersonation_enabled#

Indicates whether impersonation will be performed during kernel launch. (GP_IMPERSONATION_ENABLED env var)

impersonation_enabled_env = 'GP_IMPERSONATION_ENABLED'#
initial_states = {'ACCEPTED', 'NEW', 'RUNNING', 'SUBMITTED'}#
async kill(restart=False)#

Kill the kernel process.

This is typically accomplished via a SIGKILL signal, which cannot be caught. This method is called from KernelManager.kill_kernel() when terminating a kernel immediately.

restart is True if this operation will precede a subsequent launch_kernel request.

Return type:

None

async load_provisioner_info(provisioner_info)#

Loads the base information necessary for persistence relative to this instance.

The inverse of get_provisioner_info(), this enables applications that subclass KernelManager to re-establish communication with a provisioner that is managing a (presumably) remote kernel from an entirely different process than the original provisioner.

NOTE: The superclass method must always be called first to ensure proper deserialization.

Return type:

None

log_kernel_launch(cmd)#

Logs the kernel launch from the respective remote provisioner

Return type:

None

async poll()#

Checks if kernel process is still running.

If running, None is returned, otherwise the process’s integer-valued exit code is returned. This method is called from KernelManager.is_alive().

Return type:

int | None

async pre_launch(**kwargs)#

Perform any steps in preparation for kernel process launch.

This includes applying additional substitutions to the kernel launch command and environment. It also includes preparation of launch parameters.

NOTE: Subclass implementations are advised to call this method as it applies environment variable substitutions from the local environment and calls the provisioner’s _finalize_env() method to allow each provisioner the ability to cleanup the environment variables that will be used by the kernel.

This method is called from KernelManager.pre_start_kernel() as part of its start kernel sequence.

Returns the (potentially updated) keyword arguments that are passed to launch_kernel().

Return type:

dict[str, Any]

async send_signal(signum)#

Sends signal identified by signum to the kernel process.

This method is called from KernelManager.signal_kernel() to send the kernel process a signal.

Return type:

None

async terminate(restart=False)#

Terminates the kernel process.

This is typically accomplished via a SIGTERM signal, which can be caught, allowing the kernel provisioner to perform possible cleanup of resources. This method is called indirectly from KernelManager.finish_shutdown() during a kernel’s graceful termination.

restart is True if this operation will precede a subsequent launch_kernel request.

Return type:

None

yarn_endpoint#

The http url specifying the YARN Resource Manager. Note: If this value is NOT set, the YARN library will use the files within the local HADOOP_CONFIG_DIR to determine the active resource manager. (GP_YARN_ENDPOINT env var)

yarn_endpoint_env = 'GP_YARN_ENDPOINT'#
yarn_endpoint_security_enabled#

Indicates whether YARN Kerberos/SPNEGO security is enabled (True/False). (GP_YARN_ENDPOINT_SECURITY_ENABLED env var)

yarn_endpoint_security_enabled_default_value = False#
yarn_endpoint_security_enabled_env = 'GP_YARN_ENDPOINT_SECURITY_ENABLED'#

Module contents#

class gateway_provisioners.RemoteProvisionerBase(**kwargs: Any)#

Bases: RemoteProvisionerConfigMixin, KernelProvisionerBase

Base class for remote provisioners.

async cleanup(restart=False)#

Cleanup any resources allocated on behalf of the kernel provisioner.

This method is called from KernelManager.cleanup_resources() as part of its shutdown kernel sequence.

restart is True if this operation will precede a subsequent launch_kernel request.

Return type:

None

abstract async confirm_remote_startup()#

Confirms the remote process has started and returned necessary connection information.

detect_launch_failure()#

Helper method called from implementations of confirm_remote_startup() that checks if self.local_proc (a popen instance) has terminated prior to the confirmation of startup. This prevents users from having to wait for the kernel timeout duration to know if the launch fails. It also helps distinguish local invocation issues from remote post-launch issues since the failure will be relatively immediate.

Note that this method only applies to those process proxy implementations that launch from the local node. Proxies like DistributedProcessProxy use rsh against a remote node, so there is no local_proc in play to interrogate.

static get_current_time()#

Return the current time (in milliseconds) from epoch.

This method is intended for use in determining timeout values.

Return type:

int

async get_provisioner_info()#

Captures the base information necessary for persistence relative to this instance.

This enables applications that subclass KernelManager to persist a kernel provisioner’s relevant information to accomplish functionality like disaster recovery or high availability by calling this method via the kernel manager’s provisioner attribute.

NOTE: The superclass method must always be called first to ensure proper serialization.

Return type:

Dict[str, Any]

get_shutdown_wait_time(recommended=5.0)#

Returns the time allowed for a complete shutdown. This may vary by provisioner.

This method is called from KernelManager.finish_shutdown() during the graceful phase of its kernel shutdown sequence.

The recommended value will typically be what is configured in the kernel manager.

Return type:

float

static get_time_diff(start_time_ms)#

Return the difference (in seconds) between the given start_time_ms and the current time.

Return type:

float
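The two time helpers pair naturally: one captures a millisecond timestamp and the other converts the elapsed interval to seconds. The following is a sketch under the units documented above; the function bodies and the timed_out helper are assumptions written for illustration, not the library's code.

```python
import time


def get_current_time() -> int:
    """Current time in milliseconds since the epoch."""
    return int(time.time() * 1000)


def get_time_diff(start_time_ms: int) -> float:
    """Seconds elapsed since start_time_ms (a millisecond timestamp)."""
    return (get_current_time() - start_time_ms) / 1000.0


def timed_out(start_time_ms: int, timeout_s: float) -> bool:
    """Hypothetical helper: True once more than timeout_s seconds have elapsed."""
    return get_time_diff(start_time_ms) > timeout_s


start = get_current_time()
```

Keeping the timestamp in milliseconds and the difference in seconds means the result can be compared directly against a timeout expressed in seconds, such as launch_timeout.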

async handle_launch_timeout()#

Checks to see if the kernel launch timeout has been exceeded while awaiting connection info.

abstract property has_process: bool#

Returns true if this provisioner is currently managing a process.

This property is asserted to be True immediately following a call to the provisioner’s launch_kernel() method.

static ip_is_local(ip)#

Returns True if ip is considered local to this server, False otherwise.

abstract async kill(restart=False)#

Kill the kernel process.

This is typically accomplished via a SIGKILL signal, which cannot be caught. This method is called from KernelManager.kill_kernel() when terminating a kernel immediately.

restart is True if this operation will precede a subsequent launch_kernel request.

Return type:

None

async launch_kernel(cmd, **kwargs)#

Launch the kernel process and return its connection information.

This method is called from KernelManager.launch_kernel() during the kernel manager’s start kernel sequence.

Return type:

Dict[str, Union[int, str, bytes]]

async load_provisioner_info(provisioner_info)#

Loads the base information necessary for persistence relative to this instance.

The inverse of get_provisioner_info(), this enables applications that subclass KernelManager to re-establish communication with a provisioner that is managing a (presumably) remote kernel from an entirely different process than the original provisioner.

NOTE: The superclass method must always be called first to ensure proper deserialization.

Return type:

None
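The "superclass first" rule for get_provisioner_info() and load_provisioner_info() can be illustrated with a self-contained round trip. The classes and the kernel_id and assigned_host fields below are hypothetical stand-ins so the example runs without the package installed; only the calling order reflects the notes above.

```python
import asyncio
from typing import Any, Dict


class BaseProvisionerSketch:
    """Hypothetical stand-in for a provisioner base class."""

    def __init__(self) -> None:
        self.kernel_id = ""

    async def get_provisioner_info(self) -> Dict[str, Any]:
        return {"kernel_id": self.kernel_id}

    async def load_provisioner_info(self, provisioner_info: Dict[str, Any]) -> None:
        self.kernel_id = provisioner_info["kernel_id"]


class MyProvisionerSketch(BaseProvisionerSketch):
    """Hypothetical subclass that persists one extra field."""

    def __init__(self) -> None:
        super().__init__()
        self.assigned_host = ""

    async def get_provisioner_info(self) -> Dict[str, Any]:
        # Call the superclass first, then add subclass-specific state.
        info = await super().get_provisioner_info()
        info["assigned_host"] = self.assigned_host
        return info

    async def load_provisioner_info(self, provisioner_info: Dict[str, Any]) -> None:
        # Call the superclass first, then restore subclass-specific state.
        await super().load_provisioner_info(provisioner_info)
        self.assigned_host = provisioner_info["assigned_host"]


async def round_trip():
    original = MyProvisionerSketch()
    original.kernel_id = "abc-123"
    original.assigned_host = "node-7"
    saved = await original.get_provisioner_info()

    # A fresh instance, as if created in a different process.
    restored = MyProvisionerSketch()
    await restored.load_provisioner_info(saved)
    return saved, restored


saved, restored = asyncio.run(round_trip())
```

In this sketch the saved dictionary contains only simple values, so it could be written to any store that a recovering process can read.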

log_and_raise(ex, chained=None)#

Helper method that logs the string-ized exception ‘ex’ and raises that exception.

If a chained exception is provided, that exception will be used in the raised exception’s from clause.

Return type:

None

Parameters:

ex (Exception): The exception to log and raise.

chained (Exception, optional): The exception to use in the ‘from’ clause.
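The log-then-raise behavior, including the optional chained exception, might look like the sketch below. It is written as a free function with a module-level logger named provisioner_sketch, both of which are assumptions; the actual method is defined on the provisioner and uses its own logger.

```python
import logging
from typing import Optional

log = logging.getLogger("provisioner_sketch")  # hypothetical logger name


def log_and_raise(ex: Exception, chained: Optional[Exception] = None) -> None:
    """Log the string form of ex, then raise it (optionally chained)."""
    log.error(str(ex))
    if chained is not None:
        # 'raise ... from ...' records chained as ex.__cause__.
        raise ex from chained
    raise ex


try:
    try:
        int("not-a-port")
    except ValueError as cause:
        log_and_raise(RuntimeError("Invalid port value"), chained=cause)
except RuntimeError as err:
    caught = err
```

Chaining preserves the original error as the raised exception's cause, so a traceback shows both the low-level failure and the higher-level message.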

abstract log_kernel_launch(cmd)#

Logs the kernel launch from the respective remote provisioner.

Return type:

None

abstract async poll()#

Checks if the kernel process is still running.

If running, None is returned, otherwise the process’s integer-valued exit code is returned. This method is called from KernelManager.is_alive().

Return type:

Optional[int]
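The poll() contract (None while running, an integer exit code otherwise) mirrors subprocess.Popen.poll(). The sketch below shows how a caller could derive liveness from it; the free functions and the local sleeping process are assumptions for illustration, since a remote provisioner would query its resource manager rather than a local process.

```python
import asyncio
import subprocess
import sys
from typing import Optional


async def poll(proc: subprocess.Popen) -> Optional[int]:
    """Return None if the process is running, otherwise its exit code."""
    return proc.poll()


async def is_alive(proc: subprocess.Popen) -> bool:
    """A process is alive exactly when poll() reports no exit code."""
    return await poll(proc) is None


async def demo():
    # A long-sleeping child process stands in for a running kernel.
    proc = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])
    alive_while_running = await is_alive(proc)
    proc.kill()
    proc.wait()
    exit_status = await poll(proc)
    return alive_while_running, exit_status


alive_while_running, exit_status = asyncio.run(demo())
```

The exit code reported after a kill is platform dependent, so callers should test only whether the result is None.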

async post_launch(**kwargs)#

Perform any steps following the kernel process launch.

This method is called from KernelManager.post_start_kernel() as part of its start kernel sequence.

Return type:

None

async pre_launch(**kwargs)#

Perform any steps in preparation for kernel process launch.

This includes applying additional substitutions to the kernel launch command and environment. It also includes preparation of launch parameters.

NOTE: Subclass implementations are advised to call this method as it applies environment variable substitutions from the local environment and calls the provisioner’s _finalize_env() method to allow each provisioner the ability to clean up the environment variables that will be used by the kernel.

This method is called from KernelManager.pre_start_kernel() as part of its start kernel sequence.

Returns the (potentially updated) keyword arguments that are passed to launch_kernel().

Return type:

Dict[str, Any]
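A subclass override of pre_launch() would typically delegate to the superclass and then adjust the returned keyword arguments. The sketch below uses hypothetical stand-in classes and a made-up MY_SKETCH_FLAG variable so that it runs on its own; the base class here only copies the local environment and does not reproduce the real substitution or _finalize_env() logic.

```python
import asyncio
import os
from typing import Any, Dict


class BaseSketch:
    """Hypothetical stand-in for the base provisioner's pre_launch()."""

    async def pre_launch(self, **kwargs: Any) -> Dict[str, Any]:
        # Start from a copy of the local environment, then apply caller overrides.
        env = dict(os.environ)
        env.update(kwargs.get("env", {}))
        kwargs["env"] = env
        return kwargs


class CustomSketch(BaseSketch):
    """Hypothetical subclass that adds one variable after the base processing."""

    async def pre_launch(self, **kwargs: Any) -> Dict[str, Any]:
        # Delegate first so the base class can prepare the environment.
        kwargs = await super().pre_launch(**kwargs)
        kwargs["env"]["MY_SKETCH_FLAG"] = "1"
        return kwargs


launch_kwargs = asyncio.run(
    CustomSketch().pre_launch(env={"KERNEL_USERNAME": "alice"})
)
```

The returned dictionary is what would then be passed on to launch_kernel() as keyword arguments.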

async receive_connection_info()#

Monitors the response address for connection info sent by the remote kernel launcher.

Return type:

bool

async send_signal(signum)#

Sends signal identified by signum to the kernel process.

This method is called from KernelManager.signal_kernel() to send the kernel process a signal.

Return type:

None

async shutdown_listener(restart)#

Sends a shutdown request to the kernel launcher listener.

Return type:

None

async shutdown_requested(restart=False)#

Allows the provisioner to determine if the kernel’s shutdown has been requested.

This method is called from KernelManager.request_shutdown() as part of its shutdown sequence.

This method is optional and is primarily used in scenarios where the provisioner may need to perform other operations in preparation for a kernel’s shutdown.

Return type:

None

abstract async terminate(restart=False)#

Terminates the kernel process.

This is typically accomplished via a SIGTERM signal, which can be caught, allowing the kernel provisioner to perform possible cleanup of resources. This method is called indirectly from KernelManager.finish_shutdown() during a kernel’s graceful termination.

restart is True if this operation will precede a subsequent launch_kernel request.

Return type:

None

async wait()#

Waits for the kernel process to terminate.

This method is called from KernelManager.finish_shutdown() and KernelManager.kill_kernel() when terminating a kernel gracefully or immediately, respectively.

Return type:

Optional[int]