public interface TaskFactory<R>
Represents a stateless factory for tasks which are the basic execution units for the build system.

Task factories represent a task that is to be executed by the build runtime. They are a specification of how the task should be executed, and what the task itself actually does.

Task factories must implement the equals(Object) and hashCode() contract, as not doing so will result in incorrect incremental builds. If two task factories are equal, that means they will execute exactly the same computations given the same circumstances.

Task factories should not have any state, and all of the data contained in them should be immutable.
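
As a rough illustration of the equality and immutability requirements above, the following sketch shows a factory whose only data is a final field, with equals(Object) and hashCode() derived from that field. To keep it compilable on its own, SketchTask and SketchTaskFactory are simplified local stand-ins and not the real Task and TaskFactory interfaces (the real createTask takes an ExecutionContext); EchoTaskFactory and its message field are hypothetical names.

```java
import java.util.Objects;

// Simplified stand-ins for the real Task and TaskFactory interfaces.
// Assumption: context arguments of the real API are omitted here.
interface SketchTask<R> {
    R run();
}

interface SketchTaskFactory<R> {
    SketchTask<? extends R> createTask();
}

// Hypothetical factory: all configuration lives in one immutable field.
final class EchoTaskFactory implements SketchTaskFactory<String> {
    private final String message;

    EchoTaskFactory(String message) {
        this.message = Objects.requireNonNull(message, "message");
    }

    @Override
    public SketchTask<String> createTask() {
        // A fresh task object for every invocation; the factory keeps no state.
        final String msg = message;
        return () -> msg;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (obj == null || getClass() != obj.getClass()) {
            return false;
        }
        // Equal factories must describe exactly the same computation.
        return message.equals(((EchoTaskFactory) obj).message);
    }

    @Override
    public int hashCode() {
        return message.hashCode();
    }
}
```

Two EchoTaskFactory instances created with the same message compare as equal and share a hash code, which is the property an incremental build would rely on when deciding whether a task changed.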

Task factories can specify capabilities, which are hints that let the build runtime fine-tune the execution behaviour. They can be used to signal, for example, that a task completes quickly or is remote executable. Additional capabilities can be added to this interface in the future.

The task factories can also be used to start inner tasks. Inner tasks are handled differently by the build system, see TaskContext.startInnerTask(TaskFactory<R>, InnerTaskExecutionParameters) for more information.

Task factories can implement a strategy to choose a suitable build environment to execute on. This is mostly relevant when designing tasks for remote execution. See getExecutionEnvironmentSelector().

In order to avoid thrashing the system due to too high a level of concurrency, tasks can report the level of computation they will use to do their work. See getRequestedComputationTokenCount().

Task factories are strongly encouraged to implement the Externalizable interface for faster serialization.
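
The sketch below shows one way the Externalizable recommendation could be followed, using only the JDK. GreetingFactoryData and its name field are hypothetical; a real implementation would additionally implement TaskFactory. Note that Externalizable requires a public no-argument constructor, and that the field cannot be final because readExternal assigns it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.util.Objects;

// Hypothetical data holder; a real one would also implement TaskFactory<R>.
class GreetingFactoryData implements Externalizable {
    private static final long serialVersionUID = 1L;

    // Not final because readExternal must assign it; never modified afterwards.
    private String name;

    /** Public no-arg constructor, required for Externalizable deserialization. */
    public GreetingFactoryData() {
    }

    public GreetingFactoryData(String name) {
        this.name = Objects.requireNonNull(name, "name");
    }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeUTF(name);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        name = in.readUTF();
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (obj == null || getClass() != obj.getClass()) {
            return false;
        }
        return Objects.equals(name, ((GreetingFactoryData) obj).name);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(name);
    }

    // Serializes the object to memory and reads a copy back.
    static GreetingFactoryData roundTrip(GreetingFactoryData data) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(data);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (GreetingFactoryData) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The roundTrip helper exists only to demonstrate that a deserialized copy is equal to the original, which matters because equality decides whether a task is treated as unchanged.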

R: The return type of the task.
Fields
public static final String
CAPABILITY_CACHEABLE = "saker.task.remote.cacheable"
public static final String
CAPABILITY_INNER_TASKS_COMPUTATIONAL = "saker.task.inner.tasks.computational"
public static final String
CAPABILITY_REMOTE_DISPATCHABLE = "saker.task.remote.dispatchable"
public static final String
CAPABILITY_SHORT_TASK = "saker.task.short"
Methods
public Task<? extends R>
createTask(ExecutionContext executioncontext)
Creates a task instance.
public boolean
equals(Object obj)
Checks if this task factory equals the argument, in the sense that it will execute exactly the same computations given the same circumstances.
public default Set<String>
getCapabilities()
public default TaskExecutionEnvironmentSelector
getExecutionEnvironmentSelector()
public default TaskInvocationConfiguration
getInvocationConfiguration()
Gets the invocation configuration for this build task.
public default int
getRequestedComputationTokenCount()
public int
hashCode()
Returns a hash code value for the object.
public static final String CAPABILITY_CACHEABLE = "saker.task.remote.cacheable"
Capability string for specifying that the results of the task can be cached and retrieved.

Cacheable tasks allow the build system to retrieve the result of the execution from external sources, or publish their results to a database.

Cacheable tasks are used with build caches. Build caches are background daemon processes which provide access to results of previously run tasks. If a task reports itself as cacheable, the build system may try to retrieve its previously run result from a build cache configured for the current execution. After a cacheable task executes, the build system may publish the results of the task to the configured build cache, so the outputs will be available for future reuse.

This capability serves as a hint, and the build system may decide that it won't use the build cache to retrieve the results. This may be due to performance, configuration, build environment or other arbitrary reasons.

The build system will only retrieve the results for a task if the published task is applicable to the current build environment. This means that if any dependencies of the published task have changed in the current run, then the published result won't be reused.

Cacheable tasks are strongly recommended to comply with the following restrictions:

  • The task identifier for the task should have a stable hash code. This means that the task identifier should return the same hash code for the same objects between different executions of the Java process. This usually requires that the task identifier doesn't derive its hash code from the identity hash code, class hash code, or any other runtime-dependent value. With that in mind, enums cannot be used as task identifiers, because their hash code is not stable.
  • The task cannot wait for another task to finish; it can only retrieve the results of tasks that have already finished. This means that the task may only use the finished result retrieval methods of other tasks. This requirement is aligned with the computation token usage.
The above restrictions are not hard restrictions, meaning that in case of their violation, the build runtime will not throw an exception, but just ignore the task instance for possible build cache usage.
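
To make the stable hash code recommendation more concrete, here is a minimal sketch of an identifier whose hash code depends only on string contents. StableTaskId and BuildMode are hypothetical names, and the real task identifier type of the build system is not modelled. The sketch relies on String.hashCode() being fully specified by the JDK documentation, whereas Enum.hashCode() is identity based and can differ between JVM runs.

```java
import java.util.Objects;

// Hypothetical enum used as part of an identifier.
enum BuildMode {
    DEBUG, RELEASE
}

// Hypothetical task identifier with a hash code that is stable across JVM runs.
final class StableTaskId {
    private final String target;
    private final BuildMode mode;

    StableTaskId(String target, BuildMode mode) {
        this.target = Objects.requireNonNull(target, "target");
        this.mode = Objects.requireNonNull(mode, "mode");
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (obj == null || getClass() != obj.getClass()) {
            return false;
        }
        StableTaskId other = (StableTaskId) obj;
        return target.equals(other.target) && mode == other.mode;
    }

    @Override
    public int hashCode() {
        // mode.hashCode() would be identity based, so the enum constant's
        // name is hashed instead; String.hashCode() is the same in every JVM.
        return 31 * target.hashCode() + mode.name().hashCode();
    }
}
```

Hashing mode.name() instead of the enum constant itself is the design choice that keeps the value reproducible between executions.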

The above restrictions are required in order to provide an efficient and sane implementation for the build system, and may be lifted in the future, but task implementations should align their behaviour with these in place nonetheless.

As a general rule of thumb, only tasks that do more work than it takes to retrieve their results from a network cache should report this capability. That is, the time the task computation takes should outweigh the network communication time.

public static final String CAPABILITY_INNER_TASKS_COMPUTATIONAL = "saker.task.inner.tasks.computational"
Capability string for specifying a task that will start inner tasks with computation tokens.

If a task wishes to start inner tasks that report 1 or more computation tokens, then the enclosing task must report this capability. This is in order to ensure that the proper restrictions are placed in the build system for the enclosing and inner tasks as well. See getRequestedComputationTokenCount() for the nature of restrictions.

public static final String CAPABILITY_REMOTE_DISPATCHABLE = "saker.task.remote.dispatchable"
Capability string for specifying a task that can be executed remotely on build clusters.

Remote dispatchable tasks can be transferred to remote executor instances, therefore improving the number of concurrently executing tasks and ensuring horizontal scalability.

This capability is only used when the user configures the build execution to use at least one cluster instance.

When specifying this capability, the task will be a candidate for remote dispatching. The build runtime is not required to actually execute this task on a remote machine, but it will make efforts to properly distribute it based on current workloads.

When a task reports itself as remote dispatchable, a restriction is placed on it: it cannot wait for other tasks. This restriction is necessary, as the deadlock detection is only feasible on the main executor machine. (Note that this restriction is usually non-disruptive. As remote dispatchable tasks are generally used for heavily computational workloads, they usually report computation tokens to signal the amount of work done. In that case, they already can't wait for other tasks. This restriction may be lifted in the future, or may be only employed if the task is actually being run on a cluster.)
Tasks can retrieve finished results nonetheless.

Designing a task to be remote dispatchable can improve performance, as it will result in more utilization of overall resources available to the build system. Remote dispatchable tasks should be carefully implemented, and use the appropriate functions for avoiding performance traps. See the remote execution guide of the build system for best practices.

A good example of a remote executable task is C++ compilation, where source files can be transferred to clusters, compiled, and the result returned to the main executor. For a large set of files, the compilation tasks can be distributed to multiple machines, and the overall compilation can complete much faster than if only a single machine was used.

To choose an appropriate build environment for the task, getExecutionEnvironmentSelector() can be used.

public static final String CAPABILITY_SHORT_TASK = "saker.task.short"
Capability string for specifying a task that is considered to be short.

If a task reports itself as short, then it is considered to be fast to execute. This is in the sense that executing the task takes less time than creating a separate thread and running it concurrently. As a general rule of thumb, if the execution time of a task is comparable to the time that a thread takes to start, then it should be short.

Tasks which wait for no other tasks, have no dependencies, do no heavy computations, and do no I/O operations are good candidates for being short.

The following additional restrictions apply to short tasks:

  • They can only wait for tasks which are also short capable.
  • They cannot wait for tasks which are not yet started.
  • They cannot be remote dispatchable.
  • They cannot report computation tokens.

The build system can run short tasks without creating a separate thread for them. This means that starting a short task will not return control to the starter immediately, but will wait for the task to execute and then return control. This is an optimization that can reduce unnecessary load on the OS and the build system.
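
Two of the restrictions listed above can be checked from the declared configuration alone: a short task cannot be remote dispatchable and cannot report computation tokens. The following sketch is a hypothetical helper (ShortTaskRules is not part of the build system) that tests a capability set and a token count against those two rules. The capability strings are copied from the constants documented on this page; the restrictions about waiting for other tasks depend on runtime behaviour and are not covered.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of the build system API.
final class ShortTaskRules {
    // Values copied from the TaskFactory constants documented above.
    static final String CAPABILITY_REMOTE_DISPATCHABLE = "saker.task.remote.dispatchable";
    static final String CAPABILITY_SHORT_TASK = "saker.task.short";

    private ShortTaskRules() {
    }

    // Returns the violated short task restrictions; empty if there are none.
    static List<String> violations(Set<String> capabilities, int computationTokens) {
        List<String> result = new ArrayList<>();
        if (!capabilities.contains(CAPABILITY_SHORT_TASK)) {
            // The restrictions only apply to tasks that declare themselves short.
            return result;
        }
        if (capabilities.contains(CAPABILITY_REMOTE_DISPATCHABLE)) {
            result.add("short tasks cannot be remote dispatchable");
        }
        if (computationTokens > 0) {
            result.add("short tasks cannot report computation tokens");
        }
        return result;
    }
}
```

A check like this could be run in a unit test of a task implementation to catch contradictory configurations early.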

public abstract Task<? extends R> createTask(ExecutionContext executioncontext)
Creates a task instance.

Every task instance is used for only one invocation.

executioncontext: The execution context that is used to run the task.
The created task.
public abstract boolean equals(Object obj)
Checks if this task factory equals the argument, in the sense that it will execute exactly the same computations given the same circumstances.

The checks for equality should also take the execution environment selector into account.

Indicates whether some other object is "equal to" this one.

The equals method implements an equivalence relation on non-null object references:

  • It is reflexive: for any non-null reference value x, x.equals(x) should return true.
  • It is symmetric: for any non-null reference values x and y, x.equals(y) should return true if and only if y.equals(x) returns true.
  • It is transitive: for any non-null reference values x, y, and z, if x.equals(y) returns true and y.equals(z) returns true, then x.equals(z) should return true.
  • It is consistent: for any non-null reference values x and y, multiple invocations of x.equals(y) consistently return true or consistently return false, provided no information used in equals comparisons on the objects is modified.
  • For any non-null reference value x, x.equals(null) should return false.

The equals method for class Object implements the most discriminating possible equivalence relation on objects; that is, for any non-null reference values x and y, this method returns true if and only if x and y refer to the same object (x == y has the value true).

Note that it is generally necessary to override the hashCode method whenever this method is overridden, so as to maintain the general contract for the hashCode method, which states that equal objects must have equal hash codes.

true if this object is the same as the obj argument; false otherwise.
public default Set<String> getCapabilities()
Gets the capabilities of this task.

Unrecognized capabilities will be silently ignored by the build system.

An unmodifiable set of capability strings.
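
As a small sketch of what returning an unmodifiable set of capability strings can look like, the hypothetical holder class below builds the set once and wraps it with Collections.unmodifiableSet. CapabilitySets is an illustrative name; in a real task factory the static getCapabilities method shown here would be the overridden instance method, and the string values would come from the TaskFactory constants rather than being repeated.

```java
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical holder; a real factory would override getCapabilities() instead.
final class CapabilitySets {
    // Values copied from the TaskFactory constants documented above.
    static final String CAPABILITY_CACHEABLE = "saker.task.remote.cacheable";
    static final String CAPABILITY_REMOTE_DISPATCHABLE = "saker.task.remote.dispatchable";

    // Built once; the same immutable view can be returned on every call.
    private static final Set<String> CAPABILITIES;

    static {
        Set<String> set = new TreeSet<>();
        set.add(CAPABILITY_CACHEABLE);
        set.add(CAPABILITY_REMOTE_DISPATCHABLE);
        CAPABILITIES = Collections.unmodifiableSet(set);
    }

    private CapabilitySets() {
    }

    static Set<String> getCapabilities() {
        return CAPABILITIES;
    }
}
```

Callers that try to add or remove elements from the returned set get an UnsupportedOperationException, so the factory's configuration cannot be changed from the outside.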
public default TaskExecutionEnvironmentSelector getExecutionEnvironmentSelector()
Gets an environment selector to determine if the task can execute in a given build environment.

If two task factories are equal, then their returned environment selectors should be equal as well.

If an environment selector fails to find a suitable environment, then an exception instance of TaskEnvironmentSelectionFailedException will be thrown by the build system and the build execution will abort.

The default implementation returns a selector which enables the task to use any build environment.

The environment selector.
public default TaskInvocationConfiguration getInvocationConfiguration()
Gets the invocation configuration for this build task.

The invocation configuration defines how the task executor should run the build task. See the properties of TaskInvocationConfiguration to get familiar with the possible configurations.

Use TaskInvocationConfiguration.builder() to create a new instance.

The default implementation constructs a configuration based on the deprecated methods getCapabilities(), getExecutionEnvironmentSelector(), and getRequestedComputationTokenCount().

The task invocation configuration.
saker.build 0.8.12
public default int getRequestedComputationTokenCount()
Gets the computation token count consumed by this task during execution.

Computation tokens are used to prevent thrashing of the execution machine when too many concurrent operations are running. A computation token represents one unit of computational operation that uses one CPU thread at 100%. This method returns the average number of computation tokens the task uses during its execution. The task will start to run when the requested number of tokens is available for it.

If a task requests more than 0 computation tokens, then a restriction is placed on it: it can't wait for other tasks in the build system. This is in order to prevent involuntarily deadlocking the execution.

(Reasoning: Tasks will not start execution until they can allocate the required amount of computation tokens for themselves. If a task attempts to wait for a task which cannot start because it is unable to allocate enough computation tokens, the build execution will deadlock, although the tasks could probably finish if computation tokens didn't exist. Implementing active deadlock detection for this behaviour is not deemed to be feasible, so the above restriction is placed on tasks which require computation tokens.)

If your task really needs to wait for an input task, then we recommend waiting for it in a parent task and starting the actual computation in a sub-task with computation tokens. Dependencies on input tasks can be specified by using the finished retrieval methods of the task futures, which do not require waiting for the subject task.
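
The token mechanism described above can be pictured with a counting semaphore. The sketch below is only an illustrative model and makes no claim about how the build system actually schedules tasks; ComputationTokenGate is a hypothetical class. Work starts only once the requested number of tokens is free, and the tokens are returned when the work ends. It also hints at the deadlock reasoning: if the work inside run blocked on another task that needs more tokens than are left, neither could make progress.

```java
import java.util.concurrent.Semaphore;

// Illustrative model only: not the build system's actual scheduler.
final class ComputationTokenGate {
    private final Semaphore tokens;

    ComputationTokenGate(int totalTokens) {
        this.tokens = new Semaphore(totalTokens);
    }

    // Blocks until the requested tokens are free, runs the work, then returns them.
    void run(int requestedTokens, Runnable work) {
        tokens.acquireUninterruptibly(requestedTokens);
        try {
            work.run();
        } finally {
            tokens.release(requestedTokens);
        }
    }

    // Number of tokens that are currently not held by any running work.
    int availableTokens() {
        return tokens.availablePermits();
    }
}
```

With a gate of 4 tokens, work that requests 3 tokens leaves 1 token available while it runs, and all 4 are available again once it completes.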

The default implementation returns 0, meaning no computation tokens requested.

The number of computation tokens that the execution of the task requires: 1 or more, or 0 if none are requested.
public abstract int hashCode()
Overridden from: Object
Returns a hash code value for the object. This method is supported for the benefit of hash tables such as those provided by HashMap.

The general contract of hashCode is:

  • Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
  • If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
  • It is not required that if two objects are unequal according to the Object.equals(Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables.

As much as is reasonably practical, the hashCode method defined by class Object does return distinct integers for distinct objects. (This is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the Java™ programming language.)

a hash code value for this object.