The build system maintains an in-memory representation of the file system that tasks should use. Tasks should compute their outputs and store the resulting files in this in-memory file hierarchy. The purpose of this representation is to increase performance by reducing I/O load and enabling appropriate caching (other optimizations may apply as well).
During the lifetime of a file, it may have only one parent. In its initial state it is constructed without any parent
directory. It can be attached to a parent via SakerDirectory.add(
The contents of files might be stored anywhere: in memory, on disk, over the network, etc.
It is important that each file has a location where it resides in the build system. This location is specified by
getSakerPath().
This location can be mapped to a file system storage location where the file would actually reside if the in-memory
representation didn't exist. As the outputs of tasks are collected in the in-memory hierarchy, the files must be
persisted to their corresponding locations. This process is called synchronization.
During synchronization, the contents of the file will be persisted to the appropriate location in the file system. The target location is specified by the execution configuration and the path of the file. The synchronization process includes checking if the file currently residing at the location has the same contents as the in-memory file. If they match, then no I/O will be done. If the contents differ, then the contents of this file will be written out to the target location. Synchronizing multiple times while the contents of the disk are unchanged by external agents should be a no-op.
During synchronization, the ContentDescriptor of the file is used to determine if the file has changed.
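The comparison step described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the build system's implementation: `ToyDisk`, its maps, and the use of a plain string as the content descriptor are all hypothetical stand-ins invented for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model: a "disk" that remembers which content descriptor
// was last persisted at each path. Not part of the build system API.
final class ToyDisk {
    private final Map<String, String> descriptors = new HashMap<>();
    private final Map<String, byte[]> contents = new HashMap<>();
    int writeCount = 0;

    /**
     * Persists the data only if the stored descriptor differs.
     * Returns true if a write happened, false if the call was a no-op.
     */
    boolean synchronize(String path, String descriptor, byte[] data) {
        if (descriptor.equals(descriptors.get(path))) {
            return false; // descriptors match: no I/O is performed
        }
        contents.put(path, data.clone());
        descriptors.put(path, descriptor);
        writeCount++;
        return true;
    }

    public static void main(String[] args) {
        ToyDisk disk = new ToyDisk();
        byte[] data = { 1, 2, 3 };
        System.out.println(disk.synchronize("out/a.bin", "v1", data));
        System.out.println(disk.synchronize("out/a.bin", "v1", data));
        System.out.println(disk.writeCount);
    }
}
```

Repeating the call with an unchanged descriptor performs no write, which mirrors the requirement that repeated synchronization should be a no-op.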
This interface modifies the behaviour of the content methods specified by FileHandle. Calling these methods may implicitly synchronize the contents of the file, and will retrieve the contents in the most efficient way.
Implicit synchronization can be avoided by the caller by invoking the introduced content retrieval methods with the Impl suffix.
Implicit synchronization can also be avoided by the subclass by overriding the getEfficientOpeningMethods() method.
Implicit synchronization does not occur for files that aren't attached to a parent. That is, if getSakerPath()
returns a relative path at the time the content is retrieved, implicit synchronization is not employed.
The base directories for working with files are available from the execution and task contexts.
Instances of this interface can be checked if they represent a directory by using the (file instanceof SakerDirectory) expression on them.
When designing tasks for remote execution and accessing the contents of a file, it is strongly recommended to use the
utility methods in TaskExecutionUtilities instead of calling the content methods directly on this interface. Doing so can
result in increased performance, as the build runtime can employ caching and improve overall network performance.
It is recommended to make files RMI-transferable when executing remote tasks by overriding
getRemoteExecutionRMIWrapper(). Implementations should not directly declare RMI transfer properties for the
classes themselves. (I.e. do not RMI annotate the class, but only override getRemoteExecutionRMIWrapper().)
Clients must not directly implement this interface, but extend the SakerFileBase abstract class. Implementing this interface directly can and will result in runtime errors.
public static final int | OPENING_METHODS_ALL = 15 Flag for getEfficientOpeningMethods() for signalling that all opening methods are efficient. |
public static final int | Flag for getEfficientOpeningMethods() for signalling that no opening methods are efficient. |
public static final int | Flag for getEfficientOpeningMethods() for signalling that getting raw byte contents is efficient. |
public static final int | Flag for getEfficientOpeningMethods() for signalling that getting string contents is efficient. |
public static final int | Flag for getEfficientOpeningMethods() for signalling that opening streams is efficient. |
public static final int | Flag for getEfficientOpeningMethods() for signalling that writing to streams is efficient. |
public default ByteArrayRegion | getBytes() Gets the raw contents of the file as a byte array.
|
public default ByteArrayRegion | Gets the raw byte contents of this file without implicit synchronization. |
public default String | Gets the contents of the file as a String.
|
public ContentDescriptor | Gets the content descriptor of this file. |
public default String | Gets the string contents of this file without implicit synchronization. |
public default int | Gets the efficient opening methods flag of this file. |
public SakerDirectory | Gets the parent of this file. |
public default Set< | Gets the posix file permissions that are associated with this file. |
public default Class< | Gets the RMIWrapper class to use when transferring instances during remote execution. |
public SakerPath | Gets the path of this file. |
public default ByteSource | Opens a ByteSource to the contents of the file.
|
public default ByteSource | Opens a byte source to the contents of this file without implicit synchronization. |
public default InputStream | Opens an InputStream to the contents of the file.
|
public default InputStream | Opens an input stream to the contents of this file without implicit synchronization. |
public void | remove() Removes this file from its parent. |
public void | Synchronizes the contents of this file with the file system. |
public void | synchronize( Synchronizes the contents of this file to the target file system location. |
public void | synchronizeImpl( Synchronizing implementation for persisting the contents of this file to the target file system location. |
public boolean | synchronizeImpl( Overloaded synchronizing method with additional output stream to write the contents to. |
public void | writeTo( Writes the contents to the parameter stream.
|
public default void | Writes the contents to the parameter stream.
|
public void | Writes the contents of this file without implicit synchronization. |
From: FileHandle |
Using this flag will result in no implicit synchronization.
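The opening method flags behave like a bit mask. The sketch below shows how such flags can be combined and tested. Only the combined value 15 appears in this documentation; the individual bit assignments, the zero value for "none", and all names in `OpeningMethodFlags` are assumptions made for illustration.

```java
// Hypothetical flag layout. Only the combined value 15 is documented;
// the individual bit positions below are assumed for this example.
final class OpeningMethodFlags {
    static final int WRITE_TO_STREAM = 1 << 0;
    static final int OPEN_INPUT_STREAM = 1 << 1;
    static final int GET_BYTES = 1 << 2;
    static final int GET_CONTENTS = 1 << 3;

    static final int NONE = 0; // assumed value for "no method is efficient"
    static final int ALL = WRITE_TO_STREAM | OPEN_INPUT_STREAM | GET_BYTES | GET_CONTENTS;

    /** Returns true if every bit of the given method is set in the flags. */
    static boolean isEfficient(int flags, int method) {
        return (flags & method) == method;
    }

    public static void main(String[] args) {
        int flags = GET_BYTES | GET_CONTENTS;
        System.out.println(isEfficient(flags, GET_BYTES));
        System.out.println(isEfficient(flags, OPEN_INPUT_STREAM));
        System.out.println(ALL);
    }
}
```

A subclass that keeps its contents in memory could report a mask like `GET_BYTES | GET_CONTENTS`, so that only stream-based access goes through implicit synchronization.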
This method implicitly synchronizes the contents of the file, unless getEfficientOpeningMethods() reports otherwise.
It is recommended to use TaskExecutionUtilities.getBytes(
See getBytes().
The default implementation converts the raw byte contents of the file to a string by decoding them as UTF-8 encoded data.
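The decoding step can be illustrated with plain JDK calls. This is a sketch only: `Utf8DecodeSketch` is a hypothetical helper, and it models a byte array region as an array with an offset and a length, which is an assumption about how the region is laid out.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: decodes a region of a byte array as UTF-8.
final class Utf8DecodeSketch {
    static String decode(byte[] array, int offset, int length) {
        // Malformed input is replaced with U+FFFD by this constructor.
        return new String(array, offset, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] raw = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        System.out.println(decode(raw, 0, raw.length));
    }
}
```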
This method implicitly synchronizes the contents of the file, unless getEfficientOpeningMethods() reports otherwise.
It is recommended to use TaskExecutionUtilities.getContent(
See ContentDescriptor. Content descriptors are used to determine if the file contents need to be persisted to the file system.
Subclasses should note that if they use posix file permissions then they should return a content descriptor that reflects this behaviour. They are recommended to use PosixFilePermissionsDelegateContentDescriptor to construct the actual content descriptor.
null.
See getContent().
An opening method is considered (performance-wise) efficient if it generally takes less resources (time and memory) to call the appropriate content method instead of trying to employ caching to the disk.
If an opening method is reported as efficient, then the implicit synchronization described in the documentation of the SakerFile interface will not take place for it.
If an opening method is not reported as efficient, then calling content retrieval methods which do not end with Impl will check if the file system already has the contents of this file persisted, and will read the contents from there if it has. If not, then the contents will be synchronized to the disk, and the contents will be retrieved in the most efficient manner. (This manner depends on the nature of the opening method.)
The default implementation returns OPENING_METHODS_NONE.
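The decision described above can be sketched as follows. This is a simplified model under stated assumptions, not the actual runtime logic: `ImplicitSyncSketch`, the string descriptors, the in-memory "disk" maps, and the `Supplier` standing in for the Impl method are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical model of content retrieval with implicit synchronization.
final class ImplicitSyncSketch {
    private final Map<String, String> diskDescriptors = new HashMap<>();
    private final Map<String, byte[]> diskContents = new HashMap<>();
    int implCalls = 0; // counts how often the Impl-style supplier is invoked

    byte[] getBytes(String path, String descriptor, boolean efficient, Supplier<byte[]> impl) {
        if (efficient) {
            // Efficient method: call the implementation directly, no synchronization.
            implCalls++;
            return impl.get();
        }
        if (!descriptor.equals(diskDescriptors.get(path))) {
            // Not persisted yet (or stale): synchronize to the disk first.
            implCalls++;
            diskContents.put(path, impl.get().clone());
            diskDescriptors.put(path, descriptor);
        }
        // Serve the contents from the persisted copy.
        return diskContents.get(path).clone();
    }

    boolean isPersisted(String path) {
        return diskContents.containsKey(path);
    }
}
```

In this model, a file whose method is not efficient invokes its implementation once and is then served from the persisted copy, while an efficient method never touches the disk.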
The parent of a file can change during the lifetime of an object.
This method returns the set of posix file permissions that were associated with the given file as part of the build execution.
Subclasses can override this method to explicitly set the permissions, however, if they do so then they also need to return a content descriptor that reflects this association. They are encouraged to use PosixFilePermissionsDelegateContentDescriptor to construct the actual content descriptor if posix file permissions are set for a file.
Subclasses that return non-null from this method are also required to set these permissions in the synchronization methods.
Note that this method will return null by default even when the build is running on file systems that support posix file permissions. The posix file permissions are not automatically queried by the build system; they are available only if a build execution previously set them.
null if none are associated.
When designing tasks for remote execution, it is important to keep in mind that the in-memory file hierarchy only exists on the coordinator machine. In order to handle files from a remote cluster endpoint, it is necessary to transfer files between computers. By default, files are transferred based on the called method.
In order to reduce network calls it is recommended to override this method to customize how the files will be transferred over the network.
This method is used during an RMI call if the object is specified to be transferred using RemoteExecutionSakerFileRMIWrapper.
null if this is not supported.
The path of this file is based on its parent path, appended with its name. If the file has no parent, then a relative path will be returned with a single path name which is the name of this file.
The path identifies the synchronization location of this file based on the current path configuration.
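The path rule can be illustrated with a small sketch. It is only a model: `PathSketch` uses plain slash-separated strings with a leading slash marking an absolute path, which is an assumption for this example and not how the build system's path type is defined.

```java
// Hypothetical model of "parent path plus name" resolution using strings.
final class PathSketch {
    /** Returns the name alone when there is no parent, else parent + "/" + name. */
    static String pathOf(String parentPath, String name) {
        if (parentPath == null) {
            return name; // detached file: a relative path with a single name
        }
        return parentPath.endsWith("/") ? parentPath + name : parentPath + "/" + name;
    }

    /** In this model, a path is relative when it does not start with a slash. */
    static boolean isRelative(String path) {
        return !path.startsWith("/");
    }

    public static void main(String[] args) {
        System.out.println(pathOf(null, "a.txt"));
        System.out.println(pathOf("/work/out", "a.txt"));
    }
}
```

A detached file therefore yields a relative path, which is the condition under which implicit synchronization is not employed.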
This method implicitly synchronizes the contents of the file, unless getEfficientOpeningMethods() reports otherwise.
It is recommended to use TaskExecutionUtilities.openByteSource(
See openByteSource().
If subclasses override this method, they must override openInputStreamImpl() as well. (Simply returning ByteSource.toInputStream(openByteSourceImpl()) is fine.)
This method implicitly synchronizes the contents of the file, unless getEfficientOpeningMethods() reports otherwise.
It is recommended to use TaskExecutionUtilities.openInputStream(
See openInputStream().
Calling this is a no-op if the file has no current parent.
The synchronization algorithm is described in the documentation of SakerFile interface.
The synchronization algorithm is described in the documentation of SakerFile interface.
File implementations generally should not override this method, but
synchronizeImpl(
null.
This method will not check if the contents of the disk have changed in relation to this file, but will always persist it to the given location.
Subclasses should implement this method to persist its contents to the location specified by the parameter.
If subclasses override this method, they must override synchronizeImpl(false
.)
Subclasses should set any posix file permissions that are associated with the file during synchronization.
null.
This method will not check if the contents of the disk have changed in relation to this file, but will always persist it to the given location.
This method exists for performance optimization. Subclasses should override this method and attempt to concurrently persist the contents of the file to the target location and write the contents to the additional stream.
Subclasses must not throw an IOException if writing to the additional stream fails, but must rethrow it as a SecondaryStreamException. If a SecondaryStreamException is thrown, the synchronization is considered successful, and only the writing to the secondary stream is considered to have failed. If an IOException is thrown, both stream writings are considered to have failed.
The implementations are not required to handle concurrent writing. This method should return true if it was able to concurrently synchronize and write the contents to the additional stream. The PriorityMultiplexOutputStream utility class can help implementations in implementing this functionality.
Overriding this method will improve the overall synchronization performance.
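The error-handling contract above can be sketched as follows. This is an illustration under stated assumptions: `DualWriteSketch` and its nested `SecondaryFailure` are hypothetical stand-ins (the real exception type and its class hierarchy are not modelled here), and the sketch writes sequentially rather than concurrently for simplicity.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

final class DualWriteSketch {
    /** Hypothetical stand-in for the secondary stream failure signal. */
    static final class SecondaryFailure extends Exception {
        SecondaryFailure(IOException cause) {
            super(cause);
        }
    }

    /** Writes to the target first; failures of the additional stream are wrapped. */
    static boolean persistAndCopy(byte[] data, OutputStream target, OutputStream additional)
            throws IOException, SecondaryFailure {
        target.write(data); // an IOException here means both writes failed
        try {
            additional.write(data);
        } catch (IOException e) {
            throw new SecondaryFailure(e); // target is persisted, only the copy failed
        }
        return true; // neither stream is closed by this method
    }

    /** Demonstrates how a caller can tell the outcomes apart. */
    static String run(boolean failAdditional) {
        ByteArrayOutputStream target = new ByteArrayOutputStream();
        OutputStream additional = failAdditional ? new OutputStream() {
            @Override
            public void write(int b) throws IOException {
                throw new IOException("secondary broken");
            }
        } : new ByteArrayOutputStream();
        try {
            persistAndCopy(new byte[] { 1, 2, 3 }, target, additional);
            return "both written, target bytes: " + target.size();
        } catch (SecondaryFailure e) {
            return "synchronized only, target bytes: " + target.size();
        } catch (IOException e) {
            return "both failed";
        }
    }
}
```

Keeping the two failure kinds distinct lets the caller treat the persisted file as valid even when the extra copy could not be produced.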
If subclasses override this method, they must override synchronizeImpl(
Subclasses should set any posix file permissions that are associated with the file during synchronization.
true if the contents of the file were successfully written to the additional stream.
null.
The method implementations mustn't close the argument output.
This method implicitly synchronizes the contents of the file, unless getEfficientOpeningMethods() reports otherwise.
It is recommended to use TaskExecutionUtilities.writeTo(
null.
The method implementations mustn't close the argument output.
This method implicitly synchronizes the contents of the file, unless getEfficientOpeningMethods() reports otherwise.
It is recommended to use TaskExecutionUtilities.writeTo(
null.
To call this method using a ByteSink, use ByteSink.toOutputStream(
Implementations can use ByteSink.valueOf(
null.