Utility Methods
Utility methods provide specialized functionality for tasks like sampling, combining data, and monitoring pipelines.
reservoir()
Performs reservoir sampling to select a random subset of elements from a pipeline. This is memory-efficient: only the sample itself is held in memory, so the entire dataset never has to be loaded at once.
Signature: reservoir(int $size, ?callable $weightFunc = null): array
$size: The number of elements to sample.
$weightFunc: An optional function to calculate the weight of each element for weighted sampling.
Behavior:
- This is a terminal operation.
- It uses Algorithm R for uniform sampling and Algorithm A-Chao for weighted sampling.
Examples:
// Get 10 random lines from a large file
$sample = take(new SplFileObject('large.log'))
->reservoir(10);
// Weighted sampling
$sample = take($items)
->reservoir(5, fn($item) => $item['priority']);
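To illustrate the uniform case, here is a minimal standalone sketch of Algorithm R. The helper name reservoir_sample() is hypothetical and this is not the library's implementation; it only shows why the sample can be built in a single pass while holding at most $size elements.

```php
<?php
/**
 * Illustrative Algorithm R (hypothetical helper, not the library's code):
 * draws a uniform sample of up to $size items from any iterable in one pass.
 */
function reservoir_sample(iterable $input, int $size): array
{
    $reservoir = [];
    if ($size <= 0) {
        return $reservoir;
    }

    $seen = 0;
    foreach ($input as $item) {
        $seen++;
        if ($seen <= $size) {
            // The first $size items fill the reservoir directly.
            $reservoir[] = $item;
            continue;
        }
        // Pick a slot in [0, $seen - 1]; the item is kept only if the slot
        // falls inside the reservoir, i.e. with probability $size / $seen.
        $slot = random_int(0, $seen - 1);
        if ($slot < $size) {
            $reservoir[$slot] = $item;
        }
    }

    return $reservoir;
}

// Sample 10 values from 1..1000 without materializing more than 10 at a time.
$sample = reservoir_sample(new ArrayIterator(range(1, 1000)), 10);
```

If the input has fewer than $size elements, the sketch simply returns all of them.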
zip()
Combines multiple iterables into a single pipeline of tuples.
Signature: zip(iterable ...$inputs): self
...$inputs: The iterables to combine.
Behavior:
- Creates a new pipeline where each element is an array of corresponding elements from the input iterables.
- Shorter iterables are padded with null.
Examples:
$result = take(['a', 'b'])
->zip([1, 2], [true, false])
->toList();
// [['a', 1, true], ['b', 2, false]]
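The null-padding rule can be sketched with plain iterators. The helper zip_padded() below is hypothetical and takes every input as an argument, whereas the pipeline method uses its own elements as the first input; treat it as an illustration of the described behavior, not the library's code.

```php
<?php
/**
 * Illustrative zip with null padding (hypothetical helper).
 * Yields one tuple per position until every input is exhausted.
 */
function zip_padded(iterable ...$inputs): Generator
{
    $iterators = [];
    foreach ($inputs as $input) {
        // Arrays and Traversables are both wrapped so they share one interface.
        $iterator = is_array($input)
            ? new ArrayIterator($input)
            : new IteratorIterator($input);
        $iterator->rewind();
        $iterators[] = $iterator;
    }

    while (true) {
        $tuple = [];
        $anyValid = false;
        foreach ($iterators as $iterator) {
            if ($iterator->valid()) {
                $anyValid = true;
                $tuple[] = $iterator->current();
                $iterator->next();
            } else {
                // An exhausted input contributes null to the tuple.
                $tuple[] = null;
            }
        }
        if (!$anyValid) {
            return;
        }
        yield $tuple;
    }
}

$rows = iterator_to_array(zip_padded(['a', 'b', 'c'], [1, 2]), false);
// [['a', 1], ['b', 2], ['c', null]]
```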
runningCount()
Counts elements as they pass through the pipeline without consuming it.
Signature: runningCount(?int &$count): self
&$count: A reference to a counter variable, which will be updated.
Behavior:
- This is a non-terminal operation.
- It is useful for monitoring the number of elements that have been processed at a certain point in the pipeline.
Examples:
$count = 0;
$result = take(range(1, 100))
->filter(fn($x) => $x % 2 === 0)
->runningCount($count)
->toList();
echo "Count: $count"; // 50
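A pass-through counter of this kind can be sketched as a generator that increments a referenced variable. The helper counting() is hypothetical and only illustrates the idea; one consequence worth noting in this sketch is that the counter reflects only the elements pulled so far, so it stays at 0 until something consumes the generator.

```php
<?php
/**
 * Illustrative pass-through counter (hypothetical helper).
 * Each element is yielded unchanged while $count is incremented.
 */
function counting(iterable $input, ?int &$count): Generator
{
    $count ??= 0;
    foreach ($input as $key => $value) {
        $count++;
        yield $key => $value;
    }
}

$count = 0;
$evens = array_filter(range(1, 100), fn($x) => $x % 2 === 0);
$counted = counting($evens, $count);

// Nothing has been consumed yet, so the counter has not moved.
$before = $count;

$list = iterator_to_array($counted, false);
// After full consumption, $count equals the number of elements seen.
```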
stream()
Converts an array-based pipeline into a generator-based one, forcing all subsequent operations to be lazy and process elements one by one.
Signature: stream(): self
Behavior:
- This is a non-terminal operation.
- It helps keep memory usage low when working with large arrays, because subsequent steps handle one element at a time instead of operating on the whole array.
Examples:
// Process a large array with low memory usage
$result = take($largeArray)
->stream()
->map(fn($x) => expensive_operation($x))
->toList();
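The difference between array-based and generator-based processing can be sketched with a small generator helper. lazy_map() is a hypothetical name used only for illustration; the log shows that each element passes through both steps before the next element is touched, which is the element-by-element behavior described above.

```php
<?php
/**
 * Illustrative lazy map (hypothetical helper): applies $fn to each
 * element only when the consumer asks for it.
 */
function lazy_map(iterable $input, callable $fn): Generator
{
    foreach ($input as $key => $value) {
        yield $key => $fn($value);
    }
}

$log = [];

$doubled = lazy_map([1, 2, 3], function ($x) use (&$log) {
    $log[] = "double $x";
    return $x * 2;
});

$incremented = lazy_map($doubled, function ($x) use (&$log) {
    $log[] = "inc $x";
    return $x + 1;
});

$result = iterator_to_array($incremented, false);
// $result: [3, 5, 7]
// $log: ['double 1', 'inc 2', 'double 2', 'inc 4', 'double 3', 'inc 6']
```

With two chained array_map() calls, all three "double" entries would be logged before any "inc" entry, and a full intermediate array would exist between the steps.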