max-core -
The max-core parameter is found in the SORT, JOIN, and ROLLUP components, among others.
max-core Value -
There is no single optimal value for the max-core parameter. A “good” value depends on your particular graph and the environment in which it runs, as well as on the data.
Details -
A component’s max-core parameter determines the maximum amount of memory the component will consume per partition before it spills to disk.
When the value of max-core is exceeded, all input (in the case of SORT) or the excess input (in the case of the other components) is written to disk as temporary files. This spilling can have a dramatic impact on performance, but that does not mean increasing the value of max-core is always the right response.
The higher you set the value of max-core, the more memory the component can use. Using more memory generally improves performance, but only up to a point.
Beyond that point, performance will not improve and may even worsen.
If the value of max-core is set too high, operating system swapping can occur; if virtual memory on the machine is exhausted, the graph can fail.
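Because max-core applies per partition, the memory a component can hold on one machine grows with the number of partitions running there. The following is only a rough sketch of that arithmetic in Python, not part of any Ab Initio tooling. The function names, the headroom parameter, and the idea of comparing against a fraction of physical memory are all assumptions made for illustration.

```python
def total_max_core_bytes(max_core_bytes: int, partitions_on_host: int) -> int:
    """Upper bound on the memory one component may hold across all of its
    partitions on a single host, given that max-core is a per-partition limit."""
    return max_core_bytes * partitions_on_host


def fits_in_memory(max_core_bytes: int, partitions_on_host: int,
                   physical_bytes: int, headroom: float = 0.5) -> bool:
    """Return True if the component's worst case stays within a chosen
    fraction (headroom, a hypothetical safety margin) of physical memory."""
    budget = int(physical_bytes * headroom)
    return total_max_core_bytes(max_core_bytes, partitions_on_host) <= budget


MIB = 1024 ** 2
GIB = 1024 ** 3

# 96 MiB per partition across 8 partitions on a 16 GiB host.
print(total_max_core_bytes(96 * MIB, 8))      # 805306368
print(fits_in_memory(96 * MIB, 8, 16 * GIB))  # True
# 2 GiB per partition across 8 partitions would need the whole 16 GiB.
print(fits_in_memory(2 * GIB, 8, 16 * GIB))   # False
```

A check like this covers only one component. Other components in the same graph, and other processes on the machine, compete for the same memory.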
When setting the value of max-core, you can use the suffixes k, m, and g (uppercase is also supported) to indicate powers of 1024.
For max-core, the suffix k (kilobytes) means exactly 1024 bytes. Similarly, the suffix m (megabytes) means exactly 1048576 bytes (1024^2),
and g (gigabytes) means exactly 1073741824 bytes (1024^3).
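To make the suffix arithmetic concrete, here is a small Python sketch that converts a max-core style string into bytes. The helper parse_max_core is hypothetical and is not part of Ab Initio. It also assumes that a value with no suffix is a plain byte count.

```python
import re

# Multipliers for the suffixes described above (powers of 1024).
# Assumption: a value with no suffix is interpreted as bytes.
_SUFFIX_MULTIPLIERS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_max_core(value: str) -> int:
    """Convert a string such as '96m' or '1G' into a number of bytes."""
    match = re.fullmatch(r"(\d+)([kKmMgG]?)", value.strip())
    if match is None:
        raise ValueError(f"unrecognized max-core value: {value!r}")
    number, suffix = match.groups()
    return int(number) * _SUFFIX_MULTIPLIERS[suffix.lower()]


print(parse_max_core("1k"))   # 1024
print(parse_max_core("96m"))  # 100663296
print(parse_max_core("1G"))   # 1073741824
```

The second result is 96 × 1048576 = 100663296 bytes, which is how a value written as 96m works out.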
In general, using additional memory can improve the performance of in-memory ROLLUP or JOIN, but not of SORT.
When spillage occurs, consider setting the configuration variable AB_SPILL_FILE_COMPRESSION_LEVEL.
This variable controls compression of the temporary files spilled to disk. Compression helps most when you have a fast CPU but a slow disk, which is a common situation.
SORT component -
For the SORT component, 96 MB (100663296 bytes) is the default value for max-core.