The following are ways to improve the performance of a graph:
- Try to use partitioning in the graph
- Try to minimize the number of components
- Maintain lookups for better efficiency
- Components like Join/Rollup should have the "Input must be sorted" option selected if they are placed after a Sort component
- If a component has the "In memory: Input need not be sorted" option selected, choose the MAX_CORE parameter value carefully.
- Ensure that in all graphs where RDBMS tables are used as input, the join condition is on indexed columns.
- Make sure that a limited number of components are used in a particular phase
- Set an optimum max-core value for sort and join components.
- Utilize the minimum number of sort components.
- Utilize the minimum number of sorted join components and replace them with in-memory join/hash join, if needed and possible.
- Pass only the needed fields into sort, reformat, and join components
- Use phasing or flow buffers with merge or sorted joins
- Use a sorted join when both inputs are huge; otherwise use a hash join
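
The trade-off between a hash (in-memory) join and a sorted join can be illustrated outside the graph environment. The following is a minimal plain-Python sketch, not Ab Initio code. The function names and sample records are hypothetical and serve only to show why a hash join needs memory for one whole input, while a sorted join needs both inputs sorted on the key but holds very little in memory at a time.

```python
def hash_join(small, large, key):
    """In-memory join: build a lookup table on the smaller input,
    then stream the larger input past it. Neither input needs sorting,
    but the whole smaller input must fit in memory."""
    table = {}
    for row in small:
        table.setdefault(row[key], []).append(row)
    for row in large:
        for match in table.get(row[key], []):
            yield {**match, **row}


def sorted_merge_join(left, right, key):
    """Sorted join: both inputs (lists here, for simplicity) must already
    be sorted on `key`. Two cursors advance in step, so only the current
    key group is examined at any time."""
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the end of the run of equal keys on the right side.
            j_end = j
            while j_end < len(right) and right[j_end][key] == lk:
                j_end += 1
            # Pair every left row in this key group with that run.
            while i < len(left) and left[i][key] == lk:
                for r in right[j:j_end]:
                    yield {**left[i], **r}
                i += 1
            j = j_end


# Hypothetical sample data, already sorted on "id".
CUSTOMERS = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
ORDERS = [{"id": 1, "amt": 10}, {"id": 1, "amt": 20}, {"id": 3, "amt": 5}]

if __name__ == "__main__":
    for joined in hash_join(CUSTOMERS, ORDERS, "id"):
        print(joined)
```

Both functions produce the same inner-join result on this data. The difference is in what they require: the hash version needs memory proportional to the smaller input, which is why it suits the case where one input is small, while the merge version needs sorted inputs, which is why it pays off when both inputs are huge.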