We still want to split a table into 12 chunks with equal numbers of blocks, so we can access each chunk using a rowid range. We have the extents we need and now we need to calculate the exact chunk boundaries.

Current list of posts:

- Chunking tables 1: Genesis
- Chunking tables 2: Requirement
- Chunking tables 3: working with blocks
- Chunking tables 4: Histograms
- Chunking tables 5: chunk boundaries
- Chunking tables 6: JOIN solution
- Chunking tables 7: prior sys_guid() ???
- Chunking tables 8: second solution

#### Histograms and WIDTH_BUCKET

I explained in my previous post that `WIDTH_BUCKET`

is based on a histogram having identical intervals in a range.

- The minimum range value is included in the range but the maximum range value is not.
- The same thing is true for each interval: the minimum value is included in the interval, but not the maximum value.

We have already used `WIDTH_BUCKET`

to answer the question: “what chunk is this block in?” Now we are asking the opposite question: “what blocks start and end this chunk?” We need to make absolutely sure the answers are coherent!

#### Getting the upper chunk boundary

Using our test table from the previous post, let’s compare what `WIDTH_BUCKET`

tells us with a wrong answer and a right answer. Then I’ll explain in detail. Right now, I’m only interested in those blocks that end a chunk.

select this_block_chunk, block last_block, round(256 * (this_block_chunk / 12), 1) max_chunk_val, floor(256 * (this_block_chunk / 12)) floor_val, ceil (256 * (this_block_chunk / 12) - 1) right_val from ( select level-1 block, width_bucket(level-1,0,256,12) this_block_chunk, width_bucket(level,0,256,12) next_block_chunk from dual connect by level <= 256 ) where this_block_chunk < next_block_chunk;

THIS_BLOCK_CHUNK | LAST_BLOCK | MAX_CHUNK_VAL | FLOOR_VAL | RIGHT_VAL |
---|---|---|---|---|

1 | 21 | 21.3 | 21 | 21 |

2 | 42 | 42.7 | 42 | 42 |

3 | 63 | 64 | 64 | 63 |

4 | 85 | 85.3 | 85 | 85 |

5 | 106 | 106.7 | 106 | 106 |

6 | 127 | 128 | 128 | 127 |

7 | 149 | 149.3 | 149 | 149 |

8 | 170 | 170.7 | 170 | 170 |

9 | 191 | 192 | 192 | 191 |

10 | 213 | 213.3 | 213 | 213 |

11 | 234 | 234.7 | 234 | 234 |

12 | 255 | 256 | 256 | 255 |

For each block, I asked `WIDTH_BUCKET`

what chunk it is in and what chunk the next block is in, and I kept only those blocks where the next block is in the next chunk. That means I have a list of the last blocks for each chunk.

In the third column, I calculate the maximum value for each chunk. Notice it is not always an integer value. How can we calculate the greatest block number that is *less than* that maximum value?

- Using
`CEIL()`

or`ROUND()`

, we will sometimes get a result that is greater than the maximum value. - Using
`FLOOR()`

, we will sometimes get a result that is equal to the maximum value – but we need to be less than, not equal to! You can see the problem in chunks 3, 6, 9 and 12. - The right answer is to
**subtract 1 from the maximum value and then use**. That way the result will always be less than the maximum value, but as close to it as possible.`CEIL()`

#### Getting the lower chunk boundary

I’ll bet you thought of this yourself. The lower boundary of a chunk is one greater that the upper boundary of the previous chunk :-)

#### Coming next…

We have the extents we need, and we have a method for calculating all the chunk boundaries. In the next post, I’ll present the first of two complete solutions for chunking a table.

Current list of posts:

- Chunking tables 1: Genesis
- Chunking tables 2: Requirement
- Chunking tables 3: working with blocks
- Chunking tables 4: Histograms
- Chunking tables 5: chunk boundaries
- Chunking tables 6: JOIN solution
- Chunking tables 7: prior sys_guid() ???
- Chunking tables 8: second solution