The Challenge
"Break a batch of invoices into multiple parts based on rules"
The Context
A module of an application generates invoices for services provided to patients. The information is sent to the insurance company.
Sample Data
InvId | BatchNumber | InvoiceNumber | VisitDate | Amount |
1 | 10000001 | 20001 | 2009-01-01 | 50.00 |
2 | 10000001 | 20001 | 2009-01-02 | 50.00 |
3 | 10000001 | 20001 | 2009-01-03 | 50.00 |
4 | 10000001 | 20001 | 2009-01-04 | 50.00 |
5 | 10000001 | 20002 | 2009-01-01 | 50.00 |
6 | 10000001 | 20002 | 2009-01-02 | 50.00 |
7 | 10000001 | 20002 | 2009-01-03 | 50.00 |
8 | 10000001 | 20003 | 2009-01-01 | 50.00 |
9 | 10000001 | 20003 | 2009-01-02 | 50.00 |
10 | 10000001 | 20004 | 2009-01-01 | 50.00 |
11 | 10000001 | 20004 | 2009-01-02 | 50.00 |
12 | 10000001 | 20004 | 2009-01-03 | 50.00 |
13 | 10000001 | 20004 | 2009-01-04 | 50.00 |
14 | 10000001 | 20004 | 2009-01-05 | 50.00 |
15 | 10000001 | 20004 | 2009-01-06 | 50.00 |
16 | 10000001 | 20005 | 2009-01-01 | 50.00 |
17 | 10000001 | 20005 | 2009-01-02 | 50.00 |
18 | 10000001 | 20005 | 2009-01-03 | 50.00 |
19 | 10000001 | 20005 | 2009-01-04 | 50.00 |
20 | 10000001 | 20005 | 2009-01-05 | 50.00 |
21 | 10000001 | 20005 | 2009-01-06 | 50.00 |
22 | 10000001 | 20005 | 2009-01-07 | 50.00 |
23 | 10000001 | 20005 | 2009-01-08 | 50.00 |
24 | 10000001 | 20006 | 2009-01-01 | 50.00 |
25 | 10000001 | 20007 | 2009-01-01 | 50.00 |
26 | 10000001 | 20007 | 2009-01-02 | 50.00 |
27 | 10000001 | 20007 | 2009-01-03 | 50.00 |
28 | 10000001 | 20008 | 2009-01-01 | 50.00 |
29 | 10000001 | 20008 | 2009-01-02 | 50.00 |
30 | 10000001 | 20008 | 2009-01-03 | 50.00 |
The insurance company needs an electronic file in a certain format (the format is irrelevant for us). A separate file needs to be generated for each batch. So based on the sample data we need to generate two files (one for each batch).
Each batch should contain not more than 10 rows. The maximum number of lines we can put in a file is 10. So we need to break each batch into parts that contains not more than 10 rows. When breaking the batches, there is one more rule to follow. While breaking the batches, we should take care that we do not break invoices.
Result should like look:
InvId | BatchNumber | InvoiceNumber | VisitDate | Amount | Set |
1 | 10000001 | 20001 | 2009-01-01 | 50.00 | 1 |
2 | 10000001 | 20001 | 2009-01-02 | 50.00 | 1 |
3 | 10000001 | 20001 | 2009-01-03 | 50.00 | 1 |
4 | 10000001 | 20001 | 2009-01-04 | 50.00 | 1 |
5 | 10000001 | 20002 | 2009-01-01 | 50.00 | 1 |
6 | 10000001 | 20002 | 2009-01-02 | 50.00 | 1 |
7 | 10000001 | 20002 | 2009-01-03 | 50.00 | 1 |
8 | 10000001 | 20003 | 2009-01-01 | 50.00 | 1 |
9 | 10000001 | 20003 | 2009-01-02 | 50.00 | 1 |
10 | 10000001 | 20004 | 2009-01-01 | 50.00 | 2 |
11 | 10000001 | 20004 | 2009-01-02 | 50.00 | 2 |
12 | 10000001 | 20004 | 2009-01-03 | 50.00 | 2 |
13 | 10000001 | 20004 | 2009-01-04 | 50.00 | 2 |
14 | 10000001 | 20004 | 2009-01-05 | 50.00 | 2 |
15 | 10000001 | 20004 | 2009-01-06 | 50.00 | 2 |
16 | 10000001 | 20005 | 2009-01-01 | 50.00 | 3 |
17 | 10000001 | 20005 | 2009-01-02 | 50.00 | 3 |
18 | 10000001 | 20005 | 2009-01-03 | 50.00 | 3 |
19 | 10000001 | 20005 | 2009-01-04 | 50.00 | 3 |
20 | 10000001 | 20005 | 2009-01-05 | 50.00 | 3 |
21 | 10000001 | 20005 | 2009-01-06 | 50.00 | 3 |
22 | 10000001 | 20005 | 2009-01-07 | 50.00 | 3 |
23 | 10000001 | 20005 | 2009-01-08 | 50.00 | 3 |
24 | 10000001 | 20006 | 2009-01-01 | 50.00 | 3 |
25 | 10000001 | 20007 | 2009-01-01 | 50.00 | 1 |
26 | 10000001 | 20007 | 2009-01-02 | 50.00 | 1 |
27 | 10000001 | 20007 | 2009-01-03 | 50.00 | 1 |
28 | 10000001 | 20008 | 2009-01-01 | 50.00 | 1 |
29 | 10000001 | 20008 | 2009-01-02 | 50.00 | 1 |
30 | 10000001 | 20008 | 2009-01-03 | 50.00 | 1 |
My Solution
Personally I found this challenge quite easy and very fun to do.
This query depends on first determining where the boundaries are. A given requirement is that Invoice numbers may not be broken. This is in fact not a factor that makes the query more complex in my eyes, but makes it easier. The first step is thus to determine how many rows (count) there are within a batchnumber, invoice combination. This will form the basis for the boundaries.
With tblA ( id, BatchNumber, InvoiceNumber, RWN ) AS
(
-- Step 1. Get the Count of Invoices Per BatchNumber/InvoiceNumber combination and in the
-- output generate a sequential id using the rownumber function
SELECT ROW_NUMBER() OVER( ORDER BY BatchNumber, InvoiceNumber ), Batchnumber,
InvoiceNumber, COUNT(InvId)
FROM @t
GROUP BY Batchnumber, InvoiceNumber
) ,
rt ( id, BatchNumber, InvoiceNumber, runtot , cappedtotal, groupnr ) AS
(
-- Step 2. Using recursive function, calculate a running total of the count of rows till the max
allowed of 10 rows.
-- Using this "running capped" total, which forms the final boundary , create a "running"
-- group number.
SELECT id, BatchNumber, InvoiceNumber, RWN, RWN , 1
FROM tblA
WHERE id = 1
UNION ALL
SELECT t.id, t.BatchNumber, t.InvoiceNumber, t.rwn + rt.runtot,
-- check for change in batchnumber as per rule = new batch
CASE WHEN t.BatchNumber = rt.BatchNumber
THEN
-- now checking capped totals.
CASE WHEN t.rwn + rt.cappedtotal < 10
-- accumulate as capped total is less than 10 rows
THEN t.RWN + rt.cappedtotal
-- initialise for new accumulation
ELSE t.RWN
END
ELSE t.RWN
END,
-- check for change in batchnumber as per rule = new batch
CASE WHEN t.BatchNumber = rt.BatchNumber
THEN
-- check if wihin capped totals
CASE WHEN t.rwn + rt.cappedtotal < 10
-- keep group number
THEN rt.groupnr
-- accumulate group nr.
ELSE rt.groupnr + 1
END
ELSE 1 -- reset running group total as change of batchnumber has occurred.
END
FROM tblA t
JOIN rt ON ( t.id = rt.id + 1 )
)
-- Combine with the original table to produce the result.
SELECT t.InvId, t.BatchNumber, t.InvoiceNumber, t.VisitDate, t.Amount, rt.groupnr
FROM @t t
JOIN rt rt ON rt.BatchNumber = t.BatchNumber
AND rt.InvoiceNumber = t.InvoiceNumber
And that was it, they key being initial boundaries set by Count of Rows within Invoices. Using these boundaries, generate running capped totals within a recursive CTE.
Geen opmerkingen:
Een reactie posten