Automating the Correction of Data Reporting Errors

Nicholas Jaso
November 14, 2022

The Data Problem

Consumer brands sometimes implement custom reporting solutions that have unintended downstream consequences across all of their reporting. 

One reporting solution implemented by a home goods brand pulled shipping, oversize shipping, and specialty shipping (in-home setup) as line items into Shopify. For instance:

  • Customer 1 adds item A. Item A is large and requires oversized shipping.
  • Custom 2 adds item B. Item B is large and likewise requires oversized shipping, but it is also a complicated assembly item, so it requires specialty shipping.

Unfortunately, this reporting solution inflated the unit counts in the platform reporting. Any time a user would create a report, they had to exclude those specific oversized and specialty shipping line items.

Brands try to account for the downstream impact of these solutions by creating human-managed data rules. However, human error happens: this means more time spent on cleaning up data errors and managing business rules. 

The amount of effort required to maintain these manual data rules is a huge pain point for brands, as are other consequences, such as tax headaches and inaccurate AOV.

The Data Solution

We implemented a custom solution that identifies when an order has oversized or specialty shipping associated with it and pulls those values as new fields.

Simultaneously, the customization identifies these records and removes them before transformation takes place, resulting in a data set that the brand can use to properly report on order analysis, LTV, ROI, and AOV as well as correct the unit inflation in end reporting.

With this approach, we eliminate the need for human automation and prevent rework from impacting brand merchandising teams. Additionally, by pulling in these fields separately, we can answer a wider array of questions than prior to the implementation, including:

  • How much are we receiving in shipping cost by shipping type over a given time period?
  • Which products, based on AOV and returns, should we consider increasing the specialty shipping fees for?
  • What is the percentage of oversized and specialty shipping that resulted in overstating gross sales?

Expectations, Requirements, and Table Construction

Estimated build time: 10-20 hours. The build time depends on the initial structure of the data and how much transformation is needed. 

This specific solution utilized NetSuite as the source of truth, so we were able to isolate the oversized and specialty shipping without much issue.

Necessary familiarity: 

Table construction and notes:

  • We search for all possible item IDs that may relate to oversized or speciality shipping
  • Starting with the netsuite.transactions and netsuite.transaction.lines, we LEFT JOIN on netsuite.transaction_lines.item_id to get a count of items.
  • In this specific solution, the brand’s setup could not identify which product(s) the oversized and specialty shipping fees go toward, so we allocated the fees across the item lines equally. 
  • We then LEFT JOIN the resulting fee_allocation table with our Order Line Revenue Table to pull related order line ID fields and allow this table to JOIN anywhere order data is stored in the database.

Relevant Data and Considerations for This Analysis

You’ll need access to NetSuite, DRP (Daasity Reporting Platform), and Order Line Revenue.

  • NetSuite Database Integration to read the transaction line level data
  • DRP Access for transformation reporting
  • Order Line Revenue reporting

NetSuite Database Integration

We first start by utilizing Daasity’s NetSuite Data Integration. This integration and associated read access is required, otherwise we would be unable to pull data for any items, transactions or transaction_lines. 

From the NetSuite database, we are able to pull in a list of specialty and oversized shipping items by identifying the items based on criteria set by the brand or the results of our investigation. 


WITH
	specialty_oversize_shipping_item_id AS
 (
  SELECT item_id
       , displayname
       , CASE
         WHEN displayname ILIKE '%specialty%shipping%' THEN 'Specialty Shipping'
         WHEN displayname ILIKE '%specialty%shipping%' THEN 'Over Size Shipping'
         ELSE NULL               
        END AS fee_type
      FROM netsuite.items
      WHERE displayname ILIKE '%specialty%ship%' --Logic identified by brand to find shipping
      OR displayname ILIKE '%over%size%' --Logic identified by brand to find shipping
      GROUP BY item_id
             , displayname
  )

We then determine how many units are in each transaction. Later, we will use this count to determine how much of the specialty and oversize shipping amounts to allocate to each shipping line. 

We INNER JOIN the transaction_lines table to the netsuite.transaction table, then LEFT JOIN the specialty_oversize_shipping_item_id_search table created above.

This query excludes all transaction_lines related to the specialty and oversized shipping line items, and  it will give us a count of the items in each transaction excluding those shipping lines.


units_in_transaction AS 
(
   SELECT t.transaction_id -- unique id for the transaction/order level
        , t.tranid -- unique id for the transaction line itself
        , t.create_date        
        , SUM(tl.item_count) AS sum_items   
    FROM netsuite.transactions t   
    INNER JOIN netsuite.transaction_lines tl ON t.transaction_id = tl.transaction_id   
    LEFT JOIN specialty_oversize_shipping_item_id i ON tl.item_id = i.item_id
    WHERE tl.account_id IN ('/*put accounting account IDs here*/') -- the specialty and oversize shipping were in different account_ids
    AND t.transaction_type = 'Sales Order'
    AND i.item_id IS NULL   
    GROUP BY t.transaction_id
      , t.tranid
      , t.create_date
 )


Next, we combine the units_in_transactions table to pull the transaction item count, then combine the specialty_oversize_item_id table with the netsuite.transaction_lines table to pull the amount of each shipping fee at the transaction level.  

By using the identifier for fee_type created in the specialty_oversize_shipping_item_id, we can determine the dollar amount of each fee type by the specific transaction. 


fee_summary AS 
 (
  SELECT u.transaction_id -- unique id for the transaction/order level
       , u.tranid -- unique id for the transaction line itself
       , u.create_date
       , u.sum_items -- number of units in the order excluding oversize and specialty shipping lines
       , SUM(CASE
              WHEN i.fee_type = 'Specialty Shipping' THEN (tl.amount)
              ELSE NULL
             END) AS oversize_delivery_fee
       , SUM(CASE
              WHEN i.fee_type = 'Over Size Shipping' THEN (tl.amount)
              ELSE NULL
             END) AS speciality_shipping_delivery_fee
   FROM units_in_transaction u
   LEFT JOIN netsuite.transaction_lines tl ON tl.transaction_id = u.transaction_id
   INNER JOIN specialty_oversize_shipping_item_id i ON i.item_id = tl.item_id
   GROUP BY u.transaction_id
          , u.tranid
          , u.create_date
          , u.sum_items
          , item_shipping_method_id 
 ),

We then allocate the fee amounts based on the number of units in each transaction.

This step is important because not every item line in a transaction will have the same number of units. Let’s assume that we have line items A, B and C, and the total for oversized shipping is $100. 

Line item A has 2 units, Line item B has 2 units and line item C has 1 unit. In the SQL snippet below,  we calculate the fee per item to be $20. We can later JOIN this to any table where the transaction_id exists and multiply the units in the order lines by the fee per item amounts to get a rollup of $100. 


fee_allocation AS 
 (
   SELECT transaction_id
        , tranid
        , create_date
        , sum_items
        , oversize_delivery_fee
        , speciality_shipping_delivery_fee
        , (oversize_delivery_fee / sum_items) AS oversize_fee_per_item        , (speciality_shipping_delivery_fee / sum_items) AS specialty_shipping_fee_per_item   
    FROM fee_summary
  )

Lastly, we JOINed the fee_allocation table above to the Order Line Revenue table to determine the fees to allocate to each line item and provide the per item cost for the brand if they choose to directly query the table:


SELECT olr.order_id
     , olr.store_order_id
     , olr.order_line_id
     , olr.store_order_line_id
     , s.tranid
     , olr.num_units
     , s.sum_items
     , s.oversize_fee_per_item
     , SUM(olr.num_units * oversize_fee_per_item) AS oversize_fee --allocate fee per item across item lines, some lines have more than one unit so this method is used to allocate
     , s.specialty_shipping_fee_per_item
     , SUM(olr.num_units * specialty_shipping_fee_per_item) AS specialty_shipping_fee --allocate fee per item across item lines, some lines have more than one unit so this method is used to allocate
     , olr.__loaded_at AS __loaded_at
FROM drp.order_line_revenue olr
JOIN fee_allocation s ON s.transaction_id = olr.store_order_id
GROUP BY olr.order_id
       , olr.store_order_id
       , olr.order_line_id
       , olr.store_order_line_id
       , s.tranid
       , olr.num_units
       , s.sum_items
       , s.oversize_fee_per_item
       , s.specialty_shipping_fee_per_item
       , olr.__loaded_at

What you’ll see in Looker

Now that we’ve completed the transformation and identified the fees to allocate to each order line item, we can visualize this in Looker.

Visualization of Gross Sales, Specialty Shipping Fee, and Oversize Fee by Day for a given time period 

  • Green = Gross Sales 
  • Magenta = Specialty Shipping
  • Blue = Oversized shipping

When viewing this data set by percentage, we can easily see how accurate daily gross sales areThis answers one of our starting questions: How much were we overstating gross sales before this fix? 

  • 1-4%, usually closer to 4%, peaking at 9%. 

Some next steps would be to remove the fees from the gross sales in other transformation scripts or to subtract it within the LookML!

The best part? This logic now lives in the daily transformation! We no longer have to maintain the manual process to remove these fees from the gross sales in daily reporting.

< src="https://unpkg.com/@lottiefiles/lottie-player@latest/dist/lottie-player.js" defer>