Debugging, monitoring, and handing off to the analyst
Narrow down a failed run through three views (log · node status · data preview), then close the cycle by tidying up dataset permissions and handing off to an analyst.
Assume the pipeline you built so far has been running every morning on its own. One day an alert arrives: "yesterday's overnight run failed." This lesson walks through where to look first, what to re-run, and finally how to hand the datasets you produced over to an analyst cleanly. That's how the Path closes.
Three views on a failure — log, node status, data preview
The portal's run result screen shows three views at once.
- Execution log — A chronological text stream. The last line is usually the most important.
- Node status — On the canvas, each node is colored green (success), red (failure), or gray (not run).
- Data preview — For successful nodes only, the right panel shows a sample of the output rows.
Looking at all three together is what narrows the cause down quickly.
The first five minutes — where did it stop?
This order is the shortest path:
- Click the Run details link in the failure alert.
- On the canvas, find the one red node. There's usually just one — the node before it is green, the node after it is gray.
- Click the red node and open the Log tab in the right panel. The very last line often pinpoints which column and which value halted the run.
- In the same panel's Data preview tab, look at the output of the green node just before. Often you'll see at a glance that the shape of yesterday's input is different today.
These four steps narrow more than 80% of failures into two buckets: input schema change, or transient external-system outage.
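The comparison in the last step can also be done mechanically rather than by eye. The following is a minimal sketch, assuming you have copied the column names from yesterday's and today's data previews into two lists. The `diff_columns` helper and the sample column names are hypothetical illustrations, not part of the portal.

```python
def diff_columns(yesterday, today):
    """Return the columns that appeared or disappeared between two runs."""
    before, after = set(yesterday), set(today)
    return {
        "added": sorted(after - before),
        "missing": sorted(before - after),
    }


# Hypothetical column lists copied from two data previews.
yesterday_cols = ["order_id", "amount", "currency"]
today_cols = ["order_id", "amount", "currency_code"]

print(diff_columns(yesterday_cols, today_cols))
# {'added': ['currency_code'], 'missing': ['currency']}
```

A non-empty "added" or "missing" list points at the input schema change bucket. Two empty lists make a transient outage the more likely candidate.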
Common failure categories and what to do next
| Category | Signal | Next action |
|---|---|---|
| Input schema change | The previous green node's output has a new column or a missing column | Reopen the connector mapping from lesson 02 and lock it with explicit column selection |
| Type mismatch | Log line like "could not cast '...' to int64" | In the code node from lesson 04, wrap the cast in try/except and isolate the offending row |
| Transient external outage | Log ends in a timeout or 5xx | Change nothing in the pipeline and press Re-run once. If it clears in minutes, close the ticket |
| Credential revocation | Log shows 401/403 or permission denied | Rotate the credential in the secrets store, or ask the ops team to restore permission |
| Data-quality threshold missed | A data-quality check node went red | Decide whether the threshold itself is too tight, or whether something is actually missing on the input |
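The log-based rows of the table can be sketched as a small lookup. This is an illustration only: the patterns are taken from the Signal column above, the actual wording of the portal's log lines may differ, and `classify_last_line` is a hypothetical helper. Schema changes and data-quality misses are left out because the table identifies them from the data preview and node status, not from the log.

```python
import re

# Patterns mirror the "Signal" column; real log wording is an assumption.
_RULES = [
    ("type mismatch", re.compile(r"could not cast", re.I)),
    ("credential revocation", re.compile(r"\b(401|403)\b|permission denied", re.I)),
    ("transient external outage", re.compile(r"timeout|timed out|\b5\d\d\b", re.I)),
]


def classify_last_line(line):
    """Map the final log line to a failure category, or 'unknown'."""
    for category, pattern in _RULES:
        if pattern.search(line):
            return category
    return "unknown"


print(classify_last_line("could not cast 'abc' to int64"))  # type mismatch
```

A bare three-digit match such as `5\d\d` can also hit unrelated numbers in a log line, so treat the result as a first guess to confirm by reading the log.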
After triage you'll do one of two things: re-run as is, or change code/mapping and re-run.
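For the type-mismatch case, the code change might look like the sketch below. It assumes the code node receives rows as a list of dictionaries. The function name, the `amount` column, and the shape of the rejected records are all hypothetical and should be adapted to whatever your code node from lesson 04 actually receives.

```python
def cast_amounts(rows, column="amount"):
    """Cast one column to int, setting aside rows that fail instead of halting."""
    good, rejected = [], []
    for row in rows:
        try:
            good.append({**row, column: int(row[column])})
        except (KeyError, TypeError, ValueError) as exc:
            # Keep the original row plus the reason, for later inspection.
            rejected.append({"row": row, "error": str(exc)})
    return good, rejected


# Hypothetical input: the second row carries a value that cannot be cast.
rows = [{"order_id": 1, "amount": "1200"}, {"order_id": 2, "amount": "N/A"}]
good, rejected = cast_amounts(rows)
print(len(good), len(rejected))  # 1 1
```

Returning the rejected rows separately keeps the run green while still leaving a record of exactly which rows need attention.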
Partial re-run — only part of the same pipeline
When re-running from the start is expensive, the portal offers Re-run from the failed node.
- Click the small arrow on the Re-run button in the upper-right of the canvas. Two options appear: from the start / from the failed node.
- The "from the failed node" option reuses the previous green node's output as a cache and re-executes only the red node and everything after it.
If your source stage reads heavily from external systems, this option cuts operational load substantially.
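Conceptually, a partial re-run works like the sketch below. It models a pipeline as a simple ordered list of nodes, which is an assumption made for illustration. The portal's real caching mechanism is not described in this lesson, and `rerun_from` and the three sample nodes are hypothetical.

```python
def rerun_from(nodes, cache, failed):
    """Re-execute `failed` and every node after it, reusing cached upstream output.

    nodes -- ordered list of (name, function) pairs; each function takes the
             previous node's output (None for the first node)
    cache -- outputs saved from the earlier run, keyed by node name
    """
    names = [name for name, _ in nodes]
    start = names.index(failed)
    # Input for the failed node is the cached output of the node before it.
    data = cache[names[start - 1]] if start > 0 else None
    executed = []
    for name, func in nodes[start:]:
        data = func(data)
        cache[name] = data
        executed.append(name)
    return data, executed


calls = []  # records whether the expensive source stage ran again


def extract(_):
    calls.append("extract")
    return [1, 2, 3]


def transform(rows):
    return [r * 10 for r in rows]


def load(rows):
    return sum(rows)


nodes = [("extract", extract), ("transform", transform), ("load", load)]
cache = {"extract": [1, 2, 3]}  # the green node's output from the failed run

result, executed = rerun_from(nodes, cache, failed="transform")
print(result, executed, calls)  # 60 ['transform', 'load'] []
```

The empty `calls` list is the point: the source stage, which is the part that reads from the external system, is never invoked a second time.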
Two signals to watch constantly in production
Rather than waiting for the next failure alert, put two signals on one dashboard:
- Pipeline run history — A widget showing the last N days of run outcomes. Confirms at a glance that the same time slot stays green every day.
- Row-count trend of each mart dataset — A widget that changes color when yesterday's row count and today's differ by more than ±N%. Catches silent failures — the run succeeded but the data came in empty.
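The row-count rule reduces to a single percentage comparison. A minimal sketch follows, assuming the threshold is expressed in percent and yesterday's count is the baseline. The function name and the sample numbers are hypothetical, and the standard widget may compute the trend differently.

```python
def row_count_alert(yesterday, today, threshold_pct):
    """Return True when today's count deviates from yesterday's by more than threshold_pct."""
    if yesterday == 0:
        # No baseline to divide by: flag any change away from zero.
        return today != 0
    change_pct = abs(today - yesterday) / yesterday * 100
    return change_pct > threshold_pct


print(row_count_alert(10_000, 10_300, threshold_pct=20))  # False
print(row_count_alert(10_000, 0, threshold_pct=20))       # True
```

The second call is the silent-failure case: the run succeeded, yet the dataset came in empty, which is a 100% drop against the baseline.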
Both widgets can be built with the standard widgets in lessons 04 and 05 of the Analyst Path. The organization-level monitoring around permissions and audit (who ran what, when credentials were rotated) is covered separately in the Platform Admin Path.
Hand off to the analyst — two lines of permissions
When the pipeline is steady, the last step is handing the mart dataset you built over to an analyst. The handoff is usually short.
- Move the mart dataset into an analyst-facing collection (the counterpart of keeping source datasets in the engineer collection from lesson 02).
- Grant the analyst group Reader permission on that collection.
- Add one line to the dataset page's description: what time, at what grain, how often (e.g. "Refreshed daily at 03:07 · Cumulative orders through yesterday · Currency KRW").
- Set the dataset page's owner to yourself. The analyst then knows who to ping when their widget breaks.
After those four steps, your mart dataset shows up in the analyst's collection explorer, and widgets begin to stack on top. That's the final knot: the engineer's output becomes the analyst's input, and the loop closes inside one portal.
Path complete — what's next
If you made it here, you've personally built one cycle starting from an external system and ending at an analyst's widget input. Recommended next steps:
- Workshop: Retail Inventory Intelligence — Every surface in this Path runs together inside a real scenario. The workshop is where engineer and analyst views meet in the same collection — a chance to see how the handoff step from the previous lesson actually plays out.
- Tutorial: Quick scenario import — Load one scenario from dhub2-examples into your environment with a single command. Touches the same tools as step 1 of the workshop.
- Analyst Path — See how your dataset shows up in the analyst's collection tree. Walking through the receiving side of a handoff once shortens every later mart build.
Once you check off every lesson on this Path, the Path is recorded as complete. The "Continue learning" panel on the home page will then recommend your next flow.
Nicely done.