Data Engineer Path

Engineer workflow overview

The four surfaces engineers use in portal — Connectors, Pipelines, Codes, Datasets — mapped against dbt and Airflow in a single table.


Analysts build screens on top of data that already exists. Engineers are responsible for how that data got there. In portal that responsibility is split across four surfaces, and the work done on every one of them ends in a single artifact: a dataset.

Four surfaces an engineer lives in

  1. Connectors — The entry points wiring portal to external systems like databases, S3, REST APIs, and event streams. Authentication, hosts, and schema mapping are defined here.
  2. Pipelines — The workflow editor for connecting nodes. The standard representation of the source → transform → sink flow.
  3. Codes — Python or SQL snippets. Called from a transform node inside a pipeline, or runnable as a standalone unit.
  4. Datasets — Where the outputs of the three surfaces above land. At the same time, they are the inputs of the Analyst Path.

The key idea is that work on any of these surfaces always ends in a dataset. The dataset the engineer produces shows up directly in the analyst's collection tree and feeds into widgets. The dataset is where the two roles meet.
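To make the relationship between the four surfaces concrete, here is a minimal Python sketch of the source → transform → sink flow. It is not portal's API: the class names (`Connector`, `Code`, `Pipeline`, `Dataset`), their fields, and the sample rows are all hypothetical, chosen only to mirror the vocabulary above.

```python
from dataclasses import dataclass, field
from typing import Callable

Row = dict[str, object]


@dataclass
class Connector:
    """Hypothetical stand-in for a Connectors entry: knows how to read rows."""
    name: str
    read: Callable[[], list[Row]]


@dataclass
class Code:
    """Hypothetical stand-in for a Codes asset: a reusable transform."""
    name: str
    fn: Callable[[list[Row]], list[Row]]


@dataclass
class Dataset:
    """The artifact every surface ends in; analysts consume it."""
    name: str
    rows: list[Row] = field(default_factory=list)


@dataclass
class Pipeline:
    """source -> transform -> sink, expressed as three fields."""
    source: Connector
    transform: Code
    sink_name: str

    def run(self) -> Dataset:
        rows = self.source.read()        # source node
        rows = self.transform.fn(rows)   # transform node calling a Codes asset
        return Dataset(name=self.sink_name, rows=rows)  # sink node


def keep_paid(rows: list[Row]) -> list[Row]:
    """Illustrative transform: keep only paid orders."""
    return [r for r in rows if r["status"] == "paid"]


# Made-up sample data standing in for an external system.
orders_db = Connector("orders_db", lambda: [
    {"id": 1, "status": "paid"},
    {"id": 2, "status": "refunded"},
    {"id": 3, "status": "paid"},
])

pipeline = Pipeline(orders_db, Code("keep_paid", keep_paid), "paid_orders")
dataset = pipeline.run()
print(dataset.name, [r["id"] for r in dataset.rows])
```

Whatever the real surfaces look like internally, the shape is the same: the connector and the code asset are inputs to the pipeline, and the only thing that comes out is a named dataset.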

Mapping to tools you already know (one-time only)

If you've run the same flow on dbt, Airflow, or Snowflake before, keep the following table in your head for the first 30 minutes. That is enough to start finding your way around portal. From lesson 02 onward we use portal vocabulary only.

Tool you know | Corresponding portal surface
Airflow Connection · Source definition | Connectors
Airflow DAG | Pipelines
dbt model SQL · Python snippet | Codes asset + transform/code node inside a pipeline
dbt source/seed/mart table | Datasets
Snowflake/BigQuery and other DWHs | The workspace datasets land in (built into portal)
GitHub Actions · Cron | Pipeline schedule

Two notes to set the expectation against your existing stack:

  • DAG and models live on the same canvas. Instead of operating Airflow's task graph and dbt's model dependency graph side by side, portal shows source · transform · sink nodes on a single plane inside one pipeline.
  • Tables collapse into one dataset abstraction. The distinction between "source tables" and "mart tables" is expressed through collections, permissions, and tags. The engine type underneath isn't surfaced.
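The two notes can be pictured with a small Python sketch. It assumes nothing about portal's internals: the node names, the `kind` labels, and the `collection`/`tags` fields are hypothetical and only echo the terms used above. The first half puts source, transform, and sink nodes in one dependency graph and derives a run order with the standard library's `graphlib`; the second half tells "source" and "mart" datasets apart purely by tags.

```python
from graphlib import TopologicalSorter

# One graph for everything: node -> (kind, upstream nodes).
# All names are made up for illustration.
nodes = {
    "orders_source": ("source", []),
    "customers_source": ("source", []),
    "clean_orders": ("transform", ["orders_source"]),
    "join_customers": ("transform", ["clean_orders", "customers_source"]),
    "orders_mart": ("sink", ["join_customers"]),
}

# TopologicalSorter expects a mapping of node -> set of predecessors.
graph = {name: set(upstream) for name, (_, upstream) in nodes.items()}
order = list(TopologicalSorter(graph).static_order())

for name in order:
    kind, _ = nodes[name]
    print(f"{kind:9} {name}")

# One dataset abstraction: the source/mart distinction lives in metadata,
# not in a separate kind of table.
datasets = [
    {"name": "orders_raw", "collection": "ingest", "tags": {"source"}},
    {"name": "orders_mart", "collection": "finance", "tags": {"mart"}},
]
marts = [d["name"] for d in datasets if "mart" in d["tags"]]
print(marts)
```

The relative order of independent nodes (the two sources here) is not fixed, but every node is guaranteed to run after its upstream nodes, so the sink always comes last in this graph.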

This mapping is here to help you decide which surface to open first, not as a claim that portal is a 1:1 replacement for dbt + Airflow.

How analysts and engineers divide the work

For the two roles to share one portal without stepping on each other, you need an explicit handshake about who owns what. A common split looks like this:

Responsibility | Usually owned by
Pulling data from external systems | Engineer
Defining dataset schemas and types | Engineer
Grouping datasets into collections and granting access | Engineer (or owner)
Building widgets and dashboards on top of datasets | Analyst
Pushing analysis results back into production systems | Engineer

The boundary varies by team size — one person often plays both roles in a small organization. This Path is written from the engineer side of that line: which surface owns which responsibility.

What to take away from this lesson

  • The names of the engineer's four surfaces and the inputs and outputs of each
  • A one-table mapping from familiar tools (dbt, Airflow) to portal surfaces
  • That the engineer's and the analyst's responsibilities meet at the dataset

Next lesson

The next lesson opens the first surface, Connectors: you register one external system and land its production data as a portal dataset.