Sources


Sources let you pull in data that already exists. A source creates a reference to an existing table or view so that you can use it in your models. Sources are similar to models, but rather than being defined in an associated .sql file, a source is defined by specifying the path to the existing data.

Generating sources

Although we recommend understanding sources, we don't necessarily recommend writing them by hand, as the process is time-consuming and error-prone. To generate the boilerplate for existing sources, you can use the Visual Studio Code onboarding flow, which onboards sources without you having to write the YAML yourself. To do this, open the command palette and run the Quary: Onboarding command.

A source definition

Sources, just like models, are defined in project files, for example in models/staging/schema.yaml. Note the fully qualified path to the table in the path property.

sources:
  - name: source_a
    path: project_id.dataset_id.source_a_table

Once a source is defined, you can reference it in a model just as you would any other model. For example, the source above can be referenced as follows.

SELECT * FROM q.source_a

In the above example, the fully qualified path to the source contains the project id, dataset id and table id. This is specific to BigQuery and is required to reference a table. If you are using a different database, you may not need to specify the project id and dataset id.
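
As a slightly fuller illustration of referencing a source from a model, the sketch below shows what a staging model over source_a might look like. The file name models/staging/stg_source_a.sql and the column column_a are assumptions made for the example, not names required by Quary.

```sql
-- Hypothetical staging model, e.g. models/staging/stg_source_a.sql (file name is an assumption).
-- The source is referenced as q.source_a, exactly as a model would be.
SELECT
    column_a AS id  -- column_a is assumed to exist in the underlying table
FROM q.source_a
```

Because the model only refers to q.source_a, the fully qualified path stays in one place, the source definition, and can be changed there without touching the model.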

Tests and documentation

Just like models, sources can have tests and documentation. The description and columns properties are supported and work identically.

sources:
  - name: source_a
    path: project_id.dataset_id.source_a_table
    description: 'This is a source'
    columns:
      - name: column_a
        description: 'This is a column'
        tests:
          - type: unique

Equally, just as you can reference a model in a SQL test, you can also reference a source in a SQL test.
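
A minimal sketch of such a SQL test is shown below. It assumes the common convention that a SQL test is a query which passes when it returns no rows; the file name and the choice of a null check on column_a are illustrative assumptions rather than anything prescribed here.

```sql
-- Hypothetical SQL test, e.g. tests/source_a_column_a_not_null.sql (file name is an assumption).
-- Assumes a test passes when the query returns no rows, so any row returned is a failure.
SELECT column_a
FROM q.source_a
WHERE column_a IS NULL
```

The source is referenced as q.source_a, in the same way as in a model, so the test does not need to know the underlying table path.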