16.4. Connecting and Editing Data Across Layers

Ability to connect data from different layers is one of the duties of a GIS software. Such a connection can be based on the spatial relationship between the features, or on their shared attributes. QGIS provides tools to handle any of these associations, such as:

  • Processing algorithms that can create a new layer as a result of the connection, namely Join attributes by location, Join attributes by nearest, Join attributes by field value, …

  • SQL queries to create a new layer from the DB Manager or as a virtual layer

  • Joins properties or relations settings that temporarily extend attributes of features in a given layer, with those of features in another layer based on some matching attribute(s).

    Joins and relations are technical concepts borrowed from databases to get the most out of data stored in tables by combining their contents. The idea is that features (rows) of different layers (tables) can be associated to each other. The number of rows which are matching each other can be of any value (zero, one, many).

16.4.1. Joining features between two layers

Joins in QGIS allow you to associate features of the current layer to features from another loaded vector layer. Whether they are spatially enabled and the type of geometry do not matter. The join is based on an attribute that is shared by the layers, in a one-to-one relationship.

To create a join on a layer (identified below as target layer):

  1. Go to the layer Properties ► join Joins tab

  2. Click the symbologyAdd Add new join button. The Add vector join dialog appears.

  3. Select the Join layer you want to connect with the target vector layer

  4. Specify the Join field (from the join layer) and the Target field (from the target layer). These are the fields that are used to find matching feature in both layers hence they should have values in common.

  5. Press OK and a summary of selected parameters is added to the Join panel.

../../../_images/join_attributes.png

Fig. 16.100 Join an attribute table to an existing vector layer

The steps above will create a join, where ALL the attributes of the first matching feature in the join layer is added to the target layer’s feature. The following logic is used to pair features during a join process:

  • All the features in the target layer are returned, regardless they have a match

  • If the target field contains duplicate values, these features are assigned the same feature from the join layer.

  • If the join field contains duplicate matching values, only the first fetched feature is picked.

Note

Joins in QGIS are based on a single field matching so most of the times, you would want to make sure that values in the matchable fields are unique.

QGIS provides some more options to tweak the join:

  • checkbox Cache join layer in virtual memory: allows you to cache values in memory (without geometries) from the joined layer in order to speed up lookups.

  • unchecked Create attribute index on the join field to speed up lookups

  • unchecked Dynamic form: helps to synchronize join fields on the fly, according to the Target field. This way, constraints for join fields are also correctly updated. Note that it’s deactivated by default because it may be very time consuming if you have a lot of features or a myriad of joins.

  • If the target layer is editable, then some icons will be displayed in the attribute table next to fields, in order to inform about their status:

    • joinNotEditable: the join layer is not configured to be editable. If you want to be able to edit join features from the target attribute table, then you have to check the option checkbox Editable join layer.

    • joinedLayerNotEditable: the join layer is well configured to be editable, but its current status is read only.

    • joinHasNotUpsertOnEdit: the join layer is editable, but synchronization mechanisms are not activated. If you want to automatically add a feature in the join layer when a feature is created in the target layer, then you have to check the option checkbox Upsert on edit. Symmetrically, the option checkbox Delete cascade may be activated if you want to automatically delete join features.

  • unchecked Joined fields: instead of adding all the fields from the joined layer, you can specify a subset.

  • unchecked Custom field name prefix for joined fields, in order to avoid name collision

16.4.2. Setting relations between multiple layers

Unlike joins that define a one-to-one link between features across two layers, relations help you build interconnections between multiple features across two or more layers. As such, relations are project level settings and are set in Project ► Properties ► relations Relations tab. From there, you can:

  • symbologyAdd Add relation whose type can be:

    Note

    There is no simple way yet to edit a non-polymorphic relation once it has been created. Only the name can be edited with a double-click. For any other parameters of such a relation you will have to remove and recreate it.

  • symbologyAdd Discover relations: QGIS is able to discover existing relations from supported database formats (PostgreSQL, GeoPackage, ESRI File Geodatabase, …). This can be a good way to ease the relations definition.

  • symbologyRemove Remove relation

../../../_images/project_relations.png

Fig. 16.101 Relations tab

16.4.2.1. One to many (1-N) relations

As an example you have a layer with all regions of Alaska (polygon) which provides some attributes about its name and region type and a unique id (which acts as primary key).

Then you get another point layer or table with information about airports that are located in the regions and you also want to keep track of these. If you want to add them to the regions layer, you need to create a one to many relation using foreign keys, because there are several airports in most regions.

../../../_images/regions_with_airports.png

Fig. 16.102 Alaska region with airports

Layers and keys

QGIS makes no difference between a table and a vector layer. Basically, a vector layer is a table with a geometry. So you can add your table as a vector layer. To demonstrate the 1-n relation, you can load the regions and airports layers in the sample dataset. In practice, each airport belongs to exactly one region while each region can have any number of airports (a typical one to many relation).

which has a foreign key field (fk_region) to the layer regions.

In addition to the attributes describing the airports, the aiports layer has another field fk_region which acts as a foreign key (if you have a database, you will probably want to define a constraint on it). This fk_region field will always contain an id of a region. It can be seen like a pointer to the region it belongs to.

All you have to do is to tell QGIS the relation between the layers so that you can design a custom edit form for editing and QGIS takes care of the setup. It works with different providers (so you can also use it with shape and csv files).

Defining 1-N relations

The first thing we are going to do is to let QGIS know about the relations between the layers. This is done in Project ► Properties…. Open the Relations tab and click on symbologyAdd Add Relation.

  • Name is going to be used as a title. It should be a human readable string describing what the relation is used for. We will just call say airport_relation in this case.

  • Referenced Layer (Parent) also considered as parent layer, is the one with the primary key, pointed to, so here it is the regions layer. You need to define the primary key of the referenced layer, so it is ID.

  • Referencing Layer (Child) also considered as child layer, is the one with the foreign key field on it. In our case, this is the airports layer. For this layer you need to add a referencing field which points to the other layer, so this is fk_region.

    Note

    Sometimes, you need more than a single field to uniquely identify features in a layer. Creating a relation with such a layer requires a composite key, i.e. more than a single pair of matching fields. Use the symbologyAdd Add new field pair as part of a composite foreign key button to add as many pairs as necessary.

  • Id will be used for internal purposes and has to be unique. You may need it to build custom forms. If you leave it empty, one will be generated for you but you can assign one yourself to get one that is easier to handle

  • Relationship strength sets the strength of the relation between the parent and the child layer. The default Association type means that the parent layer is simply linked to the child one while the Composition type allows you to duplicate also the child features when duplicating the parent ones and on deleting a feature the children are deleted as well, resulting in cascade over all levels (means children of children of… are deleted as well).

../../../_images/regions_airports_mapping.png

Fig. 16.103 Adding a relation between regions and airports layers

From the Relations tab, you can also press the symbologyAdd Discover Relation button to fetch the relations available from the providers of the loaded layers. This is possible for layers stored in data providers like PostgreSQL or SpatiaLite.

Forms for 1-N relations

Now that QGIS knows about the relation, it will be used to improve the forms it generates. As we did not change the default form method (autogenerated), it will just add a new widget in our form. So let’s select the layer region in the legend and use the identify tool. Depending on your settings, the form might open directly or you will have to choose to open it in the identification dialog under actions.

../../../_images/airport_relation_dataview.png

Fig. 16.104 Identification dialog regions with relation to airports

As you can see, the airports assigned to this particular region are all shown in a table. And there are also some buttons available. Let’s review them shortly:

  • The toggleEditing button is for toggling the edit mode. Be aware that it toggles the edit mode of the airport layer, although we are in the feature form of a feature from the region layer. But the table is representing features of the airport layer.

  • The saveEdits button is for saving all the edits in the child layer (airport).

  • The capturePoint button lets you digitize the airport geometry in the map canvas and assigns the new feature to the current region by default. Note that the icon will change according to the geometry type.

  • The newTableRow button adds a new record to the airport layer attribute table and assigns the new feature to the current region by default. The geometry can be drawn later with the Add part digitizing tool.

  • The duplicateFeature button allows you to copy and paste one or more child features within the child layer. They can later be assigned to a different parent feature or have their attributes modified.

  • The deleteSelectedFeatures button deletes the selected airport(s) permanently.

  • The link symbol opens a new dialog where you can select any existing airport which will then be assigned to the current region. This may be handy if you created the airport on the wrong region by accident.

  • The unlink symbol unlinks the selected airport(s) from the current region, leaving them unassigned (the foreign key is set to NULL) effectively.

  • With the zoomToSelected button you can zoom the map to the selected child features.

  • The two buttons formView and openTable to the right switch between the table view and form view of the related child features.

If you use the Drag and Drop Designer for the regions feature, you can select which tools are available. You can even decide whether to open a new form when a new feature is added using Force hide form on add feature option. Be aware that this option implies that not null attributes must take a valid default value to work correctly.

../../../_images/airport_relation_formproperties.png

Fig. 16.105 Drag and Drop Designer for configure regions-airports relation tools

In the above example the referencing layer has geometries (so it isn’t just an alphanumeric table) so the above steps will create an entry in the layer attribute table that has no corresponding geometric feature. To add the geometry:

  1. Choose openTable Open Attribute Table for the referencing layer.

  2. Select the record that has been added previously within the feature form of the referenced layer.

  3. Use the addPart Add Part digitizing tool to attach a geometry to the selected attributes table record.

If you work on the airport table, the widget Relation Reference is automatically set up for the fk_region field (the one used to create the relation), see Relation Reference widget.

In the airport form you will see the formView button at the right side of the fk_region field: if you click on the button the form of the region layer will be opened. This widget allows you to easily and quickly open the forms of the linked parent features.

../../../_images/airport_attributes.png

Fig. 16.106 Identification dialog airport with relation to regions

The Relation Reference widget has also an option to embed the form of the parent layer within the child one. It is available in the Properties ► Attributes Form menu of the airport layer: select the fk_region field and check the Show embedded form option.

If you look at the feature dialog now, you will see that the form of the region is embedded inside the airports form and will even have a combobox, which allows you to assign the current airport to another region.

../../../_images/airport_attributes_expanded.png

Moreover if you toggle the editing mode of the airport layer, the fk_region field has also an autocompleter function: while typing you will see all the values of the id field of the region layer. Here it is possible to digitize a polygon for the region layer using the symbologyAdd button if you chose the option Allow adding new features in the Properties ► Attributes Form menu of the airport layer.

The child layer can also be used in the Select Features By Value tool in order to select features of the parent layer based on attributes of their children.

In Fig. 16.107, all the regions where the mean altitude of the airports is greater than 500 meters above sea level are selected.

You will find that many different aggregation functions are available in the form.

../../../_images/relation_select_by_value.png

Fig. 16.107 Select parent features with child values

16.4.2.2. Many-to-many (N-M) relations

N-M relations are many-to-many relations between two tables. For instance, the airports and airlines layers: an airport receives several airline companies and an airline company flies to several airports.

This SQL code creates the three tables we need for an N-M relationship in a PostgreSQL/PostGIS schema named locations. You can run the code using the Database ► DB Manager… for PostGIS or external tools such as pgAdmin. The airports table stores the airports layer and the airline table stores the airlines layer. In both tables few fields are used for clarity. The tricky part is the airports_airlines table. We need it to list all airlines for all airports (or vice versa). This kind of table is known as a pivot table. The constraints in this table force that an airport can be associated with an airline only if both already exist in their layers.

CREATE SCHEMA locations;

CREATE TABLE locations.airports
(
   id serial NOT NULL,
   geom geometry(Point, 4326) NOT NULL,
   airport_name text NOT NULL,
   CONSTRAINT airports_pkey PRIMARY KEY (id)
);

CREATE INDEX airports_geom_idx ON locations.airports USING gist (geom);

CREATE TABLE locations.airlines
(
   id serial NOT NULL,
   geom geometry(Point, 4326) NOT NULL,
   airline_name text NOT NULL,
   CONSTRAINT airlines_pkey PRIMARY KEY (id)
);

CREATE INDEX airlines_geom_idx ON locations.airlines USING gist (geom);

CREATE TABLE locations.airports_airlines
(
   id serial NOT NULL,
   airport_fk integer NOT NULL,
   airline_fk integer NOT NULL,
   CONSTRAINT airports_airlines_pkey PRIMARY KEY (id),
   CONSTRAINT airports_airlines_airport_fk_fkey FOREIGN KEY (airport_fk)
      REFERENCES locations.airports (id)
      ON DELETE CASCADE
      ON UPDATE CASCADE
      DEFERRABLE INITIALLY DEFERRED,
   CONSTRAINT airports_airlines_airline_fk_fkey FOREIGN KEY (airline_fk)
      REFERENCES locations.airlines (id)
      ON DELETE CASCADE
      ON UPDATE CASCADE
      DEFERRABLE INITIALLY DEFERRED
 );

Instead of PostgreSQL you can also use GeoPackage. In this case, the three tables can be created manually using the Database ► DB Manager…. In GeoPackage there are no schemas so the locations prefix is not needed.

Foreign key constraints in airports_airlines table can´t be created using Table ► Create Table… or Table ► Edit Table… so they should be created using Database ► SQL Window…. GeoPackage doesn’t support ADD CONSTRAINT statements so the airports_airlines table should be created in two steps:

  1. Set up the table only with the id field using Table ► Create Table…

  2. Using Database ► SQL Window…, type and execute this SQL code:

    ALTER TABLE airports_airlines
       ADD COLUMN airport_fk INTEGER
       REFERENCES airports (id)
       ON DELETE CASCADE
       ON UPDATE CASCADE
       DEFERRABLE INITIALLY DEFERRED;
    
    ALTER TABLE airports_airlines
       ADD COLUMN airline_fk INTEGER
       REFERENCES airlines (id)
       ON DELETE CASCADE
       ON UPDATE CASCADE
       DEFERRABLE INITIALLY DEFERRED;
    

Then in QGIS, you should set up two one-to-many relations as explained above:

  • a relation between airlines table and the pivot table;

  • and a second one between airports table and the pivot table.

An easier way to do it (only for PostgreSQL) is using the Discover Relations in Project ► Properties ► Relations. QGIS will automatically read all relations in your database and you only have to select the two you need. Remember to load the three tables in the QGIS project first.

../../../_images/airports_airlines_relation.png

Fig. 16.108 Relations and autodiscover

In case you want to remove an airport or an airline, QGIS won’t remove the associated record(s) in airports_airlines table. This task will be made by the database if we specify the right constraints in the pivot table creation as in the current example.

Note

Combining N-M relation with automatic transaction group

You should enable the transaction mode in Project Properties ► Data Sources ► when working on such context. QGIS should be able to add or update row(s) in all tables (airlines, airports and the pivot tables).

Finally we have to select the right cardinality in the Layer Properties ► Attributes Form for the airports and airlines layers. For the first one we should choose the airlines (id) option and for the second one the airports (id) option.

../../../_images/airports_airlines_relation_formproperties.png

Fig. 16.109 Set relationship cardinality

Now you can associate an airport with an airline (or an airline with an airport) using Add child feature or Link existing child feature in the subforms. A record will automatically be inserted in the airports_airlines table.

../../../_images/add_airport_airline.png

Fig. 16.110 N-M relationship between airports and airlines

Note

Using Many to one relation cardinality

Sometimes hiding the pivot table in an N-M relationship is not desirable. Mainly because there are attributes in the relationship that can only have values when a relationship is established. If your tables have a geometry field, it could be interesting to activate the On map identification option (Layer Properties ► Attributes Form ► Available widgets ► Fields) for the foreign key fields in the pivot table.

Note

Pivot table primary key

Avoid using multiple fields in the primary key in a pivot table. QGIS assumes a single primary key so a constraint like constraint airports_airlines_pkey primary key (airport_fk, airline_fk) will not work.

16.4.2.3. Polymorphic relations

The purpose

Polymorphic relations are special case of 1-N relations, where a single referencing (document) layer contains the features for multiple referenced layers. This differs from normal relations which require different referencing layer for each referenced layer. A single referencing (document) layer is achieved by adding an adiditonal layer_field column in the referencing (document) layer that stores information to identify the referenced layer. In its most simple form, the referencing (document) layer will just insert the layer name of the referenced layer into this field.

To be more precise, a polymorphic relation is a set of normal relations having the same referencing layer but having the referenced layer dynamically defined. The polymorphic setting of the layer is solved by using an expression which has to match some properties of the referenced layer like the table name, layer id, layer name.

Imagine we are going to the park and want to take pictures of different species of plants and animals we see there. Each plant or animal has multiple pictures associated with it, so if we use the normal 1:N relations to store pictures, we would need two separate tables, animal_images and plant_images. This might not be a problem for 2 tables, but imagine if we want to take separate pictures for mushrooms, birds etc.

Polymorphic relations solve this problem as all the referencing features are stored in the same table documents. For each feature the referenced layer is stored in the referenced_layer field and the referenced feature id in the referenced_fk field.

Defining polymorphic relations

First, let QGIS know about the polymorphic relations between the layers. This is done in Project ► Properties…. Open the Relations tab and click on the little down arrow next to the symbologyAdd Add Relation button, so you can select the Add Polymorphic Relation option from the newly appeared dropdown.

../../../_images/polymorphic_relation_properties.png

Fig. 16.111 Adding a polymorphic relation using documents layer as referencing and animals and plants as referenced layers.

  • Id will be used for internal purposes and has to be unique. You may need it to build custom forms. If you leave it empty, one will be generated for you but you can assign one yourself to get one that is easier to handle

  • Referencing Layer (Child) also considered as child layer, is the one with the foreign key field on it. In our case, this is the documents layer. For this layer you need to add a referencing field which points to the other layer, so this is referenced_fk.

    Note

    Sometimes, you need more than a single field to uniquely identify features in a layer. Creating a relation with such a layer requires a composite key, i.e. more than a single pair of matching fields. Use the symbologyAdd Add new field pair as part of a composite foreign key button to add as many pairs as necessary.

  • Layer Field is the field in the referencing table that stores the result of the evaluated layer expression which is the referencing table that this feature belongs to. In our example, this would be the referenced_layer field.

  • Layer expression evaluates to a unique identifier of the layer. This can be the layer name @layer_name, the layer id @layer_id, the layer’s table name decode_uri(@layer, 'table') or anything that can uniquely identifies a layer.

  • Relationship strength sets the strength of the generated relations between the parent and the child layer. The default Association type means that the parent layer is simply linked to the child one while the Composition type allows you to duplicate also the child features when duplicating the parent ones and on deleting a feature the children are deleted as well, resulting in cascade over all levels (means children of children of… are deleted as well).

  • Referenced Layers also considered as parent layers, are those with the primary key, pointed to, so here they would be plants and animals layers. You need to define the primary key of the referenced layers from the dropdown, so it is fid. Note that the definition of a valid primary key requires all the referenced layers to have a field with that name. If there is no such field you cannot save a polymorphic relation.

Once added, the polymorphic relation can be edited via the Edit Polymorphic Relation menu entry.

../../../_images/polymorphic_relations.png

Fig. 16.112 Preview of the newly created polymorphic relation and its child relations for animals and plants.

The example above uses the following database schema:

CREATE SCHEMA park;

CREATE TABLE park.animals
(
   fid serial NOT NULL,
   geom geometry(Point, 4326) NOT NULL,
   animal_species text NOT NULL,
   CONSTRAINT animals_pkey PRIMARY KEY (fid)
);

CREATE INDEX animals_geom_idx ON park.animals USING gist (geom);

CREATE TABLE park.plants
(
   fid serial NOT NULL,
   geom geometry(Point, 4326) NOT NULL,
   plant_species text NOT NULL,
   CONSTRAINT plants_pkey PRIMARY KEY (fid)
);

CREATE INDEX plants_geom_idx ON park.plants USING gist (geom);

CREATE TABLE park.documents
(
   fid serial NOT NULL,
   referenced_layer text NOT NULL,
   referenced_fk integer NOT NULL,
   image_filename text NOT NULL,
   CONSTRAINT documents_pkey PRIMARY KEY (fid)
);