What is it about?
This is an Action Element for pipelines. It consumes a single input (a table in our platform) and uploads its data to a customer-provided PostgreSQL table. Each time the Action is executed, the entire contents of the in-platform table are dumped (inserted) into the customer-provided table.
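Conceptually, each run behaves as if every row of the in-platform table were inserted into the customer's table. The sketch below is illustrative only; the table and column names (public.customer_table, id, name, amount) are placeholders, not part of the product:

    -- Illustrative sketch only; table and column names are hypothetical.
    -- Each execution behaves as if every row of the in-platform table
    -- were inserted into the customer-provided table.
    INSERT INTO public.customer_table (id, name, amount)
    VALUES
      (1, 'first row',  10.50),
      (2, 'second row', 20.00);  -- ...and so on, one row per platform record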
What are the prerequisites to use this action?
The prerequisites to use this pipeline action are:
- The customer must have access to an already existing PostgreSQL instance of their own. Depending on where the instance is deployed, they will have access to:
- The "host" (typically an IP address, or a network name).
- The "port" (it is a number, typically 5432).
- A database name. The name is typically chosen by the customer's development team.
- A target table, with the desired structure and fields, must exist inside that instance; this is the table the (in-platform) data will be dumped into. Its columns are determined by the customer's requirements, from one of their tables. This typically involves:
- The table name and schema. Both are typically chosen by the customer's development team; the schema is usually "public" (the default, unless the customer deliberately chose another one).
- The fields. While these are also defined explicitly by the development team, client tools (e.g. DBeaver Community Edition) can be used to list the involved columns (given all the other connection data); a plain SQL alternative is sketched further below, after the dashboard screenshot.
- A user with permission to write to the target table(s) must exist in that server instance; the customer will provide this user's credentials to our platform. As a security recommendation, the user should only be granted write permissions on the target table (and/or any other tables meant to be populated by other usages of this action); see the sketch after this list.
- A workspace with a project, and a pipeline in that project. The pipeline must contain a preceding integration node (or, more generally, any node that produces an SQL table in our platform), and the PostgreSQL action must take one of those nodes as its only input. This is described in the next section.
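As a sketch of the permissions recommendation above, a dedicated user could be created and granted write access to the target table only. All names here (customer_db, uploader, public.customer_table) are placeholders chosen for illustration:

    -- Hypothetical names; adapt the database, user, schema and table to your setup.
    CREATE USER uploader WITH PASSWORD 'choose-a-strong-password';
    GRANT CONNECT ON DATABASE customer_db TO uploader;
    GRANT USAGE ON SCHEMA public TO uploader;
    -- Write access restricted to the single target table:
    GRANT INSERT ON public.customer_table TO uploader;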
While the development team will already know all of this data, the hosting service's dashboard (e.g. Amazon's) can also be used to obtain the host and port:
https://lh4.googleusercontent.com/6mDaVfMxsYG58FWK9yOpsMfv-u0vIAUVAwVsxdryKWQ33TjoHHOf-dxVn3m2dnAYGDhPscYp7ulKoafwHqH29QCCfmBKjYvJ6dAwnA6qt435UwYP4dPTfw3wJfZ7-ePLjLxpNJh4
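Alternatively, the target table's columns can be listed with plain SQL from any client (psql, DBeaver, etc.). The schema and table names below are placeholders and must be adjusted to the customer's setup:

    -- Assumes the target table is public.customer_table; adjust as needed.
    SELECT column_name, data_type, is_nullable, column_default
    FROM information_schema.columns
    WHERE table_schema = 'public'
      AND table_name   = 'customer_table'
    ORDER BY ordinal_position;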
Example layout and configuration
A minimal sample pipeline layout would look like this:
https://lh6.googleusercontent.com/bSPLNI3GdLd2ALZ1KFDABqPUu9RoY02X7NpL-0TXjzlCJo0izkyOtUF3VCROvytRTg0jD4XfaGCtwGnqcuyEsseSLskbpVT6d__yCqHLwXu4F6S5gzic7PaBIPsgapJj4mOcZoij
Components:
- A source integration, to draw data from.
- A middle node, which sub-selects or converts the input data into a new format or set of columns. The format must be chosen carefully, since it feeds the next step and must be a set of columns compatible with the upload (see the sketch after this list).
- The PostgreSQL node, which uploads the data it receives, in that format, from the middle node.
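As an illustration of what the middle node might do, the following query sub-selects and converts columns from a hypothetical source so they line up with the target table's fields. All names (source_data, full_name, amount) are placeholders:

    -- Hypothetical source and column names; the output columns must match
    -- what the target PostgreSQL table expects.
    SELECT
      id,
      full_name                       AS name,
      CAST(amount AS numeric(10, 2))  AS amount
    FROM source_data;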
By "compatible", the following is meant:
- The target dump table will have a number of fields. Some of them may be optional (they have a default value or are set automatically when absent), and some of them may be required (no default value or automatic setting when absent).
- The middle node's generated table will have a specific set of columns; to be compatible, these must cover all of the target table's required fields (optional fields may be omitted).
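The distinction can be illustrated with a hypothetical target table: amount is required (NOT NULL, no default), while created_at is optional because it has a default. A middle node whose output provides id, name and amount would therefore be compatible; one missing amount would not:

    -- Hypothetical target table, used only to illustrate required vs. optional fields.
    CREATE TABLE public.customer_table (
      id         integer        NOT NULL,      -- required: no default
      name       text           NOT NULL,      -- required: no default
      amount     numeric(10,2)  NOT NULL,      -- required: no default
      created_at timestamptz    DEFAULT now()  -- optional: filled in when absent
    );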