Declarative Transformations API #356

@arumsey arumsey commented May 16, 2024

Description

Introduce a higher-level Importer API that provides a declarative approach to transformations

  • offer a clearer API for developers to access that follows a consistent pattern
  • define a no-code configuration for most common transformation phases
  • allow for block-level transformation extensions
  • provide default metadata handling to avoid code duplication
  • open the door for easier transformation automation
  • separate block transformations into their own modules

Related Issue

Fixes: #355

Dependent on: adobe/helix-importer#351

Motivation and Context

The AEM Importer import scripts are a powerful tool for manipulating a DOM into an Edge Delivery compatible format. These scripts, however, need to be written by developers with a high degree of knowledge of DOM APIs. Adding a declarative transformation API on top of the existing low-level import script opens more doors for less technical users to begin importing content by simply defining a collection of CSS selectors. Additionally, a no-code approach to defining import rules allows them to be more easily created by automation and POSTed to a service to enable long-running imports.

This feature does not introduce any breaking changes to the AEM Importer. The import script, however, can now be a JSON file or a JS object. This JSON structure (the ruleset) can then be passed to a createImporter factory that generates a valid import script. The Transformer class consumes a ruleset and runs through a series of phases that clean up the DOM and then generate any desired blocks using CSS selector logic.
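
To make the ruleset-to-script idea concrete, here is a minimal sketch of such a factory. It is not the code from this PR: `createImporterSketch` is a hypothetical name, only the `root` and `cleanup.start` keys from the ruleset example are handled, and the block phase is left out entirely. The `transformDOM` entry point is the one the importer already looks for on an import script.

```javascript
// Hypothetical sketch, not the PR's implementation.
function createImporterSketch(rules) {
  const { root = 'main', cleanup = {} } = rules;
  return {
    // Entry point name the importer expects on an import script.
    transformDOM({ document }) {
      // Fall back to <body> when the root selector matches nothing (assumption).
      const main = document.querySelector(root) || document.body;
      // Cleanup phase: remove every element matching a "start" selector.
      (cleanup.start || []).forEach((selector) => {
        main.querySelectorAll(selector).forEach((el) => el.remove());
      });
      return main;
    },
  };
}
```

Because the factory only returns an object with `transformDOM`, the result can be used anywhere a hand-written import script is accepted.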

How Has This Been Tested?

Screenshots (if appropriate):

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • I have signed the Adobe Open Source CLA.
  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

- added `Transformer` class
- added import script that can consume transformation objects
- added support for import JSON object
- do not allow custom sections mapping script to bleed into other tabs
@arumsey arumsey marked this pull request as draft May 16, 2024 19:33
@@ -18,14 +18,13 @@ const initOptionFields = (parent) => {
   const optionFields = getOptionFields(parent);
   optionFields.forEach((field) => {
     const value = localStorage.getItem(`option-field-${field.id}`);
-    if (value !== null) {
+    if (!field.classList.contains('locked') && value !== null) {

Fix to not have the custom sections mapping script bleed into other tabs.

}
}

export default class Transformer {

The Transformer could be added to helix-importer instead of being bolted on here. This class is where all the logic for processing an import ruleset lives. There are a number of phases that every import can participate in. The blocks phase is the most involved, as there are a number of different parsing options that can be leveraged depending on the complexity of a site.

.replace(/[^a-z0-9/]/gm, '-');
};

const createImporter = (rules) => ({

A factory for creating compatible import scripts based on an import ruleset.


createImporter may be an overloaded term so I may try to come up with a more descriptive name.

Comment on lines 67 to 73
const isImportScript = Object.keys(mod.default).some((key) => key === 'transformDOM' || key === 'transform');
if (isImportScript) {
$this.projectTransform = mod.default;
} else {
// declarative transformation
$this.projectTransform = createImportScript(mod.default);
}

Detect the type of import script and use the createImporter factory if needed.
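
The detection step can be isolated into a small helper for illustration. This is a sketch under assumptions: `resolveProjectTransform` and `stubFactory` are hypothetical names, and the factory is passed in as an argument so the snippet runs on its own rather than depending on the PR's `createImportScript`.

```javascript
// Keys taken from the diff above; an object exposing either one is treated as a script.
const IMPORT_SCRIPT_KEYS = ['transformDOM', 'transform'];

// `createImportScript` is injected so the sketch stays self-contained.
function resolveProjectTransform(candidate, createImportScript) {
  const isImportScript = Object.keys(candidate)
    .some((key) => IMPORT_SCRIPT_KEYS.includes(key));
  // Hand-written scripts pass through; anything else is treated as a ruleset.
  return isImportScript ? candidate : createImportScript(candidate);
}

// Usage with a stand-in factory (hypothetical, for illustration only).
const stubFactory = (rules) => ({ transformDOM: () => null, rules });
const script = { transform: () => [] };
const ruleset = { root: 'main', blocks: [] };
console.log(resolveProjectTransform(script, stubFactory) === script); // true
console.log(resolveProjectTransform(ruleset, stubFactory).rules === ruleset); // true
```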

Comment on lines 82 to 89
const loadJson = (json) => {
try {
const importCfg = JSON.parse(json);
$this.projectTransform = createImportScript(importCfg);
} catch (err) {
console.error('Invalid transformation JSON');
}
};

An import script URL can now also be JSON for a no-code option.
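
One possible variation on the loader, shown only as a sketch: return the parse outcome to the caller and keep the parser's message, so the UI can tell the user where the JSON is broken. `parseRuleset` and its `{ rules }` / `{ error }` return shape are hypothetical and not part of this PR.

```javascript
// Returns { rules } on success or { error } on failure instead of only logging.
function parseRuleset(json) {
  try {
    const rules = JSON.parse(json);
    // A ruleset is expected to be a plain object, not an array or primitive (assumption).
    if (rules === null || typeof rules !== 'object' || Array.isArray(rules)) {
      return { error: 'Ruleset must be a JSON object' };
    }
    return { rules };
  } catch (err) {
    // Keep the parser message so the syntax error can be located.
    return { error: `Invalid transformation JSON: ${err.message}` };
  }
}
```

The caller can then decide whether to pass `rules` to the factory or surface `error` in the importer UI.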

@@ -0,0 +1,10 @@
/* global WebImporter */

export default function parse(element, { document, params: { cells } }) {

A generic block parser that is able to generate a Block table from a config object or two-dimensional array of cell rules.

// phase 3: block creation
blocks.forEach((blockCfg) => {
const { type, selectors = [], parse, target = 'replace', params = {} } = blockCfg;
const parserFn = parse || parsers[type] || parsers['block'];

Detect the type of parser that should be used.
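
The lookup order in that line can be exercised on its own. The sketch below uses a stand-in `parsers` registry with dummy functions; `resolveParser` is a hypothetical helper name, and the real parsers in this PR live in their own modules.

```javascript
// Stand-in registry with dummy parsers, for illustration only.
const parsers = {
  block: () => 'generic',
  metadata: () => 'metadata',
};

function resolveParser(blockCfg, registry) {
  const { type, parse } = blockCfg;
  // 1. inline parse function, 2. parser registered for the type, 3. generic fallback
  return parse || registry[type] || registry.block;
}

console.log(resolveParser({ type: 'metadata' }, parsers)()); // metadata
console.log(resolveParser({ type: 'columns' }, parsers)()); // generic
console.log(resolveParser({ type: 'columns', parse: () => 'custom' }, parsers)()); // custom
```

An inline `parse` function therefore always wins, which is what allows block-level extensions without registering a new parser type.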

// parse the element into block items
const items = parserFn.call(this, element, source);
// create the block
const block = WebImporter.Blocks.createBlock(document, {

A new function added to helix-importer. 👉 adobe/helix-importer#351

* @param element Root element to query from
* @param params Object of selector conditions
*/
static buildBlockConfig(element, params) {

Function to map an object of name/value pairs with selector logic into a valid block config.

* @param element
* @param cells
*/
static buildBlockCells(element, cells) {

Function to map a two-dimensional array of selectors into actual elements that should be added to a block.
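
As a rough illustration of that mapping, the sketch below resolves each selector in a two-dimensional array against the block's root element. It is an assumption-laden stand-in, not the PR's `buildBlockCells`: `buildCellsSketch` is a hypothetical name, only the first match per selector is used, and a selector that matches nothing becomes an empty string so the column positions stay aligned.

```javascript
// Hypothetical helper: rows of selectors in, rows of elements out.
function buildCellsSketch(element, cells) {
  return cells.map((row) => row.map((selector) => {
    const match = element.querySelector(selector);
    // Keep an empty cell when nothing matches (assumption) so columns line up.
    return match || '';
  }));
}
```

With the `overview` block from the ruleset example, the single row `["div:has(p > img)", ":scope > div:last-child"]` would produce one table row with two cells.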

arumsey commented May 16, 2024

Import Ruleset Example

{
  "root": "main",
  "cleanup": {
    "start": [
      ".cookie-status-message",
      ".breadcrumbs",
      ".messages",
      ".sidebar",
      "h1 + ul.instruments-menu",
      "h1"
    ]
  },
  "blocks": [
    {
      "type": "metadata",
      "target": "append",
      "params": {
        "metadata": {
          "keywords": "[name=\"keywords\"]",
          "Publication Date": "[property=\"og:article:published_time\"]",
          "category": [
            [":scope:has(.is-blog .post-list) .webinar-speaker-img", "Webinars"],
            [":scope:has(.is-blog .post-list)", ""]
          ],
          "series": [
            [":has(.is-blog .post-list) .webinar-speaker-img", ""]
          ],
          "eventDate": [
            [":has(.is-blog .post-list) .webinar-speaker-img", ""]
          ],
          "speakers": [
            [":has(.is-blog .post-list) .webinar-speaker-img", ".webinar-speaker-img + strong"]
          ]
        }
      }
    },
    {
      "type": "overview",
      "selectors": [
        ".entry div:has(div > p > img)",
        ".entry > div > div:first-of-type:has(div > img)"
      ],
      "params": {
        "cells": [
          ["div:has(p > img)", ":scope > div:last-child"]
        ]
      }
    },
    {
      "type": "columns",
      "selectors": [
        ".entry > .about-content",
        ".desc-img-wrapper:has(> :nth-child(2):last-child)"
      ]
    }
  ]
}
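
Since rulesets may be written by hand or generated by automation, a light shape check before running an import could catch simple mistakes. The sketch below is inferred only from the example above: `checkRuleset` is a hypothetical helper, and the rules it enforces (string `root`, array `cleanup.start`, string `type`, array `selectors`) are assumptions rather than a documented schema.

```javascript
// Returns a list of human-readable problems; an empty list means the shape looks fine.
function checkRuleset(rules) {
  const problems = [];
  if (rules.root !== undefined && typeof rules.root !== 'string') {
    problems.push('root must be a CSS selector string');
  }
  const start = rules.cleanup && rules.cleanup.start;
  if (start !== undefined && !Array.isArray(start)) {
    problems.push('cleanup.start must be an array of selectors');
  }
  const blocks = Array.isArray(rules.blocks) ? rules.blocks : [];
  blocks.forEach((block, i) => {
    if (typeof block.type !== 'string') problems.push(`blocks[${i}].type is missing`);
    // Selectors are optional: the metadata block in the example has none.
    if (block.selectors !== undefined && !Array.isArray(block.selectors)) {
      problems.push(`blocks[${i}].selectors must be an array`);
    }
  });
  return problems;
}
```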

@arumsey arumsey marked this pull request as ready for review May 21, 2024 14:06
- add defaults to metadata parser
- refactor to `TransformFactory`
- add comments to `TransformFactory`
- add declarative transformation docs
- change `target` prop for blocks to `insertMode`
@arumsey arumsey closed this May 23, 2024
@arumsey arumsey deleted the transformation-object branch May 23, 2024 16:37