Incremental Font Transfer: Patch Subset #849

Closed
1 task done
svgeesus opened this issue May 26, 2023 · 20 comments

Comments

@svgeesus

Hello, TAG!

I'm requesting a TAG review of Incremental Font Transfer: Patch Subset.

Incremental transfer allows clients to load only the portions of the font they actually need, which speeds up font loads and reduces data transfer needed to load the fonts.

Further details:

  • I have reviewed the TAG's Web Platform Design Principles
  • Relevant time constraints or deadlines: no rush, hope to move to CR within 6 months
  • The group where the work on this specification is currently being done: W3C WebFonts WG
  • Major unresolved issues with or opposition to this specification: open issues
  • This work is being funded by:

You should also know that...

The Range Request method has more open issues and has been split off into a separate specification. It is not part of this TAG review request.

We are tracking Early wide review of IFT

We'd prefer the TAG provide feedback as (please delete all but the desired option):

🐛 open issues in our GitHub repo for each point of feedback

@mnot
Member

mnot commented May 31, 2023

For the TAG's information -- there was a fair amount of negative feedback from HTTP folks on this specification, and at least some (including me) don't consider it as addressed by the current specification. See e.g. w3c/IFT#119.

I do think it's possible to address these concerns, but it would require much more consultation with the HTTP WG.

@garretrieger

garretrieger commented May 31, 2023

As far as I'm aware all of the remaining open issues from the HTTP review are against the "Range Request" variant of incremental font transfer (list of non-range request tagged issues). Range request is still under development so it is not part of this review request.

Re: the caching issue you linked, I've reopened it with a couple of additional things I think we should add based on the discussion so far. If there are additional changes you think we should make, please let me know on the issue.

@torgo torgo added Topic: fonts Related to fonts on the web, including web fonts and system fonts Venue: Web Fonts WG W3C Web Fonts Working Group and removed Progress: untriaged labels Jun 1, 2023
@torgo torgo added this to the 2023-06-12-week milestone Jun 1, 2023
@svgeesus
Author

svgeesus commented Jun 5, 2023

@mnot wrote:

there was a fair amount of negative feedback from HTTP folks on this specification

Did you see that there is an explainer and that, in the Detailed Design Discussion section, two of the three sub-sections relate to the HTTP WG feedback?

(The rest, as @garretrieger mentioned, relates to the Range Request specification, which is not part of this TAG review)

@hober
Contributor

hober commented Aug 4, 2023

Hi @svgeesus,

In the Stakeholder Feedback / Opposition section of the explainer, it says "WebKit: Positive". With my TAG hat off and my WebKitten hat on, I don't believe this to be accurate. It would be more accurate to say that WebKit is negative on this proposal (the patch subset). (We expect that, if it becomes widely deployed, we may find ourselves in the regrettable position of having to implement it, but this is not the same thing as supporting the proposal.) @litherum can provide further clarification if necessary.

@svgeesus
Author

svgeesus commented Aug 4, 2023

Hi @hober thanks for the clarification of the WebKit position. @LeaVerou asked me to edit the explainer to link to published positions rather than rely on telcon and github discussions, which I am happy to do. Is there a link to the WebKit position? I will ask Firefox and Chromium for their official positions too, if they have published them.

I would welcome further clarification from @litherum because it has been a while since they attended a Fonts WG call. My understanding was that they argued that range request was needed because it can use a regular HTTP server rather than a special purpose one; but also that they accepted that the performance of range request was significantly worse on fast networks and terrible (10x worse than a static subsetted font, on 2G) on slow ones, and also broke HTTP caching as the HTTP WG pointed out.

In other words my understanding was that @litherum regretted the need for a specialized server, but understood that especially for the CJK market which currently has near-zero webfont deployment, patch subset was the only viable solution.

Meanwhile please pause this TAG review while we clarify stakeholder interest.

@hober
Contributor

hober commented Aug 4, 2023

@LeaVerou asked me to edit the explainer to link to published positions rather than rely on telcon and github discussions, which I am happy to do.

That’s a great idea! I wish everybody did this. :)

Is there a link to the WebKit position?

I don’t see one in our standards-positions repo. Please request one!

I would welcome further clarification from @litherum because it has been a while since they attended a Fonts WG call.

I suspect that’s because we didn’t rejoin the WG when it most recently rechartered.

@vlevantovsky

vlevantovsky commented Aug 5, 2023 via email

@svgeesus
Author

svgeesus commented Aug 5, 2023

I suspect that’s because we didn’t rejoin the WG when it most recently rechartered.

No, because when we rechartered

Current participants are not required to rejoin this group because the
charter includes no new deliverables that require W3C Patent Policy
licensing commitments.

@vlevantovsky

(Also, it would be good if someone could reformat Vlad's post above - it looks like GitHub didn't format the email response properly. I don't have edit powers in this repository.)

Edited.

@mnot
Member

mnot commented Aug 14, 2023

Did you see that there is an explainer and that, in the Detailed Design Discussion section, two of the three sub-sections relate to the HTTP WG feedback?

Yes, I saw that some changes were made, and appreciate the effort. However, from an HTTP perspective this design is still not ready for standardization -- while it meets the needs of its proponents, its use of HTTP doesn't take into account all aspects of the protocol, and I don't believe it will see good adoption, particularly by CDNs and other parties which would need to make substantial changes to their infrastructure to accommodate it.

If I were still on the TAG, here are the questions I'd be asking:

  • The explainer says 'Changes to the Open Font Format or OpenType specifications are out of scope.' Why? In particular, has anyone investigated whether doing so could address the issues with rendering subsets?
  • The proposal defines what amounts to a new HTTP extension that's specific to Web fonts. Has it undergone sufficient review by the relevant communities, and is it likely to be deployed?
  • Could existing protocol mechanisms have been used without the need for a new HTTP extension?
  • Is this extension likely to see reasonable adoption across the Web?
  • If new functionality is genuinely necessary, has it been designed in such a way as to allow generic use, so that other use cases can benefit -- thereby increasing deployment incentives?

@garretrieger

So far in this issue, discussion of the rationale for using patch subset has centered primarily on CJK, but it’s important to note that IFT is extremely beneficial to many other font use cases. Here are a few others that I consider to be pretty important:

  • Emoji and icon fonts. Similar to CJK, these feature large numbers of codepoints where particular usages will only need a very small subset. For emoji fonts, segmenting them into independent subsets is difficult due to the extensive use of glyph substitution based on codepoint sequences (e.g. for skin tones).
  • Variable fonts. Multi-axis variable fonts in particular can be prohibitively large. Typical usages will only need a small number of points in the font's full design space. IFT via patch subset can incrementally transfer variable font axis data (in addition to glyph data), allowing the client to download only what’s actually needed while still being able to extend it later. This is even more important when combined with fonts that also have large codepoint coverage (e.g. CJK, emoji, icon) due to the multiplicative effect of the variation data.
  • Multi-script font families. Most font families cover many scripts, so they are typically too large to deliver in their original form and need to be split into separate subsets, one per script, selected via unicode-range. However, this approach runs into issues when codepoints are shared between scripts (common for combining codepoints): the browser can pick the wrong subset for a codepoint that exists in more than one subset, which leads to incorrect rendering of text. This is a very common problem that we deal with constantly on the Google Fonts service (e.g. "Extended Latin and Viet subsets missing many characters", googlefonts/nam-files#6; "Macron position not correct in Kanit Regular 400 weight only", google/fonts#6542; "Inter font has problem with Vietnamese character", google/fonts#3579; "Bug in Josefin Sans font: Accent marks are out of position in Vietnamese text", google/fonts#6245; "Open Sans: combining acute accent does not render correctly", google/fonts#2392). Unfortunately, without something like IFT there isn’t a way to solve these issues without significantly increasing the number of font bytes we deliver to users.
  • Future looking: the font format is being extended to allow inclusion of more than 64k glyphs. This is needed for the effective use of pan-Unicode font families like the Noto families. IFT will be required to deliver these efficiently. Pan-Unicode fonts are important in that they enable rendering support for all scripts and languages in Unicode.
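
To make the multi-script point above concrete, here is a minimal sketch of how one might detect codepoints that fall into more than one unicode-range subset. The subset names and ranges are hypothetical and chosen only for illustration; they are not the ranges any real font service uses.

```python
from collections import defaultdict

# Hypothetical per-script subsets, each a list of inclusive codepoint ranges
# in the spirit of CSS unicode-range. Illustrative values only.
SUBSETS = {
    "latin": [(0x0020, 0x007E), (0x00A0, 0x00FF), (0x0300, 0x0304)],
    "vietnamese": [(0x0102, 0x0103), (0x0300, 0x0309), (0x1EA0, 0x1EF9)],
}


def shared_codepoints(subsets):
    """Return {codepoint: sorted subset names} for codepoints in 2+ subsets."""
    owners = defaultdict(set)
    for name, ranges in subsets.items():
        for start, end in ranges:
            for cp in range(start, end + 1):
                owners[cp].add(name)
    return {cp: sorted(names) for cp, names in owners.items() if len(names) > 1}


# Combining marks U+0300..U+0304 are claimed by both subsets, so a client
# selecting by unicode-range alone cannot tell which subset's glyph to use.
overlap = shared_codepoints(SUBSETS)
for cp, names in sorted(overlap.items()):
    print(f"U+{cp:04X} appears in: {', '.join(names)}")
```

Any codepoint reported here is one where the per-script split is ambiguous, which is the situation the linked Google Fonts issues describe.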

Given these issues, the assertion that the current state of font loading is acceptable is not true. If you look at the Web Almanac’s section on language availability in webfonts, you’ll see that scripts other than Latin, Cyrillic, and Greek are significantly underrepresented. To quote:

“Sadly, other writing systems are much less prevalent. For example, Han (Chinese) is the 2nd most used writing system in the world (after Latin), but only supported by 0.2% of web fonts. Arabic is the third most used writing system, but again, only supported by 0.4% of web fonts. The reason that some of these writing systems are not used as web fonts is that they are very large due to the sheer number of glyphs they have to support, and the difficulty in subsetting them correctly.”

While range request and the newer IFTB proposal will work well for CJK, emoji, and icon fonts, they aren’t as viable for the other cases I mentioned (multi-script families, variable fonts, and pan-Unicode fonts). For example, they won’t work well for Arabic font families (specifically called out in the above quote) due to the extremely complex nature of the fonts.

Patch subset is the only currently existing proposal that enables efficient loading for pretty much all of the problematic font loading cases.

Another thing to note is that long term we are planning on having both patch subset and range request/IFTB be standardized, the rationale being:

  • As Myles noted: while patch subset is extremely efficient, adoption by less sophisticated font hosters may be more challenging. In these cases, using range request/IFTB once available will be an improvement over the current state and is better than not adopting any form of IFT.
  • Where font hosters are willing to go through the extra effort to adopt patch subset, it will make significant improvements to font loading for their users and will solve use cases that can’t be solved by range request/IFTB. Note: a high quality open source implementation of patch subset is already available, and we plan to make plugins available for popular open source HTTP servers.
  • Given that a significant amount of font usage on the web is through large font hosters such as Google Fonts: adoption of patch subset by those services will significantly improve font loading performance and the font rendering experience for a huge number of users.

To answer Mark’s questions:

  • The explainer says 'Changes to the Open Font Format or OpenType specifications are out of scope.' Why? In particular, has anyone investigated whether doing so could address the issues with rendering subsets?

We have recently started investigating a potential replacement for the range request proposal called “binned incremental font transfer”. This involves changes to the font format. While it’s an improvement over the range request proposal it will still not be able to match the performance of patch subset and will struggle with use in cases outside of CJK, Emoji, and Icon fonts. Due to the complex nature of fonts a smart server is pretty much necessary to efficiently transfer all classes of fonts. We do have the ability to change the font format if needed, but the problem isn’t the format but the nature of the fonts themselves.

  • The proposal defines what amounts to a new HTTP extension that's specific to Web fonts. Has it undergone sufficient review by the relevant communities, and is it likely to be deployed?

We have invited experts from the fonts community that participate in the web fonts working group in addition to representation from font hosting providers (Google Fonts and Adobe).

  • Could existing protocol mechanisms have been used without the need for a new HTTP extension?

We’ve recently updated the patch subset specification to utilize the more general purpose compression dictionary transport proposal to provide the patching functionality. Beyond that the only other extension proposed by patch subset is the introduction of a new header “font-patch-request” which is necessarily specific to the web font space. In #119 I’m currently investigating the potential to place the patch request message into a range request header instead.
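
As a rough illustration of the dictionary-based patching idea mentioned above, the sketch below compresses a newer subset using the older subset as a preset dictionary. It uses Python's zlib preset-dictionary support as a stand-in; it is not the compression dictionary transport proposal and not the IFT wire format, and the byte strings are random placeholders rather than font data.

```python
import random
import zlib


def make_patch(old: bytes, new: bytes) -> bytes:
    """Compress `new`, using `old` as a preset dictionary."""
    # Note: DEFLATE only looks back 32 KiB, so only the tail of a large
    # dictionary would help. The demo data below is well under that limit.
    comp = zlib.compressobj(level=9, zdict=old)
    return comp.compress(new) + comp.flush()


def apply_patch(old: bytes, patch: bytes) -> bytes:
    """Rebuild `new` from the patch plus the dictionary the client already has."""
    decomp = zlib.decompressobj(zdict=old)
    return decomp.decompress(patch) + decomp.flush()


# Placeholder data: the "new subset" is the old one plus 256 extra bytes.
rng = random.Random(0)
old_subset = bytes(rng.getrandbits(8) for _ in range(4096))
new_subset = old_subset + bytes(rng.getrandbits(8) for _ in range(256))

patch = make_patch(old_subset, new_subset)
standalone = zlib.compress(new_subset, 9)

print(len(patch), "bytes as a patch vs", len(standalone), "bytes standalone")
```

Because the client already holds the old subset, only the bytes that differ need to cross the network, which is the general-purpose part of the design; the font-specific part is deciding what the new subset should contain.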

  • Is this extension likely to see reasonable adoption across the Web?

Google Fonts, which is the largest font hosting provider on the web and as such sees significant use across the web, is planning to adopt incremental font transfer. I can’t speak for the plans of other font providers, but I suspect they run into issues similar to the ones I described above, which IFT can help solve, particularly services hosting (or planning to host) CJK fonts. Having this standardized and available in browsers should provide pretty good motivation for adoption by font hosters.

As an example, Google Fonts was the first large-scale adopter of variable fonts and as a result has significantly increased variable font usage on the web.

  • If new functionality is genuinely necessary, has it been designed in such a way as to allow generic use, so that other use cases can benefit -- thereby increasing deployment incentives?

Hopefully in my comments above I’ve provided sufficient motivation for why this technology is necessary to unlock web font usage for currently underrepresented writing systems.

The problem we’re solving is pretty specific to web fonts so the solution is specific to the space. HTTP range-request solves the more general problem of partially loading resources, but isn’t sufficient for web fonts. As noted above we are using an existing general purpose patching mechanism and only specializing where needed: in the message which describes the partial font subset.

Yes, I saw that some changes were made, and appreciate the effort. However, from an HTTP perspective this design is still not ready for standardization -- while it meets the needs of its proponents, its use of HTTP doesn't take into account all aspects of the protocol, and I don't believe it will see good adoption, particularly by CDNs and other parties which would need to make substantial changes to their infrastructure to accommodate it.

We definitely appreciate your feedback; so far it has resulted in changes to the specification for the better. I’d definitely like to keep iterating to address any remaining concerns that you have.

@mnot
Member

mnot commented Aug 18, 2023

Due to the complex nature of fonts a smart server is pretty much necessary to efficiently transfer all classes of fonts.

That causes me concern, because HTTP systems scale well when the server doesn't need to be particularly smart. If we need to add complex processing to the server, it's best to make it as generic as possible so that it can be reused, enhancing the incentive to implement it (especially by actors like CDNs).

We have invited experts from the fonts community that participate in the web fonts working group in addition to representation from font hosting providers (Google Fonts and Adobe).

Understood. I think you need engagement from people who serve HTTP at scale, not just font folks. It may be good to have a more detailed discussion about the design details here in the HTTP WG or the HTTP Workshop, to give it visibility in those communities. I'm happy to help facilitate that if there's interest.

Having this standardized and available in browsers should provide pretty good motivation for adoption by font hosters.

Perhaps. What comes to mind here is Apple's experience with HLS for live streaming. When they wanted to do low latency (LL-HLS), they specified use of HTTP Server Push because that's what they thought reasonable, and because it was supportable on their server, their clients, and those they consulted. However, when they tried to get adoption by CDNs and other parties, there was strong pushback, because Server Push isn't widely supported, has some definitional issues and gaps for use in that case with intermediaries, and generally wasn't useful for CDNs to implement except for this special case. So, after considerable consultation, Apple changed LL-HLS so that it was more compatible with that infrastructure.

I'm not necessarily saying that CDNs won't implement IFT as specified -- their various product teams would need to be looped in to make that call. However, this does feel very similar, from my perspective, and on the surface, the incentive to implement efficient streaming video is much stronger.

The problem we’re solving is pretty specific to web fonts so the solution is specific to the space. HTTP range-request solves the more general problem of partially loading resources, but isn’t sufficient for web fonts. As noted above we are using an existing general purpose patching mechanism and only specializing where needed: in the message which describes the partial font subset.

Keep in mind that range requests themselves were originally for a very specific purpose: browsing PDFs without downloading the whole file.

I see some commonality between what you're doing and what the Braid folks are attempting; have you talked to them? HTTP API folks might also have some complementary use cases; see the HTTP API WG, for example.

@garretrieger

Due to the complex nature of fonts a smart server is pretty much necessary to efficiently transfer all classes of fonts.

That causes me concern, because HTTP systems scale well when the server doesn't need to be particularly smart. If we need to add complex processing to the server, it's best to make it as generic as possible so that it can be reused, enhancing the incentive to implement it (especially by actors like CDNs).

We have invited experts from the fonts community that participate in the web fonts working group in addition to representation from font hosting providers (Google Fonts and Adobe).

Understood. I think you need engagement from people who serve HTTP at scale, not just font folks. It may be good to have a more detailed discussion about the design details here in the HTTP WG or the HTTP Workshop, to give it visibility in those communities. I'm happy to help facilitate that if there's interest.

Yes, I'm definitely interested in engaging with those HTTP groups. If you're able to help start those discussions, that would be really helpful.

Having this standardized and available in browsers should provide pretty good motivation for adoption by font hosters.

Perhaps. What comes to mind here is Apple's experience with HLS for live streaming. When they wanted to do low latency (LL-HLS), they specified use of HTTP Server Push because that's what they thought reasonable, and because it was supportable on their server, their clients, and those they consulted. However, when they tried to get adoption by CDNs and other parties, there was strong pushback, because Server Push isn't widely supported, has some definitional issues and gaps for use in that case with intermediaries, and generally wasn't useful for CDNs to implement except for this special case. So, after considerable consultation, Apple changed LL-HLS so that it was more compatible with that infrastructure.

I'm not necessarily saying that CDNs won't implement IFT as specified -- their various product teams would need to be looped in to make that call. However, this does feel very similar, from my perspective, and on the surface, the incentive to implement efficient streaming video is much stronger.

Another avenue for adoption we’ve been thinking about would be to use the edge compute that some CDNs have available (for example Cloudflare Workers). The IFT protocol is stateless, so it should be a pretty good fit for that model. If there were an open source implementation of a worker, it would be pretty easy for CDN users to plug it in to provide IFT functionality for existing font assets.
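
A minimal sketch of what "stateless" means here, under heavy simplification: each request carries both what the client already has and what it needs, so the handler is a pure function of the request and the original font. The `PatchRequest` type, the `handle` function, and the toy font table are all hypothetical; a real patch subset server operates on actual font binaries and returns a binary patch.

```python
from dataclasses import dataclass

# Toy stand-in for a font: codepoint -> glyph data. Not a real font format.
FONT = {cp: f"glyph-{cp:04X}".encode() for cp in range(0x4E00, 0x4E20)}


@dataclass(frozen=True)
class PatchRequest:
    have: frozenset  # codepoints already present in the client's subset
    need: frozenset  # codepoints required to render the new content


def handle(request: PatchRequest, font=FONT) -> dict:
    """Pure function of (request, font): no per-client state is stored."""
    missing = font.keys() & (request.need - request.have)
    return {cp: font[cp] for cp in sorted(missing)}


# The client, not the server, remembers what it has between requests.
client = {}
first = handle(PatchRequest(frozenset(client), frozenset({0x4E00, 0x4E01})))
client.update(first)
second = handle(PatchRequest(frozenset(client), frozenset({0x4E01, 0x4E02})))
client.update(second)

print(sorted(f"U+{cp:04X}" for cp in client))
```

Since nothing is kept on the server between calls, any instance (including an edge worker) can answer any request, which is what makes the edge compute model plausible.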

The problem we’re solving is pretty specific to web fonts so the solution is specific to the space. HTTP range-request solves the more general problem of partially loading resources, but isn’t sufficient for web fonts. As noted above we are using an existing general purpose patching mechanism and only specializing where needed: in the message which describes the partial font subset.

Keep in mind that range requests themselves were originally for a very specific purpose: browsing PDFs without downloading the whole file.

I see some commonality between what you're doing and what the Braid folks are attempting; have you talked to them? HTTP API folks might also have some complementary use cases; see the HTTP API WG, for example.

I do think it might still be possible to use range requests with a custom range unit, but I do have some concerns that I’ve mentioned in w3c/IFT#119 (comment). Primarily it looks like we’d have to be non-compliant with the range request specification in how we populate the “Content-Range” header in the response (maybe that’s fine w/ a custom range unit, definitely worth some more investigation).
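
For illustration only, this is roughly what a Range header value with a custom unit could look like if the needed codepoints were expressed as ranges. The unit name `codepoints` and the helper functions are hypothetical; nothing here is defined by the IFT drafts, and the sketch does not address the open question about how Content-Range would be populated in the response.

```python
def to_ranges(codepoints):
    """Coalesce codepoints into sorted, inclusive (first, last) ranges."""
    ranges = []
    for cp in sorted(set(codepoints)):
        if ranges and cp == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], cp)  # extend the current run
        else:
            ranges.append((cp, cp))  # start a new run
    return ranges


def format_range(unit, ranges):
    """Build a header value of the general shape unit=first-last,first-last."""
    return unit + "=" + ",".join(f"{first}-{last}" for first, last in ranges)


def parse_range(value):
    """Inverse of format_range, for the same simplified syntax."""
    unit, _, range_set = value.partition("=")
    parsed = []
    for spec in range_set.split(","):
        first, _, last = spec.partition("-")
        parsed.append((int(first), int(last)))
    return unit, parsed


# "codepoints" is a made-up range unit used only for this sketch.
needed = [0x4E00, 0x4E01, 0x4E02, 0x4E2D]
header_value = format_range("codepoints", to_ranges(needed))
print("Range:", header_value)
```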

Braid looks pretty interesting but I don’t think it's a good fit with what we’re doing. Compression dictionary transport is already a good fit for handling the delta encoding portion.

@torgo torgo modified the milestones: 2023-08-28-week, 2023-09-04-week Sep 3, 2023
@vlevantovsky

Due to the complex nature of fonts a smart server is pretty much necessary to efficiently transfer all classes of fonts.

That causes me concern, because HTTP systems scale well when the server doesn't need to be particularly smart. If we need to add complex processing to the server, it's best to make it as generic as possible so that it can be reused, enhancing the incentive to implement it (especially by actors like CDNs).

While I am not arguing against any of the benefits of making server side processing as generic as possible for it to be reused, and offer additional incentives for it to be implemented, I'd like to bring an additional consideration as part of this discussion - usability.
Unicode-range subsetting has been one of the readily available font serving approaches, one that is capable of producing reduced size font files that are very cacheable and CDN friendly. This, however, still didn't noticeably improve user experience, and didn't provide enough incentives for authors to adopt for content using primarily CJK fonts. Our prior experiences with font serving that influenced the development of WOFF2, and the results of the user research summarized in PFE evaluation document clearly point to the fact that smart, font-specific approaches can yield significant benefits to users and authors alike, and should not be discarded for the sake of generalization if they benefit users in particular: consider users over authors over implementors over specifiers over theoretical purity.
(And, as a side note: WOFF2, initially conceived and born as a very font-specific compression technology, brought Brotli to the HTTP and the Web.)

@mnot
Member

mnot commented Sep 11, 2023

I understand and appreciate the priority of constituencies; it inspired me (in part) to write this. However, to be blunt, if the implementers don't implement, it doesn't do much good to specify.

@vlevantovsky

I understand and appreciate the priority of constituencies; it inspired me (in part) to write this. However, to be blunt, if the implementers don't implement, it doesn't do much good to specify.

... which almost verbatim repeats the arguments voiced back in the early days of web fonts against WOFF2. With hindsight being 20/20 - look where we are today! :)

@plinss plinss modified the milestones: 2023-09-04-week, 2023-09-25-week Sep 25, 2023
@torgo torgo added the Progress: blocked on dependency Paused while some other design review finishes up. label Sep 25, 2023
@torgo
Member

torgo commented Sep 25, 2023

Hi - I've marked this as blocked since @svgeesus asked us to "pause." Suggest we hold discussion until the stakeholder interest is clarified... Thanks!

@plinss plinss removed this from the 2024-01-23-f2f-London milestone Mar 11, 2024
@torgo torgo added this to the 2024-06-17-week:d milestone Jun 16, 2024
@plinss plinss removed this from the 2024-06-17-week:d milestone Jun 24, 2024
@torgo torgo added this to the 2024-07-01-week:c milestone Jun 30, 2024
@svgeesus
Author

Hi TAG!

WebFonts WG published a new draft and will do a second round of horizontal review. Then we will ask for TAG review again, once that is done. So I am closing this issue (the changes in the new draft, addressing review comments to date, are substantial enough that this is a whole new spec, and thus a whole new review is merited).

9 participants