Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add draft for VM time synchronisation decisions #577

Open
wants to merge 7 commits into
base: main
Choose a base branch
from
Open

Conversation

kgube
Copy link
Contributor

@kgube kgube commented Apr 24, 2024

@kgube kgube marked this pull request as draft April 24, 2024 11:11
@kgube kgube changed the title Add draft for VM clock synchronisation recommendations Add draft for VM time synchronisation decisions Aug 21, 2024
@kgube kgube marked this pull request as ready for review August 21, 2024 08:18
@scoopex
Copy link
Contributor

scoopex commented Aug 21, 2024

Had a discussion with @kgube about the motivation , goals and the contents of this DR.

Input/framework conditions that may be useful:
(needs to be evaluated)

  • Its good to add reference to software-systems which might be used by cusomers that use shared quorum algorithms and reference to the relevance of good system time
    (Zookeper, RabbitMQ, ETCD, Consul, Hazelcast, Ceph)
  • Its good to add a reference that using public (internet) NTP servers with the same S-NAT IP might lead to ratelimit situations if dozens of systems in a project are using the same ntp servers because the the NTP servers might see the same IP with dozens of NTP sessions
  • SCS environments itself should be operated with at least 3 central and CSP-local NTP sources (for Ceph, RabbitMQ, ...)
  • Whether overcommit or that a VM is not “scheduled” plays a role for the quality of the time synchronization with the virtualization used must not matter to the user
  • The CSP offers at least three local and not rate limited NTP servers that have at least 5 statically defined upstream stratum servers or local time sources with high quality
  • We can define a minimum quality that is based on the requirements of common systems and provides some reserve to keep popular systems running without problems (offset, jitter, frequency drift, ...)
  • The CSP ensures that a time with a minimum quality can be maintained in VMs with a reference setup
    • defined chrony setup/configuration that uses the min. 3 CSP NTP servers
    • this should be possible with all flavors (in some virtualization technologies the size of the virtual machine has impact to the scheduling of it and related to that to its time sychronization
    • the health check service activates several VMs with a single defined flavor distributed across the CSP landscape (e.g. 3) that run permanently and checks their quality to evaluate the compliance
  • Subordinate, but exciting would be a idea how to provide the flavor images with a standardized setup by default which can be used independent from the CSP (e.g. by using a standardized setup mechanism, or standardized references to the servers)


* OpenStack currently has no support for providing NTP servers to VMs under a fixed link-local address, like AWS and GCP are doing.
Such a feature could probably be implemented by re-using the metadata service IP (`169.254.169.254`) and the mechanisms to inject it into subnets.
If this feature becomes available in the future, it will be an attractive target for standardization, but until then it seems more sensible to focus on servers available through provider networks.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was afaik encouraged during the discussion to actually work on this upstream.

This will not be implemented upstream by just waiting for it.

At least we should mention this in the decision record imho.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I forgot to reply. I have since updated the DR accordingly.

@kgube
Copy link
Contributor Author

kgube commented Oct 23, 2024

I discussed the potential upstream topic with Neutron Team, and created an RFE issue for it.

The topic will also be discussed during the PTG, it is currently scheduled for the 2014-10-24 15:00 - 16:00 UTC timeslot.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants