Location Verification Implementation Guidelines #85

Open
jlurien opened this issue Jul 31, 2023 · 27 comments
Assignees
Labels
documentation Improvements or additions to documentation

Comments

@jlurien
Collaborator

jlurien commented Jul 31, 2023

Problem description
There is no clear guideline on how to implement the response to location-verification when the result is PARTIAL, and it is not obvious how to calculate the matchRate.

Expected action
Trigger discussion and agree on some guidelines for common implementation.

An initial proposal is presented, covering several scenarios that may occur, along with comments and concerns.

Additional context

Word Document:
Location Verification Implementation Guidelines.docx

@jlurien jlurien added the documentation Improvements or additions to documentation label Jul 31, 2023
@jlurien jlurien self-assigned this Jul 31, 2023
@alpaycetin74
Collaborator

I was always thinking of this as a ratio based on the intersection.
I think the match rate can be defined as: (intersection area) / (network provided area).
Since we know the center coordinates and radii of both circles, it is mathematically possible to calculate the intersection area.

image
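A minimal sketch of the ratio described above, assuming both areas are circles on a flat plane and that the distance `d` between their centers is already known in the same unit as the radii. The function names are hypothetical and not part of the API; a real implementation would also need to derive `d` from the coordinates.

```python
import math


def circle_intersection_area(d: float, r_req: float, r_net: float) -> float:
    """Overlap area of two circles whose centers are d apart (planar assumption)."""
    if d >= r_req + r_net:
        return 0.0  # disjoint, or touching at a single point
    if d <= abs(r_req - r_net):
        return math.pi * min(r_req, r_net) ** 2  # one circle lies inside the other

    def clamp(x: float) -> float:
        # Guard acos against tiny floating-point overshoot.
        return max(-1.0, min(1.0, x))

    # Lens area: two circular sectors minus the kite formed by the centers
    # and the two intersection points.
    a_req = r_req**2 * math.acos(clamp((d**2 + r_req**2 - r_net**2) / (2 * d * r_req)))
    a_net = r_net**2 * math.acos(clamp((d**2 + r_net**2 - r_req**2) / (2 * d * r_net)))
    kite = 0.5 * math.sqrt(
        (-d + r_req + r_net) * (d + r_req - r_net) * (d - r_req + r_net) * (d + r_req + r_net)
    )
    return a_req + a_net - kite


def match_rate(d: float, r_req: float, r_net: float) -> float:
    """(intersection area) / (network provided area), in the range 0..1."""
    return circle_intersection_area(d, r_req, r_net) / (math.pi * r_net**2)
```

For example, `match_rate(0, 100, 1000)` describes a 100 m request circle centered inside a 1000 m network circle, which is the low-ratio situation discussed in the following comments.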

@jlurien
Collaborator Author

jlurien commented Jul 31, 2023

@alpaycetin74 Yes, that is exactly the proposal.

There are 2 problems:

  1. When the network provided area is huge compared with the request area, even when intersection area == request area, the ratio may be very low (e.g. 1% or 2%)

  2. How would the client distinguish these two cases, as both would return PARTIAL + a similar matchRate:

image

From a technical point of view, implementing this formula is feasible. The concern is more from a Product/UX perspective.

@alpaycetin74
Collaborator

In the figure on the left, the network provided area has a much larger radius than the request area, so there is a good chance the real location falls outside the requested area. I think it makes sense that the matchRate is calculated as a small value in this case.
It is true the left and right figures look quite different, but maybe it doesn't matter much as far as accuracy is considered. I don't know :)

@jlurien
Collaborator Author

jlurien commented Aug 28, 2023

Comments raised during meeting on August 1st:

  • Akos commented that the ratio may not be enough and maybe we should add some text or rationale.
  • Cetin thinks that it makes sense to return a low percentage since it is caused by bad behaviour of the network.
  • Telefónica agrees but raises the problem of the Product team's approach and how the quality of the product is perceived.
  • We will look into the possibility of splitting the PARTIAL result into different values, as proposed by Jose and Akos.

@bigludo7
Collaborator

bigludo7 commented Sep 11, 2023

Hello
Thanks for the document - This is helpful.

If I got it right @alpaycetin74, the 'formula' to calculate the matchRate should be a bit more complex, something like this:
matchRate (between 0 and 1) = minimum [(intersection surface / requested surface), (intersection surface / network checked surface)].

There is indeed a risk that the matchRate is often low. In your diagram on the left, @jlurien, when the requested surface is too small compared to the network surface checked, I guess we should ask for a larger surface in the request.
This could be triggered when intersection surface = requested surface but intersection surface / network checked surface < 0.4, for example.
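The proposal above can be sketched as follows. This is only an illustration: the function names are hypothetical, the three surfaces are assumed to be precomputed in the same unit, and the 0.4 threshold is just the example value mentioned in the comment.

```python
import math


def match_rate_min(intersection: float, requested: float, network: float) -> float:
    """min(intersection/requested, intersection/network), in the range 0..1."""
    return min(intersection / requested, intersection / network)


def should_ask_for_larger_area(
    intersection: float, requested: float, network: float, threshold: float = 0.4
) -> bool:
    """True when the requested surface is fully covered by the network surface
    but is small relative to it (ratio below the example threshold)."""
    fully_covered = math.isclose(intersection, requested)
    return fully_covered and (intersection / network) < threshold
```

With `intersection=100`, `requested=100` and `network=1000`, the request is fully covered but the ratio to the network surface is 0.1, so the second function would suggest asking for a larger area.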

Alternatively, another option could be to calculate the matchRate internally and not provide it, but instead return True/False in the response together with a 'confidenceRatio'.

If matchRate is between 1 and 0.51 (for example), we answer True with confidence status = matchRate
If matchRate is between 0.50 and 0 (for example), we answer False with confidence status = 1-matchRate
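A small sketch of the True/False plus confidence idea, assuming a cut-off at 0.5 (the comment only gives 0.51 and 0.50 as example boundaries). The function name and return shape are hypothetical.

```python
def verdict_with_confidence(match_rate: float) -> tuple[bool, float]:
    """Map an internal matchRate (0..1) to (verdict, confidence).

    Above the assumed 0.5 cut-off the verdict is True with confidence equal to
    the matchRate; otherwise it is False with confidence 1 - matchRate.
    """
    if not 0.0 <= match_rate <= 1.0:
        raise ValueError("match_rate must be between 0 and 1")
    if match_rate > 0.5:
        return True, match_rate
    return False, 1.0 - match_rate
```

Under this mapping the reported confidence never drops below 0.5, which is the effect discussed in the replies.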

Happy to discuss this.

@jlurien
Collaborator Author

jlurien commented Sep 11, 2023

We may all agree that cases where network accuracy is low are the problematic ones. The proposal to introduce a confidence rate along with True/False is interesting. With bad accuracy, matchRate will be low (below 0.50) and the answer would be False in almost all cases.

However, with PARTIAL as it is now, there will be few cases where response is TRUE 100%, so we may have PARTIAL answers in most cases.

@alpaycetin74
Collaborator

alpaycetin74 commented Sep 28, 2023

If I got it right @alpaycetin74, the 'formula' to calculate the matchRate should be a bit more complex, something like this: matchRate (between 0 and 1) = minimum [(intersection surface / requested surface), (intersection surface / network checked surface)].

I tend to think API performance is more related to the intersection area/network provided area ratio.
If the customer defines a requested area that is much larger than what the network can provide (assuming the network can calculate accurately), the min of the 2 ratios will be intersection / requested, and it will look pessimistic.

Other than that, returning FALSE with 1-matchRate is an interesting approach. Instead of saying "we are slightly sure it matches", we say "we are pretty much sure it won't match". It is a nice trick to give the impression we are confident in what we are doing :)

@JoachimDahlgren
Collaborator

From a pure application developer perspective, I cannot help thinking that the response would be easier to understand if we returned TRUE with a circle whose center is the same as the one used for the requested area but whose radius is set to cover the network provided area. The circle should not be smaller than the one provided in the requested area.
location response
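One possible reading of this suggestion, sketched under the assumption that both areas are circles on a flat plane and that `d` is the distance between their centers: the returned circle keeps the requested center, and its radius is the smallest one that covers the whole network circle, never less than the requested radius. The function name is hypothetical.

```python
def covering_radius(d: float, r_req: float, r_net: float) -> float:
    """Radius of a circle centered on the requested center that covers the
    whole network circle, but is never smaller than the requested radius.

    The farthest point of the network circle from the requested center lies
    at distance d + r_net.
    """
    return max(r_req, d + r_net)
```

For instance, `covering_radius(300, 100, 500)` gives 800, while `covering_radius(0, 100, 50)` stays at the requested 100.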

@sfnuser
Collaborator

sfnuser commented Nov 2, 2023

I feel the specification should also document the math behind the calculations used to arrive at the result, so that the response is uniform across implementations. As we see in this discussion, there is a lot of ambiguity, and every time I look at it my interpretation differs. @bigludo7 rightly started putting some math details here, and it would be a good idea to document them in the spec for each of the cases.

@jlurien
Collaborator Author

jlurien commented Nov 2, 2023

Agree that we should have the implementation guidelines documented, but first we have to agree on them. Definitely something required prior to releasing any v1.

@Kevsy
Collaborator

Kevsy commented Nov 23, 2023

I think the implementation guidelines document is very useful. But there is a problem: the criteria are worded differently to the YAML definition (02.0-wip). This leads to ambiguity, and at least in one case, a different decision.

Here's a comparison:

| Criterion in YAML | Criterion in guidelines | Verification Result |
|---|---|---|
| "The network locates the device within the requested area" | "Network Area within Request Area" | TRUE |
| "The requested area may not match the area where the network locates the device" | "No overlap" | FALSE |
| "The requested area partially match the area where the network locates the device" | "Request Area within Network Area Low network accuracy" or "Requested accuracy similar to Network Accuracy" or "Partial overlap Low network accuracy" or "Partial overlap High network accuracy" | PARTIAL |
| "The network may not be able to locate the device" | "Network Area is not known" | UNKNOWN |

Two concerns:

  1. The scenario pictured here will return TRUE according to the YAML but PARTIAL according to the guidelines:

image

The YAML says:

The network locates the device within the requested area, the verification result is TRUE.

Since the request area is within the Network location area, the answer is TRUE – because the actual device location within the Request Area will always be part of - or 'within' - the superset of all locations in the Network location area.

But the guidelines document says:

“Request Area within Network Area” as “PARTIAL”.

“Request Area within Network Area” is the scenario in the diagram above, so according to the guidelines, the answer is PARTIAL.

Hence we have a different answer depending on the wording of the criterion (YAML vs Guidelines), and until both documents use the same wording we have ambiguity.

  2. The YAML criterion for FALSE is itself ambiguous

"The requested area may not match the area where the network locates the device" - "may not" is not definitive. "The requested area is outside the area where the network locates the device" is clearer

Recommendation: we use one set of criteria for both the guidelines and the YAML, and include the guideline illustrations in the API document.

@alpaycetin74
Collaborator

Hello @Kevsy , the wording of the definitions in the release candidate is a bit different: #104
But I understand those may not be perfect, either. I'll try to propose a different wording.

@jlurien
Collaborator Author

jlurien commented Nov 24, 2023

Thanks @Kevsy, as @alpaycetin74 mentions, the wording in the spec has been (hopefully) improved in the latest PR. Regarding your specific comments, please see inline:

I think the implementation guidelines document is very useful. But there is a problem: the criteria are worded differently to the YAML definition (02.0-wip). This leads to ambiguity, and at least in one case, a different decision.

Here's a comparison:

| Criterion in YAML | Criterion in guidelines | Verification Result |
|---|---|---|
| "The network locates the device within the requested area" | "Network Area within Request Area" | TRUE |
| "The requested area may not match the area where the network locates the device" | "No overlap" | FALSE |
| "The requested area partially match the area where the network locates the device" | "Request Area within Network Area Low network accuracy" or "Requested accuracy similar to Network Accuracy" or "Partial overlap Low network accuracy" or "Partial overlap High network accuracy" | PARTIAL |
| "The network may not be able to locate the device" | "Network Area is not known" | UNKNOWN |

Two concerns:

  1. The scenario pictured here will return TRUE according to the YAML but PARTIAL according to the guidelines:

image

The YAML says:

The network locates the device within the requested area, the verification result is TRUE.

Since the request area is within the Network location area, the answer is TRUE – because the actual device location within the Request Area will always be part of - or 'within' - the superset of all locations in the Network location area.

But the guidelines document says:

“Request Area within Network Area” as “PARTIAL”.

“Request Area within Network Area” is the scenario in the diagram above, so according to the guidelines, the answer is PARTIAL.

Hence we have a different answer depending on the wording of the criterion (YAML vs Guidelines), and until both documents use the same wording we have ambiguity.

We may need to clarify the wording if it leads to confusion, but the scenario in the picture should be PARTIAL. The YAML says for TRUE: "the network locates the device within the requested area", which means that the "network area" (= the area where the network locates the device) is within the "requested area". So it is wrong to assume "Since the request area is within the Network location area", because it is the opposite. The result should not be TRUE but PARTIAL: the operator cannot assure that the device is within the requested area, because the network area is bigger.

  2. The YAML criterion for FALSE is itself ambiguous

"The requested area may not match the area where the network locates the device" - "may not" is not definitive. "The requested area is outside the area where the network locates the device" is clearer

Recommendation: we use one set of criteria for both the guidelines and the YAML, and include the guideline illustrations in the API document.

The latest version of location-verification uses the wording:

  • When the requested area does not match the area where the network locates the device, the verification result is FALSE.

We may change that to "is outside" if it is clearer. As an action point (AP) from the last meeting, we are going to prepare a document for the implementation guidelines, trying to be more descriptive. Any suggestion to make it more understandable is very welcome, especially from native speakers.

@Kevsy
Collaborator

Kevsy commented Nov 27, 2023

Thanks - yes, my point was that the (main branch) wording led to a different interpretation from the Guidelines.

It may help to have a formal declaration at the end of the YAML, after the 'plain language' definition, e.g.

Let R = the set of all possible locations within the Requested Area
Let N = the set of all possible locations within the Network Area

The following conditions lead to the verification results stated:

| Notation | Description | Verification result |
|---|---|---|
| R ⊆ N | R is a subset of, or equal to, N | TRUE |
| R ⊃ N | R is a superset of N | PARTIAL |
| R ∩ N and R ⊄ N | R and N intersect but R is not a subset of N | PARTIAL |
| R ∩ N = ∅ | R and N are disjoint | FALSE |

@bigludo7
Collaborator

@Kevsy
and for a PARTIAL result we use matchRate = Intersection(R, N) / N, correct?

@Kevsy
Collaborator

Kevsy commented Nov 27, 2023

@bigludo7

and for a PARTIAL result we use matchRate = Intersection(R, N) / N, correct?

yes - see the earlier illustration from @alpaycetin74

@alpaycetin74
Collaborator

@bigludo7

and for a PARTIAL result we use matchRate = Intersection(R, N) / N, correct?

yes - see the earlier illustration from @alpaycetin74

Hello, that was my opinion, but we had not reached a consensus back then. There is no description in the rc spec about calculating the matchRate yet.

@jlurien
Collaborator Author

jlurien commented Nov 28, 2023

Yes, that formula is also what is proposed in the document attached to this issue (matchRate (%) = Intersection Area / Network Area). In the last meeting we concluded that it may not be perfect, but it is the one with the most consensus so far, so it is convenient to reflect it officially in order to have consistent implementations. If we design a better formula in the future, we may adopt it.

@jlurien
Collaborator Author

jlurien commented Nov 28, 2023

Thanks - yes, my point was that the (main branch) wording led to a different interpretation from the Guidelines.

It may help to have a formal declaration at the end of the YAML, after the 'plain language' definition, e.g.

Let R = the set of all possible locations within the Requested Area
Let N = the set of all possible locations within the Network Area

The following conditions lead to the verification results stated:

| Notation | Description | Verification result |
|---|---|---|
| R ⊆ N | R is a subset of, or equal to, N | TRUE |
| R ⊃ N | R is a superset of N | PARTIAL |
| R ∩ N and R ⊄ N | R and N intersect but R is not a subset of N | PARTIAL |
| R ∩ N = ∅ | R and N are disjoint | FALSE |

Thanks. We recently rephrased the explanations in the spec, based on the suggestions by @alpaycetin74

@Marcus-MMJ

Thanks - yes, my point was that the (main branch) wording led to a different interpretation from the Guidelines.
It may help to have a formal declaration at the end of the YAML, after the 'plain language' definition, e.g.
Let R = the set of all possible locations within the Requested Area
Let N = the set of all possible locations within the Network Area
The following conditions lead to the verification results stated:

| Notation | Description | Verification result |
|---|---|---|
| R ⊆ N | R is a subset of, or equal to, N | TRUE |
| R ⊃ N | R is a superset of N | PARTIAL |
| R ∩ N and R ⊄ N | R and N intersect but R is not a subset of N | PARTIAL |
| R ∩ N = ∅ | R and N are disjoint | FALSE |

Thanks. We recently rephrased the explanations in the spec, based on the suggestions by @alpaycetin74

I think a formal description is really helpful.
But I think, as already agreed in issue #20, the first two are exactly switched:

| Notation | Description | Verification result |
|---|---|---|
| R ⊇ N | R is a superset of, or equal to, N | TRUE |
| R ⊂ N | R is a subset of N | PARTIAL |
| R ∩ N and N ⊄ R | R and N intersect but N is not a subset of R | PARTIAL |
| R ∩ N = ∅ | R and N are disjoint | FALSE |
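For circular areas, the conditions in this corrected table can be checked from the two radii and the distance between centers. The sketch below assumes a flat plane and a precomputed distance `d`. The function name is hypothetical, and treating two circles that only touch at one point as FALSE is an assumption, since the overlap area is zero in that case.

```python
def classify(d: float, r_req: float, r_net: float) -> str:
    """Classify requested circle R against network circle N.

    d     : distance between the two centers
    r_req : radius of the requested area R
    r_net : radius of the network area N
    """
    if d + r_net <= r_req:
        return "TRUE"      # N lies entirely inside (or equals) R
    if d >= r_req + r_net:
        return "FALSE"     # R and N are disjoint (or touch at one point)
    return "PARTIAL"       # R inside N, or a partial overlap
```

Note that both PARTIAL rows of the table fall through to the last branch, so they do not need to be distinguished to pick the verification result.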

@Kevsy
Collaborator

Kevsy commented Nov 30, 2023

@Marcus-MMJ Good catch! Thanks for the fix :)

@jlurien
Collaborator Author

jlurien commented Nov 30, 2023

I think that the current phrasing in the PR consolidates all the discussion. Please take a final look to check that everything is fine. Thanks for the constructive feedback.

@maxl2287
Contributor

maxl2287 commented May 31, 2024

In the case of:

  • R is a subset of N (R ⊂ N)

So the Network area fully includes the requested area. What would the matchRate then be (based on (R ∩ N) / N * 100)?
As of now it would then be:

{
 "verificationResult": "PARTIAL",
 "matchRate": 100,
 "lastLocationTime": "2023-09-07T10:40:52Z"
}

But that does not really make sense, as the device can be located on the "other side" of the circle.
So you cannot guarantee a 100% matchRate.

wdyt @jlurien ?

@bigludo7
Collaborator

But in this case, @maxl2287, to the question "Is the device in the R area?" we could answer yes with 100% certainty, right? So for me matchRate=100 makes sense.

@maxl2287
Contributor

Let's say the network provides a huge radius (like a cell) where the device is located.
The verification-area is much smaller.

N = Location Area of the Device provided by the Network
V = requested verification - area
Red crosses = possible real location where the device is located

image

(1.) - here the device is located exactly in the verification area = 100%
(2.) & (3.) - It could also be that the device is located elsewhere inside the network area, but not exactly where the verification area is. So how can we say that we have a 100% match rate here, when there is a possibility that the device is not in the verification area?

@bigludo7
Collaborator

Sorry @maxl2287, I've misunderstood.

  • If the Requested Area is larger than the Network Area where we "find" the device, and this Network Area is fully within the Requested Area, we answer TRUE (meaning matchRate=100%)
  • If the Requested Area is smaller than the Network Area, as in your diagram, we have PARTIAL and the match rate will be intersection surface / N
    Back to your example: if you are asking about V for devices 1, 2 and 3, and your network indicates that devices 1, 2 and 3 are in the cell covering N, you will get exactly the same result for all 3: PARTIAL and matchRate: 3% (my guess at surface V / surface N)

@jlurien
Collaborator Author

jlurien commented Jun 3, 2024

It's as @bigludo7 explains. Regarding your original assumption, (R ∩ N) / N * 100: in this case R is totally within N, so (R ∩ N) is R and the matchRate is the ratio R / N. Since R <<< N, the ratio R / N <<< 1, so the matchRate is far below 100.
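As a quick check of this reasoning, here is a sketch for the case where the requested circle lies entirely inside the network circle, so the intersection is the requested circle itself and the ratio reduces to the squared ratio of the radii. The function name and the sample radii are illustrative assumptions.

```python
import math


def match_rate_when_contained(r_req: float, r_net: float) -> float:
    """matchRate in percent when R is entirely inside N.

    (R ∩ N) / N * 100 = area(R) / area(N) * 100 = (r_req / r_net)^2 * 100
    """
    if r_req > r_net:
        raise ValueError("R cannot lie inside N when its radius is larger")
    return 100.0 * (math.pi * r_req**2) / (math.pi * r_net**2)
```

With a 100 m requested radius inside a 1000 m network radius, this yields a matchRate of about 1, regardless of where inside N the requested circle sits.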
