Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fields.DateTime - load/dump to timestamp #612

Open
vgavro opened this issue Apr 16, 2017 · 21 comments
Open

fields.DateTime - load/dump to timestamp #612

vgavro opened this issue Apr 16, 2017 · 21 comments

Comments

@vgavro
Copy link
Contributor

vgavro commented Apr 16, 2017

I'm currently using this code to parse/dump timestamp, but I believe this should be in standard library. Should I create pull request, or have I missed another way to work with datetime/timestamp format?

class DateTime(ma.fields.DateTime):
    """
    Class extends marshmallow standart DateTime with "timestamp" format.
    """

    DATEFORMAT_SERIALIZATION_FUNCS = \
        ma.fields.DateTime.DATEFORMAT_SERIALIZATION_FUNCS.copy()
    DATEFORMAT_DESERIALIZATION_FUNCS = \
        ma.fields.DateTime.DATEFORMAT_DESERIALIZATION_FUNCS.copy()

    DATEFORMAT_SERIALIZATION_FUNCS['timestamp'] = lambda x: x.timestamp()
    DATEFORMAT_DESERIALIZATION_FUNCS['timestamp'] = datetime.fromtimestamp
@vgavro vgavro changed the title DateTime - proper load/dump to timestamp float/integer? fields.DateTime - load/dump to timestamp Apr 16, 2017
@sloria
Copy link
Member

sloria commented Apr 16, 2017

@vgavro I think this is a fair use case. I anticipate that users will typically want to deserialize to timezone-aware datetimes, so the API should make this simple. I haven't given to much thought to what it would look like though. Maybe it would make sense as a separate field that receives a tzinfo param,

my_timestamp = fields.Timestamp(tzinfo=pytz.utc)

@disconnect3d
Copy link

Bump, this would be nice to have 👍

@stephan-rayner
Copy link

Is there a plan to do this thing? I can't tell you how happy this would make me?

@disconnect3d
Copy link

@stephan-rayner You can do it even better: work on that and send a PR? :)

Note that you can also attend in https://hacktoberfest.digitalocean.com and win a t-shirt ;D.

@sloria
Copy link
Member

sloria commented Oct 5, 2018

I would gladly review and merge a PR adding a Timestamp field.

@vgavro
Copy link
Contributor Author

vgavro commented Oct 9, 2018

I will gladly work on this PR, but maybe it's better to implement this as custom format for AwareDateTime and NaiveDateTime instead of separate field #933 ? Like I see it's good decision to force conversion to aware datetime, but still explicitly allow naive datetimes.

@disconnect3d
Copy link

but still explicitly allow naive datetimes

explicitly is the key here. If the user wants it, let him do that.

I am not a maintainer of the project but I am up for this as we wouldn't make the bad thing (naive datetimes) to be the default ones.

@sloria
Copy link
Member

sloria commented Oct 12, 2018

Ah, yes, I see your @vgavro . Yes, I suppose it could be a new datetime format. I'm fine with that.

@lig
Copy link

lig commented Oct 12, 2018

@sloria I've tried this as a format for fields.DateTime and it is not quite right. DateTime field produces the value of type str. However, a timestamp has to have type float. Also, there are some edge cases (python type → serialized from):

  • float → float (obviously, it could be done via the Float field but it should be positive and it will be handy to have something like from_milliseconds for JavaScript compatibility)
  • float → datetime string (rare case but fair)
  • datetime or float → float
  • datetime or float → datetime string

My idea is that supporting timestamps is a bit more than formatting datetime to float.

@vgavro
Copy link
Contributor Author

vgavro commented Oct 15, 2018

@lig As you can see in implementation of DateTime, it is not forcing str or float, so I don't get the problem you stated, also serialization in one format and deserialization in other not a matter of specific field responsibility (you can use different fields for that). But I really agree with your from_milliseconds usecase.

@sloria @lafrech Here are scenarios with timestamps (+ timezone aware/naive datetimes) we want to cover, I would really love to have feedback before I can work on this, if I haven't anything missed (I hope?)

[1]. By definition timestamp (POSIX time or UNIX Epoch time) should be in UTC.
[2]. Anyway we may see external systems that is using timestamp in timezone-related manner, on working with such systems we should provide timezone-aware datetime with explicitly set timezone to replace/conversion.
[3]. Also we may have python systems that is internally using timezone-naive datetimes, asserting all datetimes are using same timezone (local or utc, for example).

So generally timestamp adding because of [1] can be solved:
Adding timestamp and timestamp_ms to DateTime.format.
By default all timestamps must be converted to utc by default, and we can add this to DateTime formats without extra logic for timezone conversion, it will be fair use.
timestamp_ms stands for "milliseconds" (as stated in wikipedia, where milliseconds marked as "ms", and microseconds marked as "μs", so it's good name to be compatible with javascript-related projects, where timestamp is Number in milliseconds - for example moment.now() and Number(new Date()) returns milliseconds Number - good point from @lig).

As for [2],[3] - one option may be to interpret timestamp timezones (or other datetimes) in custom ways (replace/drop/convert tzinfo) in subclasses, that willl inherit DateTime, but not required to be in it's core functionality. This way problem is partly trying to solve #933 , but in case with timestamp we want to explicitly replace timezone, not just default it (because timestamp default is UTC, like we see in [1]). I think this functionality can be in base DateTime field by extending it with timezone=None, timezone_naive=False, this can be solved in backward-compatible way.

The [2] problem may be solved:
adding timezone=None to DateTime field, to force timestamps + timezone-naive strings in explicitly set timezone. If self.timezone - on deserialization timezone will be replaced for datetime-naive strings and timestamps, on serialization datetime will be converted to self.timezone for timezone-aware (only for timestamps, not to have different input/output for timezone aware strings) or replaced for timezone-naive datetimes, This will also cover AwareDateTime use-case from #933 , which is also enforces only timezone-naive datetimes. timezone will be set to None by default (for backward-compatibility), in this case timestamps always will be in UTC (because of [1]), and other formats can return timezone-naive datetimes. Or we may default it to UTC (which breaks backward-compatibility and does not allow parsing timezone-naive datetimes, which may be interpreted as special use-case by user code for example, but maybe for good reason).

The [3] problem may be solved:
adding timezone_naive=False, which may:
drops tzinfo on deserialization (ony after conversion to timezone from proposition above, if self.timezone is not None), this also will cover NaiveDateTime use-case from #933 . If we have timezone=None by default on serialization timezone-naive datetimes will be anyway converted to UTC (as in current implementation for iso/rfc), and we have consistent behaviour with timestamp also in this case. I know there is discussion around serializing NaiveDateTime to timezone-naive strings, this can be achieved with format_options proposed here #889 (comment)

Cons on this:

P.S. One more use-case that personally I see not fits in conception "timestamp is just another datetime format" is interpreting 0 as null for timestamps, but it's too easy to implement in user space and really is just bad api design that I've seen - but I'm sure we don't need it in marshmallow code anyway.

I really think this will fix timezones/timestamps issues once for all (at last), sorry for tagging (I know it's hard to give time for such big open source project), and thanks in advance for your feedback. I can use tests and utils.str2tz from #933 if you think this is ok to go.

@lig
Copy link

lig commented Oct 16, 2018

@vgavro my concern regarding the type of serialization result fields.DateTime provides is a result of a common practice for plugins and third party libraries to determine a type of field basing on the schema Marshmallow provides. Usually, it is done by looking to the field instance type.

For example, OpenAPI plugin produces openapi spec {type: string, format: datetime} for fields.DateTime. However, for the timestamp case it should be {type: number, format: float}. So, by the nature of the schema itself, it looks like the timestamp representation of a datetime value should be a fields.Float descendant rather than fields.DateTime descendant.

@vgavro
Copy link
Contributor Author

vgavro commented Oct 17, 2018

timestamps as fields.DateTime.format, rather than fields.Timestamp:
cons:

  • openapi utilities should determine not only by instance type, but by instance.format
  • we have different behavior for timezone-aware functionality (more in my comment above)

pros:

  • we're not confusing marshmallow users with separate field name (we have DateTime(format) field, which means datetime in python code, not IsoDateTime, RfcDateTime or StringDateTime fields)
  • we have same behaviour for timezone-naive functionality

I think it's very important to have consistent behaviour about timezone-aware/naive things between datetime fields, and I don't see how separate AwareDateTime and NaiveDateTime fields are going to solve this with fields.Timestamp. NaiveTimestamp will be ok, or we should extend fields.DateTime with timezone and timezone_naive options anyway, and maybe move this functionality to mixins, or there are other options?

@lafrech
Copy link
Member

lafrech commented Oct 17, 2018

Quick answer @vgavro as I don't have time to look at it right now.

  • There's been discussions about having several fields for the same python object and different serialization type/format. It originated from questions about the strict parameter. @deckar01 proposed (in fix: change how strict works on fields.Int() so it's really strict about types #755 (comment)) to have IntegerString, IntegerNumber, for instance. This to simplify complicated parametrization and simplify interface. Now I realize it would also make things easier for apispec. The whole rework won't be in marshmallow 3 but we can have this in mind for the DateTime rework.

  • You may ignore my AwareDateTime and NaiveDateTime PR. I need to rethink the whole thing from scratch, I just didn't get the time, and it is likely to change completely.

  • I don't like the idea that a Field can behave differently whether or not an optional dependency (here, dateutils) is installed. I first thought we could be able to dump it completely but that was "naive", as a lib is needed for all TZ operations. From what I read, and from Python docs recommendation, dateutil is recommended over pytz, so let's keep it for TZ stuff, but it would be nice to require an explicit intention from the user, that is only do TZ work with dateutil on fields that require it either due to their name (AwareDateTime) or a parameter (aware=True, whatever). But avoid just silently strip TZ info if it is not installed. Besides, the doc is not that helpful, saying dateutil provides better deserialization, not mentioning the difference is about TZ stuff.

@lig
Copy link

lig commented Oct 17, 2018

@vgavro another case to mention is about possible value ranges. The thing is DateTime could be whatever in the past. However, only values from 1970-01-01 are possible to serialize as a timestamp.

@vgavro
Copy link
Contributor Author

vgavro commented Oct 17, 2018

@lig that affects nothing, because timestamps can be negative.

In [3]: datetime.utcfromtimestamp(-1)
Out[3]: datetime.datetime(1969, 12, 31, 23, 59, 59)

@vgavro
Copy link
Contributor Author

vgavro commented Oct 17, 2018

@lafrech I just wanted to solve this aware/naive functionality in place, because default timezone and explicitly converted naive datetime is really widely-used case. My first proposal exactly was rethinking your PR also, but like I see there is a lot of discussion needed and it's better be solved in separate (with timestamp) ways, I really concerned about timezone-aware/naive DateTime field, but like I see now it's better not to mix this decisions.
About third - I don't see how this affects with/without dateutil problem, I think they are not related (they both timezone related of course, but they are not cross-related from my point of view).
Anyway, thanks for feedback, I think it's good idea about separate field only because of different timezone functionality and more clean code, so we can move on this. Please review new PR when you have time, I'll cancel previous as soon as one will be approved in some way) Thanks!

@lafrech
Copy link
Member

lafrech commented Jul 17, 2019

I didn't go through the whole thread again but I'm pretty sure this feature has become much easier to implement thanks to the rework in #1278.

@plantusd
Copy link

any updates on fields.Timestamp?

@vgavro
Copy link
Contributor Author

vgavro commented Oct 31, 2019

any updates on fields.Timestamp?
Currently we wait for maintainers decision on this:
#1003 (comment)

@james-willcox
Copy link

Bump, would love to have this

@dingyaguang117
Copy link

dingyaguang117 commented Feb 7, 2024

I'm currently using this code to parse/dump timestamp, but I believe this should be in standard library. Should I create pull request, or have I missed another way to work with datetime/timestamp format?

class DateTime(ma.fields.DateTime):
    """
    Class extends marshmallow standart DateTime with "timestamp" format.
    """

    DATEFORMAT_SERIALIZATION_FUNCS = \
        ma.fields.DateTime.DATEFORMAT_SERIALIZATION_FUNCS.copy()
    DATEFORMAT_DESERIALIZATION_FUNCS = \
        ma.fields.DateTime.DATEFORMAT_DESERIALIZATION_FUNCS.copy()

    DATEFORMAT_SERIALIZATION_FUNCS['timestamp'] = lambda x: x.timestamp()
    DATEFORMAT_DESERIALIZATION_FUNCS['timestamp'] = datetime.fromtimestamp

It's a very goog idea. It's a very common case, the datetime object from sqlalchemy is without timezone. At most cases, the right tz is system tz.

But maybe reanme to LocalDatetTime?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants