JSONKit is licensed under the terms of the BSD License. Copyright © 2011, John Engelhart.
Parsing | Serializing |
---|---|
23% Faster than Binary .plist ! | 549% Faster than Binary .plist ! |
- Benchmarking was performed on a MacBook Pro with a 2.66GHz Core 2.
- All JSON libraries were compiled with
gcc-4.2 -DNS_BLOCK_ASSERTIONS -O3 -arch x86_64
. - Timing results are the average of 1,000 iterations of the user + system time reported by
getrusage
. - The JSON used was
twitter_public_timeline.json
from samsoffes / json-benchmarks. - Since the
.plist
format does not support serializingNSNull
, thenull
values in the original JSON were changed to"null"
. - The experimental branch contains the
gzip
compression changes.- JSONKit automagically links to
libz.dylib
on the fly at run time– no manual linking required. - Parsing / deserializing will automagically decompress a buffer if it detects a
gzip
signature header. - You can compress /
gzip
the serialized JSON by passingJKSerializeOptionCompress
to-JSONDataWithOptions:error:
.
- JSONKit automagically links to
JavaScript Object Notation, or JSON, is a lightweight, text-based, serialization format for structured data that is used by many web-based services and API's. It is defined by RFC 4627.
JSON provides the following primitive types:
null
- Boolean
true
andfalse
- Number
- String
- Array
- Object (a.k.a. Associative Arrays, Key / Value Hash Tables, Maps, Dictionaries, etc.)
These primitive types are mapped to the following Objective-C Foundation classes:
JSON | Objective-C |
---|---|
null | NSNull |
true and false | NSNumber |
Number | NSNumber |
String | NSString |
Array | NSArray |
Object | NSDictionary |
JSONKit uses Core Foundation internally, and it is assumed that Core Foundation ≡ Foundation for every equivalent base type, i.e. CFString
≡ NSString
.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
-
The JSON specification is somewhat ambiguous about the details and requirements when it comes to Unicode, and it does not specify how Unicode issues and errors should be handled. Most of the ambiguity stems from the interpretation and scope RFC 4627 Section 3, Encoding:
JSON text SHALL be encoded in Unicode.
It is the authors opinion and interpretation that the language of RFC 4627 requires that a JSON implementation MUST follow the requirements specified in the Unicode Standard, and in particular the Conformance chapter of the Unicode Standard, which specifies requirements related to handling, interpreting, and manipulating Unicode text.The default behavior for JSONKit is strict RFC 4627 conformance. It is the authors opinion and interpretation that RFC 4627 requires JSON to be encoded in Unicode, and therefore JSON that is not legal Unicode as defined by the Unicode Standard is invalid JSON. Therefore, JSONKit will not accept JSON that contains ill-formed Unicode. The default, strict behavior implies that the
JKParseOptionLooseUnicode
option is not enabled.When the
JKParseOptionLooseUnicode
option is enabled, JSONKit follows the specifications and recommendations given in The Unicode 6.0 standard, Chapter 3 - Conformance, section 3.9 Unicode Encoding Forms. As a general rule of thumb, the Unicode code pointU+FFFD
is substituted for any ill-formed Unicode encountered. JSONKit attempts to follow the recommended Best Practice for Using U+FFFD: Replace each maximal subpart of an ill-formed subsequence by a single U+FFFD.The following Unicode code points are treated as ill-formed Unicode, and if
JKParseOptionLooseUnicode
is enabled, causeU+FFFD
to be substituted in their place:U+0000
.
U+D800
thruU+DFFF
, inclusive.
U+FDD0
thruU+FDEF
, inclusive.
U+nFFFE
andU+nFFFF
, where n is from0x0
to0x10
The code points
U+FDD0
thruU+FDEF
,U+nFFFE
, andU+nFFFF
(where n is from0x0
to0x10
), are defined as Noncharacters by the Unicode standard and "should never be interchanged".An exception is made for the code point
U+0000
, which is legal Unicode. The reason for this is that this particular code point is used by C string handling code to specify the end of the string, and any such string handling code will incorrectly stop processing a string at the point whereU+0000
occurs. Although reasonable people may have different opinions on this point, it is the authors considered opinion that the risks of permitting JSON Strings that containU+0000
outweigh the benefits. One of the risks in allowingU+0000
to appear unaltered in a string is that it has the potential to create security problems by subtly altering the semantics of the string which can then be exploited by a malicious attacker. This is similar to the issue of arbitrarily deleting characters from Unicode text.RFC 4627 allows for these limitations under section 4, Parsers:
An implementation may set limits on the length and character contents of strings.
While the Unicode Standard permits the mutation of the original JSON (i.e., substitutingU+FFFD
for ill-formed Unicode), RFC 4627 is silent on this issue. It is the authors opinion and interpretation that RFC 4627, section 3 – Encoding,JSON text SHALL be encoded in Unicode.
implies that such mutations are not only permitted but MUST be expected by any strictly conforming RFC 4627 JSON implementation. The author feels obligated to note that this opinion and interpretation may not be shared by others and, in fact, may be a minority opinion and interpretation. You should be aware that any mutation of the original JSON may subtly alter its semantics and, as a result, may have security related implications for anything that consumes the final result.It is important to note that JSONKit will not delete characters from the JSON being parsed as this is a requirement specified by the Unicode Standard. Additional information can be found in the Unicode Security FAQ and Unicode Technical Report #36 - Unicode Security Consideration, in particular the section on non-visual security.
-
The JSON specification does not specify the details or requirements for JSON String values, other than such strings must consist of Unicode code points, nor does it specify how errors should be handled. While JSONKit makes an effort (subject to the reasonable caveats above regarding Unicode) to preserve the parsed JSON String exactly, it can not guarantee that
NSString
will preserve the exact Unicode semantics of the original JSON String.JSONKit does not perform any form of Unicode Normalization on the parsed JSON Strings, but can not make any guarantees that the
NSString
class will not perform Unicode Normalization on the parsed JSON String used to instantiate theNSString
object. TheNSString
class may place additional restrictions or otherwise transform the JSON String in such a way so that the JSON String is not bijective with the instantiatedNSString
object. In other words, JSONKit can not guarantee that when you round trip a JSON String to aNSString
and then back to a JSON String that the two JSON Strings will be exactly the same, even though in practice they are. For clarity, "exactly" in this case means bit for bit identical. JSONKit can not even guarantee that the two JSON Strings will be Unicode equivalent, even though in practice they will be and would be the most likely cause for the two round tripped JSON Strings to no longer be bit for bit identical. -
JSONKit maps
true
andfalse
to theCFBoolean
valueskCFBooleanTrue
andkCFBooleanFalse
, respectively. Conceptually,CFBoolean
values can be thought of, and treated as,NSNumber
class objects. The benefit to usingCFBoolean
is thattrue
andfalse
JSON values can be round trip deserialized and serialized without conversion or promotion to aNSNumber
with a value of0
or1
. -
The JSON specification does not specify the details or requirements for JSON Number values, nor does it specify how errors due to conversion should be handled. In general, JSONKit will not accept JSON that contains JSON Number values that it can not convert with out error or loss of precision.
For non-floating-point numbers (i.e., JSON Number values that do not include a
.
ore|E
), JSONKit uses a 64-bit C primitive type internally, regardless of whether the target architecture is 32-bit or 64-bit. For unsigned values (i.e., those that do not begin with a-
), this allows for values up to264-1
and up to-263
for negative values. As a special case, the JSON Number-0
is treated as a floating-point number since the underlying floating-point primitive type is capable of representing a negative zero, whereas the underlying twos-complement non-floating-point primitive type can not. JSON that contains Number values that exceed these limits will fail to parse and optionally return aNSError
object. The functionsstrtoll()
andstrtoull()
are used to perform the conversions.The C
double
primitive type, or IEEE 754 Double 64-bit floating-point, is used to represent floating-point JSON Number values. JSON that contains floating-point Number values that can not be represented as adouble
(i.e., due to over or underflow) will fail to parse and optionally return aNSError
object. The functionstrtod()
is used to perform the conversion. Note that the JSON standard does not allow for infinities orNaN
(Not a Number). The conversion and manipulation of floating-point values is non-trivial. Unfortunately, RFC 4627 is silent on how such details should be handled. You should not depend on or expect that when a floating-point value is round tripped that it will have the same textual representation or even compare equal. This is true even when JSONKit is used as both the parser and creator of the JSON, let alone when transferring JSON between different systems and implementations. -
For JSON Objects (or
NSDictionary
in JSONKit nomenclature), RFC 4627 saysThe names within an object SHOULD be unique
(note:name
is akey
in JSONKit nomenclature). At this time the JSONKit behavior isundefined
for JSON that contains names within an object that are not unique. However, JSONKit currently tries to follow a "the last key / value pair parsed is the one chosen" policy. This behavior is not finalized and should not be depended on.The previously covered limitations regarding JSON Strings have important consequences for JSON Objects since JSON Strings are used as the
key
. The JSON specification does not specify the details or requirements for JSON Strings used askeys
in JSON Objects, specifically what it means for twokeys
to compare equal. Unfortunately, because RFC 4627 statesJSON text SHALL be encoded in Unicode.
, this means that one must use the Unicode Standard to interpret the JSON, and the Unicode Standard allows for strings that are encoded with different Unicode Code Points to "compare equal". JSONKit usesNSString
exclusively to manage the parsed JSON Strings. BecauseNSString
uses Unicode as its basis, there exists the possibility thatNSString
may subtly and silently convert the Unicode Code Points contained in the original JSON String in to a Unicode equivalent string. Due to this, the JSONKit behavior for JSON Strings used askeys
in JSON Objects that may be Unicode equivalent but not binary equivalent isundefined
.
-
JSONKit will fail with an error if it encounters an object that is not a
NSNull
,NSNumber
,NSString
,NSArray
, orNSDictionary
class object during serialization. More specifically, JSONKit will fail with an error if it encounters an object where-isKindOfClass:
returnsNO
for all of the previously mentioned classes. -
The
NSDictionary
class allows for any object, which can be of any class, to be used as akey
. JSON, however, only permits Strings to be used askeys
. Therefore JSONKit will fail with an error if it encounters aNSDictionary
that contains keys that are notNSString
objects during serialization. More specifically, the keys must returnYES
when sent-isKindOfClass:[NSString class]
. -
JSON does not allow for Numbers that are
±Infinity
or±NaN
. Therefore JSONKit will fail with an error if it encounters aNSNumber
that contains such a value during serialization. -
Objects created with
[NSNumber numberWithBool:YES]
and[NSNumber numberWithBool:NO]
will be mapped to the JSON values oftrue
andfalse
, respectively. More specifically, an object must have pointer equality withkCFBooleanTrue
orkCFBooleanFalse
(i.e.,if((id)object == (id)kCFBooleanTrue)
) in order to be mapped to the JSON valuestrue
andfalse
. -
NSNumber
objects that are not booleans (as defined above) are sent-objCType
to determine what type of C primitive type they represent. Those that respond with a type from the setcislq
are treated aslong long
, those that respond with a type from the setCISLQB
are treated asunsigned long long
, and those that respond with a type from the setfd
are treated asdouble
.NSNumber
objects that respond with a type not in the set of types mentioned will cause JSONKit to fail with an error.More specifically,
CFNumberGetValue(object, kCFNumberLongLongType, &longLong)
is used to retrieve the value of both the signed and unsigned integer values, and the type reported by-objCType
is used to determine whether the result is signed or unsigned. For floating-point values,CFNumberGetValue
is used to retrieve the value usingkCFNumberDoubleType
for the type argument.Floating-point numbers are converted to their decimal representation using the
printf
format conversion specifier%.17g
. Theoretically this allows up to afloat
, or IEEE 754 Single 32-bit floating-point, worth of precision to be represented. This means that for practical purposes,double
values are converted tofloat
values with the associated loss of precision. The RFC 4627 standard is silent on how floating-point numbers should be dealt with and the author has found that real world JSON implementations vary wildly in how they handle this issue. Furthermore, the%g
format conversion specifier may convert floating-point values that can be exactly represented as an integer to a textual representation that does not include a.
ore
– essentially silently promoting a floating-point value to an integer value (i.e,5.0
becomes5
). Because of these and many other issues surrounding the conversion and manipulation of floating-point values, you should not expect or depend on floating point values to maintain their full precision, or when round tripped, to compare equal. -
When serializing, the top level container, and all of its children, are required to be strictly invariant during enumeration. This property is used to make certain optimizations, such as if a particular object has already been serialized, the result of the previous serialization can be reused (i.e., the text of the previous serialization can simply be copied instead of performing all the serialization work again). While this is probably only applicable to those who are doing extraordinarily unusual things with the run-time or custom classes inheriting from the classes that JSONKit will serialize, it does cover the cases where any of the objects are mutated during enumeration (i.e., mutated by another thread). The behavior is
undefined
if this requirement is violated.
Please use the github.com JSONKit Issue Tracker to report bugs.
The author requests that you do not file a bug report with JSONKit regarding problems reported by the clang
static analyzer unless you first manually verify that it is an actual, bona-fide problem with JSONKit and, if appropriate, is not "legal" C code as defined by the C99 language specification. If the clang
static analyzer is reporting a problem with JSONKit that is not an actual, bona-fide problem and is perfectly legal code as defined by the C99 language specification, then the appropriate place to file a bug report or complaint is with the developers of the clang
static analyzer.
-
JSONKit is not designed to be used with the Mac OS X Garbage Collection. The behavior of JSONKit when compiled with
-fobj-gc
isundefined
. It is extremely unlikely that Mac OS X Garbage Collection will ever be supported. -
The JSON to be parsed by JSONKit MUST be encoded as Unicode. In the unlikely event you end up with JSON that is not encoded as Unicode, you must first convert the JSON to Unicode, preferably as
UTF8
. One way to accomplish this is with theNSString
methods-initWithBytes:length:encoding:
and-initWithData:encoding:
. -
Internally, the low level parsing engine uses
UTF8
exclusively. TheJSONDecoder
method-objectWithData:
takes aNSData
object as its argument and it is assumed that the raw bytes contained in theNSData
isUTF8
encoded, otherwise the behavior isundefined
. -
It is not safe to use the same instantiated
JSONDecoder
object from multiple threads at the same time. If you wish to share aJSONDecoder
between threads, you are responsible for adding mutex barriers to ensure that only one thread is decoding JSON using the sharedJSONDecoder
object at a time.
-
Enable the
NS_BLOCK_ASSERTIONS
pre-processor flag. JSONKit makes heavy use ofNSCParameterAssert()
internally to ensure that various arguments, variables, and other state contains only legal and expected values. If an assertion check fails it causes a run time exception that will normally cause a program to terminate. These checks and assertions come with a price: they take time to execute and do not contribute to the work being performed. It is perfectly safe to enableNS_BLOCK_ASSERTIONS
as JSONKit always performs checks that are required for correct operation. The checks performed withNSCParameterAssert()
are completely optional and are meant to be used in "debug" builds where extra integrity checks are usually desired. While your mileage may vary, the author has found that adding-DNS_BLOCK_ASSERTIONS
to an-O2
optimization setting can generally result in an approximate 7-12% increase in performance. -
Since the very low level parsing engine works exclusively with
UTF8
byte streams, anything that is not already encoded asUTF8
must first be converted toUTF8
. While JSONKit provides additions to theNSString
class which allows you to conveniently convert JSON contained in aNSString
, this convenience does come with a price. JSONKit must allocate an autoreleasedNSMutableData
large enough to hold the stringsUTF8
conversion and then convert the string toUTF8
before it can begin parsing. This takes both memory and time. Once the parsing has finished, JSONKit sets the autoreleasedNSMutableData
length to0
, which allows theNSMutableData
to release the memory. This helps to minimize the amount of memory that is in use but unavailable until the autorelease pool pops. Therefore, if speed and memory usage are a priority, you should avoid using theNSString
convenience methods if possible. -
If you are receiving JSON data from a web server, and you are able to determine that the raw bytes returned by the web server is JSON encoded as
UTF8
, you should use theJSONDecoder
method-objectWithUTF8String:length:
which immediately begins parsing the pointers bytes. In practice, every JSONKit method that converts JSON to an Objective-C object eventually calls this method to perform the conversion. -
If you are using one of the various ways provided by the
NSURL
family of classes to receive JSON results from a web server, which typically return the results in the form of aNSData
object, and you are able to determine that the raw bytes contained in theNSData
are encoded asUTF8
, then you should use either theJSONDecoder
methodobjectWithData:
or theNSData
method-objectFromJSONData
. If are going to be converting a lot of JSON, the better choice is to instantiate aJSONDecoder
object once and use the same instantiated object to perform all your conversions. This has two benefits:- The
NSData
method-objectFromJSONData
creates an autoreleasedJSONDecoder
object to perform the one time conversion. By instantiating aJSONDecoder
object once and using theobjectWithData:
method repeatedly, you can avoid this overhead. - The instantiated object cache from the previous JSON conversion is reused. This can result in both better performance and a reduction in memory usage if the JSON your are converting is very similar. A typical example might be if you are converting JSON at periodic intervals that consists of real time status updates.
- The
-
On average, the
JSONData…
methods are nearly four times faster than theJSONString…
methods when serializing aNSDictionary
orNSArray
to JSON. The difference in speed is due entirely to the instantiation overhead ofNSString
.
The objectWith…
methods return immutable collection objects and their respective mutableObjectWith…
methods return mutable collection objects.
Note: The bytes contained in a NSData
object MUST be UTF8
encoded.
Important: Methods will raise NSInvalidArgumentException
if parseOptionFlags
is not valid.
Important: objectWithUTF8String:
and mutableObjectWithUTF8String:
will raise NSInvalidArgumentException
if string
is NULL
.
Important: objectWithData:
and mutableObjectWithData:
will raise NSInvalidArgumentException
if jsonData
is NULL
.
+ (id)decoder; + (id)decoderWithParseOptions:(JKParseOptionFlags)parseOptionFlags; - (id)initWithParseOptions:(JKParseOptionFlags)parseOptionFlags; - (void)clearCache; - (id)objectWithUTF8String:(const unsigned char *)string length:(size_t)length; - (id)objectWithUTF8String:(const unsigned char *)string length:(size_t)length error:(NSError **)error; - (id)mutableObjectWithUTF8String:(const unsigned char *)string length:(size_t)length; - (id)mutableObjectWithUTF8String:(const unsigned char *)string length:(size_t)length error:(NSError **)error; - (id)objectWithData:(NSData *)jsonData; - (id)objectWithData:(NSData *)jsonData error:(NSError **)error; - (id)mutableObjectWithData:(NSData *)jsonData; - (id)mutableObjectWithData:(NSData *)jsonData error:(NSError **)error;
- (id)objectFromJSONString; - (id)objectFromJSONStringWithParseOptions:(JKParseOptionFlags)parseOptionFlags; - (id)objectFromJSONStringWithParseOptions:(JKParseOptionFlags)parseOptionFlags error:(NSError **)error; - (id)mutableObjectFromJSONString; - (id)mutableObjectFromJSONStringWithParseOptions:(JKParseOptionFlags)parseOptionFlags; - (id)mutableObjectFromJSONStringWithParseOptions:(JKParseOptionFlags)parseOptionFlags error:(NSError **)error;
- (id)objectFromJSONData; - (id)objectFromJSONDataWithParseOptions:(JKParseOptionFlags)parseOptionFlags; - (id)objectFromJSONDataWithParseOptions:(JKParseOptionFlags)parseOptionFlags error:(NSError **)error; - (id)mutableObjectFromJSONData; - (id)mutableObjectFromJSONDataWithParseOptions:(JKParseOptionFlags)parseOptionFlags; - (id)mutableObjectFromJSONDataWithParseOptions:(JKParseOptionFlags)parseOptionFlags error:(NSError **)error;
Parsing Option | Description |
---|---|
JKParseOptionNone | This is the default if no other other parse option flags are specified, and the option used when a convenience method does not provide an argument for explicitly specifying the parse options to use. Synonymous with JKParseOptionStrict . |
JKParseOptionStrict | The JSON will be parsed in strict accordance with the RFC 4627 specification. |
JKParseOptionComments | Allow C style // and /* … */ comments in JSON. This is a fairly common extension to JSON, but JSON that contains C style comments is not strictly conforming JSON. |
JKParseOptionUnicodeNewlines | Allow Unicode recommended (?:\r\n|[\n\v\f\r\x85\p{Zl}\p{Zp}]) newlines in JSON. The JSON specification only allows the newline characters \r and \n , but this option allows JSON that contains the Unicode recommended newline characters to be parsed. JSON that contains these additional newline characters is not strictly conforming JSON. |
JKParseOptionLooseUnicode | Normally the decoder will stop with an error at any malformed Unicode. This option allows JSON with malformed Unicode to be parsed without reporting an error. Any malformed Unicode is replaced with \uFFFD , or REPLACEMENT CHARACTER , as specified in The Unicode 6.0 standard, Chapter 3, section 3.9 Unicode Encoding Forms. |
JKParseOptionPermitTextAfterValidJSON | Normally, white-space that follows the JSON is interpreted as a parsing failure. This option allows for any trailing white-space to be ignored and not cause a parsing error. |
Note: The bytes contained in the returned NSData
object is UTF8
encoded.
- (NSData *)JSONData; - (NSData *)JSONDataWithOptions:(JKSerializeOptionFlags)serializeOptions error:(NSError **)error; - (NSString *)JSONString; - (NSString *)JSONStringWithOptions:(JKSerializeOptionFlags)serializeOptions error:(NSError **)error;
Serializing Option | Description |
---|---|
JKSerializeOptionNone | This is the default if no other other serialize option flags are specified, and the option used when a convenience method does not provide an argument for explicitly specifying the serialize options to use. |
JKSerializeOptionPretty | Normally the serialized JSON does not include any unnecessary white-space . While this form is the most compact, the lack of any white-space means that it's something only another JSON parser could love. Enabling this option causes JSONKit to add additional white-space that makes it easier for people to read. Other than the extra white-space , the serialized JSON is identical to the JSON that would have been produced had this option not been enabled. |
JKSerializeOptionEscapeUnicode | When JSONKit encounters Unicode characters in NSString objects, the default behavior is to encode those Unicode characters as UTF8 . This option causes JSONKit to encode those characters as \uXXXX . For example,["w∈L⟺y(∣y∣≤∣w∣)"] becomes: ["w\u2208L\u27fa\u2203y(\u2223y\u2223\u2264\u2223w\u2223)"] |