Snowflake is needed as part of Twitter's move from MySQL to Cassandra, because Cassandra has no built-in way of generating unique IDs. The new Tweet IDs are unique 64-bit unsigned integers. Instead of being assigned sequentially like the current IDs, they are based on time. The full ID is composed of a timestamp, a worker number, and a sequence number.
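To make the composition concrete, here is a minimal Python sketch of packing and unpacking such an ID. The field widths (41-bit timestamp, 10-bit worker number, 12-bit sequence number) are an assumption based on the commonly cited Snowflake layout and are not stated above. The function names are hypothetical.

```python
# Assumed field widths (not given in the text above): 41 + 10 + 12 = 63 bits,
# which fits in a 64-bit unsigned integer.
TIMESTAMP_BITS = 41
WORKER_BITS = 10
SEQUENCE_BITS = 12


def compose_id(timestamp_ms: int, worker: int, sequence: int) -> int:
    """Pack the three fields into one integer, timestamp in the high bits."""
    if not 0 <= timestamp_ms < (1 << TIMESTAMP_BITS):
        raise ValueError("timestamp out of range")
    if not 0 <= worker < (1 << WORKER_BITS):
        raise ValueError("worker number out of range")
    if not 0 <= sequence < (1 << SEQUENCE_BITS):
        raise ValueError("sequence number out of range")
    return (
        (timestamp_ms << (WORKER_BITS + SEQUENCE_BITS))
        | (worker << SEQUENCE_BITS)
        | sequence
    )


def decompose_id(snowflake_id: int) -> tuple[int, int, int]:
    """Split an ID back into (timestamp_ms, worker, sequence)."""
    sequence = snowflake_id & ((1 << SEQUENCE_BITS) - 1)
    worker = (snowflake_id >> SEQUENCE_BITS) & ((1 << WORKER_BITS) - 1)
    timestamp_ms = snowflake_id >> (WORKER_BITS + SEQUENCE_BITS)
    return timestamp_ms, worker, sequence
```

Because the timestamp occupies the high bits, an ID generated in a later millisecond always compares greater than one from an earlier millisecond, whichever worker produced it. That is why the IDs sort roughly by time without being strictly sequential.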
Snowflake Problems and Solutions
Some languages, notably JavaScript, cannot represent integers larger than 53 bits exactly. So that these IDs can still be read safely, Twitter provides a string version of every ID in its JSON responses. This means that Status, User, Direct Message, and Saved Search IDs in the main Twitter API, the Streaming API, and the Search API will be returned as both an integer and a string in JSON responses.
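As a rough illustration, a client can prefer the string form whenever it is present. The payload below is a made-up fragment that follows the `id` / `id_str` naming pattern, and `get_id` is a hypothetical helper, not part of any Twitter library.

```python
import json

# Made-up response fragment; the ID value is illustrative only.
payload = '{"id": 10765432100123456789, "id_str": "10765432100123456789"}'


def get_id(obj: dict, field: str = "id") -> str:
    """Return the ID as a string, preferring the `<field>_str` version."""
    str_field = f"{field}_str"
    if str_field in obj:
        return obj[str_field]
    # Fall back to the numeric field for responses that lack the string form.
    return str(obj[field])


status = json.loads(payload)
status_id = get_id(status)
```

Keeping the ID as an opaque string throughout the application avoids any dependence on how a given language stores large integers.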
Twitter has pushed back the timeline for the Snowflake launch (string versions of ID numbers will start appearing in API responses on Friday). In the meantime, here are the steps Twitter advises developers to take "RIGHT NOW":
Decode the JSON snippet given in the announcement and observe the output. Then:
- If your code converts the ID successfully without losing accuracy, you are OK, but you should consider switching to the _str versions of IDs as soon as possible.
- If your code has lost accuracy, switch to the _str versions immediately. If you do not, your code will be unable to interact with the Twitter API reliably.
- Some languages' JSON parsers may throw an exception when reading the ID value. If this happens in your parser, you will need to 'pre-parse' the data, removing the numeric ID parameters or replacing them with their _str versions.
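The checks above can be sketched in Python. Python's own `json` module parses large integers exactly, so this sketch emulates a double-based parser by passing `parse_int=float`. Doubles represent integers exactly only up to 2^53. The payload, the function names, and the regular expression are illustrative assumptions. The regex is deliberately naive and would also rewrite matching text inside string values.

```python
import json
import re

# Made-up payload in the id / id_str shape; not taken from a real response.
payload = '{"id": 10765432100123456789, "id_str": "10765432100123456789"}'


def loses_precision(raw: str) -> bool:
    """Parse integers as doubles and compare the result with id_str."""
    doc = json.loads(raw, parse_int=float)
    return int(doc["id"]) != int(doc["id_str"])


# Matches "id" or "<something>_id" keys whose value is a bare integer.
ID_FIELD = re.compile(r'"(id|\w+_id)"\s*:\s*(\d+)')


def quote_ids(raw: str) -> str:
    """Pre-parse step: wrap numeric ID values in quotes so they load as strings."""
    return ID_FIELD.sub(r'"\1": "\2"', raw)


if loses_precision(payload):
    # Quote the numeric IDs before handing the text to the lossy parser.
    safe = json.loads(quote_ids(payload), parse_int=float)
else:
    safe = json.loads(payload)
```

After the pre-parse step, `safe["id"]` is a string and matches `safe["id_str"]`, so nothing downstream touches a lossy number.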
November 26 is now the target date for Status IDs to exceed 53 bits.