Development | 5 min read | 7.09.2018

What can’t NoSQL do – first experience with Google Cloud Datastore

Jakub Kluczewski

Java Developer

NoSQL Database Application – advice on the use of Google Cloud Datastore

Some years ago we faced the challenge of evolving the way we build backends so that they could handle both classic enterprise systems and mobile apps used by thousands of users.

Choosing the type of database that will operate “behind the scenes” of a system is often a key issue, and the right choice is crucial to the project’s success. At ItCraft app development company, we chose non-relational solutions, which resulted in a new NoSQL-specialised branch for our Cloud technologies.

Entering the world of non-relational architecture can be tricky for programmers used to classic relational database systems, so let’s look back to the beginnings and share some first-steps advice based on Google Cloud Datastore.

Not Only SQL

The SQL-like query interface in Google Cloud Datastore (GQL) is useful and feels natural: you don’t need to know the underlying structure to write simple queries and get the expected results.

An initial read of the documentation tells you that queries are written in a similar way to classic SQL. This can be deceiving: a programmer may think the code will be no different from their usual solutions and that the advantages lie purely under the hood.

This view can put a developer in a difficult, and sometimes impossible, position.

Forget JOINs

“The mobile app shows a list of articles, each containing the author’s name and profile picture.”

In a relational solution, a programmer sees a query on the articles table with a join to the authors table. A programmer writing code against Google Cloud Datastore may even try to run a “classic” query like that. It will not execute.

Datastore is a document database with indexed entities. A single query can address only one kind of entity, so there is no way to pull articles and authors together in the same query.
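One common way around the missing JOIN is to do the join in application code: query the articles, collect the author keys, fetch the authors in one batch by key, and combine the two. Below is a minimal sketch of that idea using plain Java collections as a stand-in for the database. The class, field and method names (`ArticleFeed`, `authorKey`, `buildFeed`) are assumptions made for illustration, not part of any Datastore API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// In-memory stand-in for two Datastore kinds; all names here are hypothetical.
final class ArticleFeed {

    static final class Article {
        final String title;
        final String authorKey; // key of the related Author entity
        Article(String title, String authorKey) {
            this.title = title;
            this.authorKey = authorKey;
        }
    }

    static final class Author {
        final String name;
        final String pictureUrl;
        Author(String name, String pictureUrl) {
            this.name = name;
            this.pictureUrl = pictureUrl;
        }
    }

    // 'articles' plays the role of the result of a query on the Article kind only.
    // 'authorsByKey' plays the role of the Author kind, addressed by key.
    static List<String> buildFeed(List<Article> articles, Map<String, Author> authorsByKey) {
        // Collect each author key once, so every author is looked up a single time.
        Set<String> keys = new LinkedHashSet<>();
        for (Article article : articles) {
            keys.add(article.authorKey);
        }

        // Stand-in for one batch lookup by key.
        Map<String, Author> fetched = new HashMap<>();
        for (String key : keys) {
            Author author = authorsByKey.get(key);
            if (author != null) {
                fetched.put(key, author);
            }
        }

        // The "join" itself happens here, in application code.
        List<String> rows = new ArrayList<>();
        for (Article article : articles) {
            Author author = fetched.get(article.authorKey);
            if (author == null) {
                rows.add(article.title + " by unknown author");
            } else {
                rows.add(article.title + " by " + author.name + " (" + author.pictureUrl + ")");
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        Map<String, Author> authors = new HashMap<>();
        authors.put("a1", new Author("Ann", "ann.png"));
        List<Article> articles = Arrays.asList(
                new Article("Intro", "a1"),
                new Article("Follow-up", "a1"));
        for (String row : buildFeed(articles, authors)) {
            System.out.println(row);
        }
    }
}
```

The other usual option is to denormalise, that is, to copy the author's name and picture URL into each article entity so that one query is enough. The price is having to update those copies whenever the author changes.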

Scalability comes with restrictions

The price for massive scalability of read operations is that only queries whose cost scales with the size of the returned data set are allowed. Joins can’t be executed, filtering on more than one field is heavily restricted (at the time of writing, inequality filters could apply to only one property per query, and multi-property filters may need a composite index), and filtering with subqueries is not possible.

What do we gain by agreeing to these seemingly broad restrictions?

Mass query scalability means that performance depends on the size of the returned data set, not on the total amount of data in the database. If we show several dozen articles on one page, as in our example, the execution time will be roughly the same whether the table holds thousands or millions of articles.
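To see why the cost follows the result size, it helps to picture a query as a scan over a sorted index that starts at the first matching entry and stops once enough rows are collected. The sketch below is a toy model built on a `TreeSet`, not Datastore's real storage format. The `category|id` entry layout and the names `IndexScanDemo`, `buildIndex` and `query` are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Toy model of a sorted index; not Datastore's real storage format.
final class IndexScanDemo {

    static final class Result {
        final List<String> rows;
        final int entriesRead;
        Result(List<String> rows, int entriesRead) {
            this.rows = rows;
            this.entriesRead = entriesRead;
        }
    }

    // Index entries are strings of the form "category|id", kept in sorted order.
    static NavigableSet<String> buildIndex(int total) {
        NavigableSet<String> index = new TreeSet<>();
        for (int i = 0; i < total; i++) {
            String category = (i % 2 == 0) ? "news" : "sport";
            index.add(category + "|" + String.format("%09d", i));
        }
        return index;
    }

    // Equality filter on category: jump to the first matching entry,
    // then read forward until the limit is reached or the prefix no longer matches.
    static Result query(NavigableSet<String> index, String category, int limit) {
        String prefix = category + "|";
        List<String> rows = new ArrayList<>();
        int entriesRead = 0;
        for (String entry : index.tailSet(prefix, true)) {
            if (rows.size() == limit || !entry.startsWith(prefix)) {
                break;
            }
            entriesRead++;
            rows.add(entry);
        }
        return new Result(rows, entriesRead);
    }

    public static void main(String[] args) {
        Result small = query(buildIndex(1_000), "sport", 20);
        Result large = query(buildIndex(100_000), "sport", 20);
        System.out.println("entries read, small index: " + small.entriesRead);
        System.out.println("entries read, large index: " + large.entriesRead);
    }
}
```

In this model the scan reads the same number of entries whether the index holds a thousand or a hundred thousand rows. Only the jump to the starting point depends on the total size, and that is a logarithmic tree lookup.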

Pagination – The farther you go…

Typically, pagination of queries is done by using OFFSET and LIMIT features. Although both features are supported by Google Cloud Datastore, using OFFSET in pagination is not a good idea.

Browsing the pages of such a query will return correct results, but execution will slow down with each page. The cause lies in how data is laid out in the database. Querying for results starting from, e.g., the 100th position using OFFSET still means walking the index from the beginning, including the first 100 positions that are then skipped.

For a database containing hundreds of thousands or millions of entities, the need to read data from the beginning of the table for each page makes this type of pagination impractical.

Cursor technology to the rescue! A cursor marks the place in the index where the previous query ended, so the next query can resume from there. This complicates building a classic paged table with page numbers, counters or jumping to an arbitrary page on demand, but it is fully sufficient for the common object lists in mobile apps.
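The difference between the two approaches can be sketched with an in-memory sorted set standing in for the index. This is an illustration under simplifying assumptions. Real Datastore cursors are opaque tokens returned with the query results, whereas here the "cursor" is simply the last key of the previous page, and the names `PaginationDemo`, `pageByOffset` and `pageByCursor` are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Toy comparison of OFFSET-based and cursor-based paging over a sorted index.
final class PaginationDemo {

    static final class Page {
        final List<Integer> items;
        final int entriesScanned;
        final Integer cursor; // last key returned, or null if the page is empty
        Page(List<Integer> items, int entriesScanned) {
            this.items = items;
            this.entriesScanned = entriesScanned;
            this.cursor = items.isEmpty() ? null : items.get(items.size() - 1);
        }
    }

    // OFFSET paging: always starts at the beginning and skips 'offset' entries.
    static Page pageByOffset(NavigableSet<Integer> index, int offset, int limit) {
        List<Integer> items = new ArrayList<>();
        int scanned = 0;
        for (Integer key : index) {
            if (items.size() == limit) {
                break;
            }
            scanned++;
            if (scanned > offset) {
                items.add(key);
            }
        }
        return new Page(items, scanned);
    }

    // Cursor paging: resumes right after the last key of the previous page.
    static Page pageByCursor(NavigableSet<Integer> index, Integer cursor, int limit) {
        NavigableSet<Integer> rest = (cursor == null) ? index : index.tailSet(cursor, false);
        List<Integer> items = new ArrayList<>();
        int scanned = 0;
        for (Integer key : rest) {
            if (items.size() == limit) {
                break;
            }
            scanned++;
            items.add(key);
        }
        return new Page(items, scanned);
    }

    public static void main(String[] args) {
        NavigableSet<Integer> index = new TreeSet<>();
        for (int i = 0; i < 1_000; i++) {
            index.add(i);
        }
        Page byOffset = pageByOffset(index, 100, 10);
        Page byCursor = pageByCursor(index, 99, 10);
        System.out.println("offset scan: " + byOffset.entriesScanned + " entries");
        System.out.println("cursor scan: " + byCursor.entriesScanned + " entries");
    }
}
```

Both calls return the same ten items, but the offset version touches every skipped entry on the way, so its work grows with the page number. The cursor version only touches the entries it returns.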

“- Can we use JPA?
– Great idea!”

Not so much…

A programmer exploring new territory is often tempted to fit unknown elements into known and familiar mechanisms.

Trying to squeeze a NoSQL database into a relational framework such as JPA is a good example of complicating things and making them unnecessarily time-consuming. Although it is possible, and somewhat enigmatically described in the product documentation, it would not yield tangible benefits.

Configuration usually comes with multiple restrictions and warnings along the lines of “this will not work” and “this will work differently”. The “standard” we get this way is a creature that looks familiar at first but soon bites you in the behind.

These few examples don’t exhaust the differences between Datastore and the “classic” relational approach. As with any new, unknown technology, an article will not replace actually writing code yourself. Still, familiarising yourself with some basic rules and differences will lower the entry threshold and speed up the learning process.

ItCraft’s experience (a few years back) of introducing this branch of backend technologies taught us that they can be tricky, if not outright hard. It seems like a good idea to give developers plenty of warning before they start kicking in an open door.
