MetaModel

Apache MetaModel is a library that provides a common interface for exploring metadata and for discovering and querying different kinds of data sources. In short, MetaModel is not a data-mapping framework; instead, it emphasizes abstraction of metadata. It also allows data sources to be added at runtime, which makes it well suited to data-processing applications.

With MetaModel you get a stable query API and connectors for relational databases such as MySQL, Oracle, SQL Server, SQL Server Express, and PostgreSQL, as well as Apache Hive, embedded databases, spreadsheets, Cassandra, CSV files, and more.

Problems

Clients use Apache MetaModel to abstract their database access. It works well and efficiently for SQL databases (JDBC), but many clients found that it was not as suitable for Cassandra, HBase, and other NoSQL databases. An abstraction was needed that allows implementation-specific choices based on the database type.

There was a second problem: the code had to produce results that look the same as a MetaModel result set, whether the backend is pure MetaModel (for an RDBMS via JDBC) or a custom NoSQL implementation.

The project can be considered successful only once CRUD functions have been demonstrated against Postgres, Oracle, SQL Server, Cassandra, and HBase, with performance figures reported.

Need

Cassandra and HBase need to be populated with at least 100k rows and tested in order to validate read/select performance. Solving the problems above called for a strong Java development team that could handle the work quickly, had good exposure to libraries such as MetaModel, and could validate the results against the requirements above.

Design

The metadata-abstraction beans we developed provide CRUD functions along with the DataContext elements and table creation, dropping, and modification. The design produces the same results (a MetaModel DataSet) that MetaModel itself supports. We call these beans "Mesh CRUD"; they wrap or extend the MetaModel functions.
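
As a rough illustration of the idea, here is a minimal sketch of what such a wrapper contract could look like. The names (`MeshCrud`, `InMemoryMeshCrud`) and the row shape are assumptions for illustration only; the real beans wrap MetaModel's own DataContext and DataSet types.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: one CRUD contract for every backend, whether it is
// JDBC via MetaModel or a custom NoSQL implementation. Each backend returns
// rows in the same common shape, mirroring a MetaModel DataSet.
interface MeshCrud {
    void create(String table, Map<String, Object> row);
    List<Map<String, Object>> read(String table);
    void update(String table, String key, Object id, Map<String, Object> changes);
    void delete(String table, String key, Object id);
}

// Minimal in-memory implementation, used here only to show the contract.
class InMemoryMeshCrud implements MeshCrud {
    private final Map<String, List<Map<String, Object>>> tables = new HashMap<>();

    public void create(String table, Map<String, Object> row) {
        tables.computeIfAbsent(table, t -> new ArrayList<>()).add(new HashMap<>(row));
    }

    public List<Map<String, Object>> read(String table) {
        return tables.getOrDefault(table, new ArrayList<>());
    }

    public void update(String table, String key, Object id, Map<String, Object> changes) {
        for (Map<String, Object> row : read(table)) {
            if (id.equals(row.get(key))) {
                row.putAll(changes);
            }
        }
    }

    public void delete(String table, String key, Object id) {
        read(table).removeIf(row -> id.equals(row.get(key)));
    }
}
```

Because every backend implements the same interface, the calling application never needs to know which database type sits behind it.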

Reads

The code we implemented within these functions retrieves the tenant_id from the connection text: a method on the custom context classes retrieves the tenant_id (as a stub).

Application developers do not write this themselves; it happens inside the bean/function.
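
A minimal sketch of such a stub might look like the following. The class name, the `tenant_id=<value>` convention, and the semicolon-separated connection text are assumptions for illustration; the real stub lives on the custom context classes.

```java
// Hypothetical sketch of the tenant_id stub described above: the custom
// context class extracts tenant_id from the connection text once, so
// application code never deals with it directly.
class TenantAwareContext {
    private final String tenantId;

    TenantAwareContext(String connectionText) {
        this.tenantId = extractTenantId(connectionText);
    }

    // Stub: pull "tenant_id=<value>" out of a semicolon-separated string.
    static String extractTenantId(String text) {
        for (String part : text.split(";")) {
            String[] kv = part.split("=", 2);
            if (kv.length == 2 && kv[0].trim().equals("tenant_id")) {
                return kv[1].trim();
            }
        }
        return null; // no tenant_id present
    }

    String getTenantId() {
        return tenantId;
    }
}
```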

Which database connections a customer may use depends on the licensing options they choose; only certain connections are available, and those are limited to a defined list. Our routines first check the connection type against the list of acceptable databases, which includes SQL Server, SQL Server Express, MySQL, Oracle, HBase, Postgres, and MongoDB.
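
A sketch of that check, assuming the licensed database types arrive as a simple list of names (the class and method names here are illustrative, not the actual routine):

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch of the license check described above: the set of
// acceptable database types comes from the customer's licensing options,
// and every requested connection type is validated against it first.
class ConnectionLicenseCheck {
    private final Set<String> licensedTypes = new HashSet<>();

    ConnectionLicenseCheck(String... types) {
        for (String t : types) {
            licensedTypes.add(t.toLowerCase(Locale.ROOT));
        }
    }

    // Case-insensitive membership test against the licensed list.
    boolean isAllowed(String dbType) {
        return licensedTypes.contains(dbType.toLowerCase(Locale.ROOT));
    }
}
```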

Solution

A better solution was needed for the two databases, one with increased performance and results matching what the application expects. We carried out R&D on the existing code to establish its performance, credibility, and efficiency; this was one of the main reasons for changing the MetaModel application. For SQL contexts, everything other than the tenant_id handling is straight MetaModel, so no custom NoSQL-style implementation is needed there.

Furthermore, clients cannot use the Phoenix driver for HBase or a custom driver for Cassandra for filtering, so it is important to prove our performance. To do this we use one SQL database (Postgres or MySQL) together with HBase and Cassandra, each holding at least 100k rows of data. With these changes in place we can gather statistics by running CRUD queries against these functions, accounting for validation time and working through different kinds of selects on different keys.
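
The timing figures in the Results section could be gathered with a harness along these lines; this is only a sketch under the assumption that a query run can be wrapped in a callable returning its row count.

```java
import java.util.function.Supplier;

// Hypothetical timing harness: run a query, capture elapsed time
// in milliseconds and the number of rows it returned.
class QueryTimer {
    // Returns { elapsedMillis, rowCount }.
    static long[] time(Supplier<Integer> query) {
        long start = System.nanoTime();
        int rowCount = query.get();
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        return new long[] { elapsedMs, rowCount };
    }
}
```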

The composite DataContext is another challenge in scope, and we handle it for up to two connections. The custom Cassandra and HBase extensions were developed with a plug-in mentality, which means the same plug-in approach can later be used for Mongo and others without rewriting any code. This matters a great deal.
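
One way to picture the plug-in mentality is a small factory registry, sketched below. The class names and the use of plain `Object` for the created context are simplifications; the real extensions produce MetaModel-compatible DataContext objects.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of the plug-in approach described above: each backend
// (Cassandra, HBase, later Mongo) registers a factory keyed by its type
// name, so a new database plugs in without rewriting existing code.
class DataContextPlugins {
    private final Map<String, Function<String, Object>> factories = new HashMap<>();

    void register(String dbType, Function<String, Object> factory) {
        factories.put(dbType, factory);
    }

    Object create(String dbType, String connectionString) {
        Function<String, Object> factory = factories.get(dbType);
        if (factory == null) {
            throw new IllegalArgumentException("No plug-in registered for " + dbType);
        }
        return factory.apply(connectionString);
    }
}
```

Adding Mongo support later is then just one more `register` call; no existing plug-in is touched.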

Conclusion

The R&D changes to the existing code have made MetaModel more useful for clients. It now works well with Cassandra, HBase, and other NoSQL databases. The problem of result shape is also solved: results come back in the same form as a MetaModel data set, regardless of whether the backend is original MetaModel or a custom NoSQL implementation. Cassandra performance is now much more efficient, which marks the project as a success.

Results

The test results below show the effect of the changes made to the existing code on Cassandra performance.

MM Cassandra K2A - dev performance test

Test-1

1. Observation taken for selecting all columns from table: servicerequest

SQL query:

SELECT * FROM key2act.servicerequest;

CustomDataContext query:

customDc.query().from("servicerequest").selectAll().execute();

Results (execution time / row count):

Run 1:
  MetaModel:          4682 / 39
  Our implementation: 3668 / 39
  Target:             3867 / 39

Run 2:
  MetaModel:          4451 / 39
  Our implementation: 4202 / 39
  Target:             4201 / 39

Test-2

1. Observation taken for selecting two columns with a single where-condition clause from table: servicerequest

SQL query:

SELECT site, tenantid FROM key2act.servicerequest WHERE priority='URG' ALLOW FILTERING;

CustomDataContext query:

customDc.query().from("servicerequest").select("site").and("tenantid").where("priority").isEquals("URG").execute();

Results (execution time / row count):

Run 1:
  MetaModel:          1237 / 39
  Our implementation: 1184 / 39
  Target:             1231 / 39

Run 2:
  MetaModel:          1300 / 39
  Our implementation: 1165 / 39
  Target:             1200 / 39