Thursday, 24 May 2012

MARKLOGIC: How to use MLSQL Xquery Library for Marklogic


MLSQL: An XQuery Library for Relational Database Access
From Jason Hunter

Introduction

MLSQL is an open source XQuery library (written by Jason Hunter of MarkLogic and Ryan Grimm of O'Reilly Media) that allows easy access to relational database systems from within the MarkLogic environment. MLSQL lets you execute arbitrary SQL commands against any relational database (MySQL, Oracle, DB2, etc) and it captures the results as XML for processing within the MarkLogic environment. Using the library enables XQuery applications to leverage a relational database without having to resort to Java or C# glue code to manage the interaction.
This article introduces MLSQL, its architecture and API, and demonstrates a few real life applications built on the library.

Architecture

MLSQL consists of a couple hundred lines of XQuery and a few hundred lines of Java servlet code. The XQuery uses xdmp:http-post() to send the SQL query to the servlet, and the servlet uses JDBC to pass the query to the relational database. Any database with JDBC support will do. The servlet then returns the result set or update details as XML over the wire to the XQuery environment.
The servlet holds the database credentials, connection details, and manages a connection pool for efficiency. The XQuery can pass "bind parameters" to the database as part of the query for efficiency and safety. The diagram below shows the architecture. 

Installation and configuration details follow at the end of this article.

Credentials and Security

For simplicity in MLSQL 1.0 we've opted to encapsulate the database details inside the servlet. An alternate design would allow the XQuery to pass the details, such as in the query string. If you're interested in this, talk to us. Note that for database security, you should restrict access to the MLSQL web service to only the trusted MarkLogic host. Tools like tcpwrappers makes this easy.

MySpace Begins from here

First of all I want to thank Jason Hunter, developer of this technology. One of the below reasons you may need MLSQL in your Application,
·      If you have one application in Marklogic and another one in Relational Database for same purpose. So you need synchronization between these two databases.

Installation

For installation you need following applications in your machine,
  •         JDK (1.6 or any advanced version)
  •          Tomcat Server(6.0 or any advanced version)
  •          Marklogic Server 5.0
  •          MySql Database( or any relational database)
  •          Winant (ant builder software)
Prior to this installation you should have,
  •        In Marklogic
a.       Your application Database(TestDatabase)
b.      Application Server(TestHttpServer)
  •         In MySql
a.       Your application Database(TestDatabase)
b.      Table which you want to update from ML(TestTable)
·         Download MLSQL source from
Step 1
                Copy the C:\Folderlocation\mlsql-src-1.1\client\ sql.xqy library module to your XQuery code(Root directory which mentioned in application server). This usually requires copying the file into your HTTP server root directory, but depending on configuration might require loading into the server's modules database.
Step 2
Installing servlet code,
  • Run the “ant” command in the command prompt by using following steps
  • Go to extracted folder location (i.e. C:\Folderlocation\mlsql-src-1.1\ )
  • Now type “ant” and press enter
  • Now you can get see one additional folder named “C:\Folderlocation\mlsql-src-1.1\ buildtmp
  •  Copy “C:\Folderlocation\mlsql-src-1.1\ buildtmp\mlsql” and paste it into “C:\Tomcat60\webapps”.
Step 3
  • Provide database credentials and connection details.
  • The web application includes a web.xml file with init parameters the servlet will read in order to connect to the database. Adjust these as appropriate for your system.
  • You'll need to set the JDBC driver, connection URL, username, and password. The web.xml file includes instructions.
Step 4
    Make sure to add your JDBC driver JAR files to the WEB-INF/lib directory. Driver libraries aren't usually licensed for redistribution, so you'll have to obtain them on your own.
Step 5 
     test the servlet's URL. Try making a basic web request to the MLSQL servlet. You'll know it's listening and you've found it when you get a <sql:exception> talking about "Premature end of file". That means you didn't provide a query, but at least you found the servlet.
Step 6
  •  Test the XQuery connection. Write a basic query like this:
import module namespace sql  = "http://xqdev.com/sql" at "sql.xqy"
sql:execute("select 1 as test", "http://localhost:8080/mlsql", ())
  •  You might need to adjust the "at" depending on where you placed the library module. If it works you'll see this result:
<sql:result xml:lang="en" xmlns:sql="http://xqdev.com/sql">
  <sql:meta/>
  <sql:tuple>
    <test>1</test>
  </sql:tuple>
</sql:result>

 What Next?

 Now you can View, Create, Update and Delete records from your Relational database by using “MLSQL Module”. I will share the sample code for this operations one by one before that I want to mention some details on sample code
·         Localhost – indicates your local machine details
·     Import Module – I have the sql.xqy file in the same directory(Note: You might need to  adjust the "at" depending on where you placed the library module)
·         Test – Relational DB Table Name
·      In sql:execute(SQLQuery,URL,Options) function – Maintain following parameters for    this function
o   SQLQuery – SQL query for your requirement
o   URL – Pointing the Servlet which is placed in the Tomcat server
o   Options – The third argument to sql:execute() is an XML options node that can include various things including a list of bind parameter values.

View records query

                 Using this XQuery you can view all of your records from Relational DB. Note: If you go for large table you have to face performance issues.
Sample.xqy
import module namespace sql  = "http://xqdev.com/sql" at "sql.xqy"
sql:execute("select  * from test", "http://localhost:8080/mlsql", ())
Results
<sql:result xmlns:sql="http://xqdev.com/sql">
  <sql:meta/>
  <sql:tuple>
    <id>1</id>
    <fname>Jason</fname>
    <lname>Hunter</lname>
    <age>32</age>
  </sql:tuple>
  <sql:tuple>
    <id>1235</id>
    <fname>Ryan</fname>
    <lname>Grimm</lname>
    <age null="true"></age>
  </sql:tuple>
</sql:result>
 
  
Once you got this result in Marklogic you can reconstruct the Result XML as per your Requirement and you can change the SQL Query parameter also.

Bind Variables with your query

                 By using the third parameter of sql:execute() you can bind variables with you SQL query.
Sample.xqy
import module namespace sql  = "http://xqdev.com/sql" at "sql.xqy"
sql:execute("select  * from test where id = ?", "http://localhost:8080/mlsql", sql:params(1235) )
Results
<sql:result xmlns:sql="http://xqdev.com/sql">
  <sql:meta/>
  <sql:tuple>
    <id>1235</id>
    <fname>Ryan</fname>
    <lname>Grimm</lname>
    <age null="true"></age>
  </sql:tuple>
</sql:result>
Note

         Above SQL query returns record which has the id = "1235".

Performance

In simple testing on a laptop, the MLSQL servlet was able to push about 1,000 simple tuples per second toward the client. We anticipate that's fast enough for most SQL queries you will write from XQuery. Just don't try to pull a whole table into the XQuery environment and process it with XPath. That's why SQL has WHERE clauses.

No comments: