Friday, November 9, 2007

ETL: CamptoCamp, FDO and Open Source - What about OLAP?

CamptoCamp & Talend


CamptoCamp is introducing a Spatial ETL tool (anticipated, but not yet released) that works in conjunction with Talend's Open Source ETL product, Open Studio.


I'll begin working with the product once it is released. In the meantime, CamptoCamp was in Victoria presenting their solution at FOSS4G2007.


More information on their presentation can be found here.


Open Source ETL - Without Spatial


For Open Source ETL without the Spatial component, there are various solutions, Talend being one of them.


You can take a look at:


1) Pentaho - URL: http://www.pentaho.com/

2) Clover - URL: http://www.cloveretl.org/

3) KETL - URL: http://www.ketl.org/


Is there another Open Source Spatial ETL Tool?

We do have another option on the Spatial side, now that Autodesk is working in the Open Source community and has released FDO (Feature Data Objects), which is similar to FME, but is not FME.


FDO is a Data Access Technology that was developed to manipulate, define and analyze geospatial data regardless of where it is stored.

FDO was originally developed and included in the Autodesk Map 3D 2005 product during the spring of 2004. In this initial implementation, it was capable of working with the following geospatial data sources:

  • Oracle
  • SDF

The following version introduced support for ArcSDE.

The third version supported more sources, adding providers for MySQL, SQL Server, ODBC, SHP, Raster, OGC WFS, and OGC WMS.

It was then that they decided to take FDO Open Source, but without releasing the Oracle provider. Being Open Source, though, there are always some ingenious minds out there ready to come up with a solution.

Quoting the OSGeo FDO History site:

"The release of FDO as open source coincided with the release of MapGuide as open source in 2006. It included the SDF, SHP, MySQL, ArcSDE, ODBC, OGC WFS, and OGC WMS providers. "

FDO was now out in the Open Source world, right on schedule with the release of MapGuide Open Source.

But what about Oracle and FDO?


Much of the work on the Oracle side has been done by SL-King in Slovenia.

King.Oracle is an Open Source FDO provider for Oracle.

Through this Open Source product, SL-King is providing a tool that supports Oracle Locator and Oracle Spatial. It is designed specifically for Oracle, and Oracle alone, and is being built so that it will support the full Oracle Spatial functionality.


Currently the latest version (Version 0.7.3) provides:

  • Support for Oracle 10g, Oracle XE and Oracle 9i
  • Optimized for Oracle
  • Using plain Oracle tables and views
  • Can be used inside AutoCAD Map 3D to edit and query Oracle data

For a Flash Movie on this, please look here.

Can we convert between different FDO sources?

Yes, of course! This is Open Source.

Again from SL-King, we have another ingenious tool, called FDO2FDO.

FDO2FDO is an Open Source FDO client application which uses the above-mentioned Open Source FDO library to manipulate, create, and define geospatial data.

Currently, the software is capable of the following:


  • Copying data from SHP files to SDF
  • Copying data from SHP to Oracle
  • Copying data from Oracle to SDF...

In the end, FDO2FDO allows the user to copy and modify data from any FDO Data Store to any other FDO Data Store.
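To make the "any store to any store" idea a little more concrete, here is a minimal Python sketch of the general pattern. To be clear, this is not the FDO2FDO API or the FDO library: `InMemoryStore` and `copy_features` are hypothetical names made up for illustration, and the stores are simple in-memory lists standing in for sources such as SHP, SDF or Oracle. The point is only that once every store exposes the same read and write operations, a single copy routine can serve every pair of formats.

```python
class InMemoryStore:
    """Hypothetical stand-in for a data store (e.g. SHP, SDF or Oracle)."""

    def __init__(self, name, features=None):
        self.name = name
        self.features = list(features or [])

    def read(self):
        # Yield features one at a time, as a real reader typically would.
        yield from self.features

    def write(self, feature):
        self.features.append(feature)


def copy_features(source, target, transform=None):
    """Copy every feature from source to target, optionally modifying each one.

    Returns the number of features copied.
    """
    count = 0
    for feature in source.read():
        # Work on a copy so the source store is never modified.
        copied = dict(feature)
        if transform is not None:
            copied = transform(copied)
        target.write(copied)
        count += 1
    return count


# Example data: two made-up parcel features.
shp = InMemoryStore("parcels.shp", [
    {"id": 1, "name": "lot a"},
    {"id": 2, "name": "lot b"},
])
sdf = InMemoryStore("parcels.sdf")

# Copy from one store to the other, upper-casing the name along the way.
copied = copy_features(shp, sdf, transform=lambda f: {**f, "name": f["name"].upper()})
print(copied, sdf.features)
```

Because the copy routine only depends on `read` and `write`, adding a new format in this sketch would mean adding one new store class rather than one new converter per pair of formats.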


FDO2FDO has three main parts:

  1. Fdo2Fdo Api library
  2. F2Fcmd Command line utility
  3. Fdo2Fdo GUI

An introduction to FDO2FDO can be found here.

There are always solutions out there.


Can Geospatial move towards ETL and Data Warehousing?


Yes.

Currently, in order to do web-mapping, you require the following:


  • database
  • web server
  • data

That is all. You can have a web map up and running with MapServer or MapGuide quite easily, in less than a day, depending on how complicated and stylish you want to get.

But look deeper and what is it we are after? The data.

Where is the data stored? In a database.

So why don't we work on bringing OLAP into the Internet mapping and GIS worlds?

Aggregations of data can occur anywhere.

You can group data by postal/post/zip codes, populations, etc. - this is prime data for OLAP.

Take industries such as oil and gas, where well production is recorded on an hourly basis. This can be summed and aggregated into data marts (OLAP) for daily, monthly and yearly totals.
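As a rough illustration of that kind of roll-up, here is a small Python sketch that sums hourly readings per well into daily, monthly and yearly totals. The well names, timestamps and volumes are made-up sample data, and `roll_up` is a hypothetical helper written for this post. A real data mart would do this in the database or an OLAP engine, so treat it as a sketch of the aggregation itself rather than of any particular product.

```python
from collections import defaultdict
from datetime import datetime

# Made-up hourly readings: (well id, timestamp, volume produced in that hour).
hourly_readings = [
    ("WELL-A", datetime(2007, 11, 8, 22), 10.0),
    ("WELL-A", datetime(2007, 11, 8, 23), 12.0),
    ("WELL-A", datetime(2007, 11, 9, 0), 11.0),
    ("WELL-B", datetime(2007, 11, 9, 0), 5.0),
    ("WELL-B", datetime(2007, 12, 1, 6), 7.0),
]

# Each time grain is identified by how the timestamp is formatted into a key.
GRAIN_FORMATS = {"day": "%Y-%m-%d", "month": "%Y-%m", "year": "%Y"}


def roll_up(readings, grain):
    """Sum volume per (well, period) at the given grain: 'day', 'month' or 'year'."""
    fmt = GRAIN_FORMATS[grain]
    totals = defaultdict(float)
    for well, timestamp, volume in readings:
        totals[(well, timestamp.strftime(fmt))] += volume
    return dict(totals)


daily = roll_up(hourly_readings, "day")
monthly = roll_up(hourly_readings, "month")
yearly = roll_up(hourly_readings, "year")

for key, total in sorted(monthly.items()):
    print(key, total)
```

The same grouping works for any other dimension, such as a postal code in place of the well id, which is what makes this kind of data a natural fit for OLAP.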

The maps are a starting point, but there should be no disconnect between the data, the databases, GIS and internet mapping, as we are ultimately just working with data and transforming it into more usable and valuable information.

By bringing OLAP into GIS and Internet mapping, you can add more value to your clients' data, and this data can be fed into other applications for reporting, etc.

The internet map acts as a portal to a whole other world of information.

Just a few thoughts, as I'm involved in both worlds presently.

What do you think?

Feel free to write and let me know.