Universal Geocoder Server Reference Guide

Geocoding is a procedure applied to a number of address records or site locations to generate, for each location, a pair of X and Y coordinates. For more information on geocoder terminology (types of geocoding, tolerances…), refer to the Universal Geocoder user documentation.

This document describes the Geoconcept Universal Geocoder component.

The description of the component breaks down into three separate sections:

  • UGC JEE: a component designed to integrate the Universal Geocoder (UGC) geocoding engine into the Java Enterprise Edition platform and its subsets, such as the Tomcat servlet engine. By deploying an optional module, the product can also be used to publish a geocoding web service.
  • UGC Command Line
  • UGC .NET
Note

This documentation does not cover the standalone Geoconcept Universal Geocoder product.

JEE Universal Geocoder

This component is designed to integrate the Universal Geocoder (UGC) geocoding engine into the Java Enterprise Edition platform and its subsets, such as the Tomcat servlet engine. By deploying an optional module, the product can also be used to publish a geocoding web service.

From a functional point of view, geocoding is a procedure applied to a series of address records to obtain their geographic coordinates. For more information about geocoder terminology (types of geocoding, tolerances…), refer to the Universal Geocoder user manual. This manual describes in detail how to deploy Universal Geocoder specifically for Java Enterprise Edition (ugc-jee).

Basic principles

JEE Integration

ugc-jee is a JEE platform integration component that provides a geocoding service to JEE modules deployed on an application server (in the broadest sense, including servlet engines such as Tomcat).

Figure: JEE Integration

The consumer module can be of any type (webapp, EJB, etc.). Similarly, inside the JEE module the consumer can be of any type (servlet, JSP, POJO, etc.).

Access mode

The application using the geocoder references the provider via a logical name in the JNDI (Java Naming and Directory Interface) directory of the application server. The method for setting up the provider is described later on.

By convention, the logical name used is “geoconcept/ugc/default” in the “java:comp/env” context, but it is also possible to use another name, another context, or to set up several providers of different types.

For simplicity, we start by describing the most common scenario: a single primary provider with the default name.

Figure: Access mode

Component structure

Adapting the JEE resource and external components

ugc-jee breaks down into several parts. For reasons of performance and reusability, the engine (also called the kernel) is written in C++ and is therefore delivered in the form of native libraries. The integration part (the resource adaptor) provides the link with the engine and handles its deployment by loading its native libraries. In terms of file deployment, this involves two distinct trees:

Figure: Component structure

At execution time, the native libraries are loaded at start-up into the memory of the application server instance's process (or, where appropriate, of its partition, depending on the model). The resource adaptor exposes a Java interface and drives the in-memory engine instance directly via JNI.

Figure: Component structure

Another type of provider

The resource adaptor described previously corresponds to the LocalDll type. It is the primary provider, namely the one that is linked to the geocoding engine and, in practice, handles the processing.

In some situations (for various reasons: separate front end, sharing, architecture, isolation, etc.) it is desirable that the module consuming the service is not deployed in the same application server instance as the primary ugc provider (for example, each one being on a different server machine: remoting).

In this case there are several possibilities:

  • either a dedicated publication module is created, with access to the ugc provider (in this case, the protocol used and the procedures are specific to the module created),
  • or the consumer is implemented as a client of a module supplied with ugc-jee, for example a web service of the ugc-ws module,
  • or a module provided by ugc-jee is deployed on the primary provider’s machine and a client resource adaptor for this module is set up on the consumer machine.

This last scenario is transparent for the consumer application: whether the provider is local or remote, the access code remains the same; it is simply a question of deployment and configuration (it’s a bit like the Multiple business delegate pattern).

The transport protocol used depends on the publication module / client pair chosen: this could be web service (HTTP/SOAP) or RMI.
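The transparency between local and remote providers can be pictured with a small sketch. All names below (GeocodingProvider, LocalProvider, RemoteProvider, ConsumerService) are hypothetical stand-ins and not ugc-jee classes; in the product itself, the choice is made through deployment and JNDI configuration rather than in code, and the transport is simulated here by a plain function.

```java
import java.util.function.Function;

// Hypothetical stand-in for the provider contract seen by the consumer.
interface GeocodingProvider {
    String geocode(String addressLine);
}

// Stand-in for a provider living in the same server instance (LocalDll type).
final class LocalProvider implements GeocodingProvider {
    public String geocode(String addressLine) {
        return "local:" + addressLine;
    }
}

// Stand-in for a client-side provider that forwards calls over a transport
// (web service or RMI in the text); the transport is simulated by a function.
final class RemoteProvider implements GeocodingProvider {
    private final Function<String, String> transport;

    RemoteProvider(Function<String, String> transport) {
        this.transport = transport;
    }

    public String geocode(String addressLine) {
        return transport.apply(addressLine);
    }
}

// The consumer depends only on the interface, so its code is identical
// whether the provider is local or remote.
final class ConsumerService {
    private final GeocodingProvider provider;

    ConsumerService(GeocodingProvider provider) {
        this.provider = provider;
    }

    String locate(String addressLine) {
        return provider.geocode(addressLine);
    }
}
```

Swapping new LocalProvider() for a RemoteProvider when building the ConsumerService changes nothing in the consumer code, which is the property the text describes.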

Figure: Another type of provider

In every scenario where ugc-jee is deployed in installed mode (that is, non-SaaS mode), there is a provider instance of the LocalDll type. Moreover, since this provider is local (no transfer, no serialisation, etc.), it is the solution offering the best performance (latency, bandwidth).

Add-on modules

A certain number of add-on modules are supplied with ugc-jee. None are essential to the correct operation of the resource adaptor, and their deployment is optional.

ugc-admin

ugc-admin is a webapp that checks the product has been correctly installed, handles the configuration and tests the application to ensure it is functioning correctly.

ugc-ws

ugc-ws is a webapp that publishes a geocoding web service. It is based on spring-ws (it publishes in document / literal encoding format).

ugc-axis-fusion

ugc-axis-fusion is a webapp that publishes a predeployed geocoding web service on Axis 1.4 (it publishes in rpc / soapenc encoding format).

ugc-remote

ugc-remote is a webapp for publishing the geocoding service via RMI. It is notably used for remoting in conjunction with the RmiClient type provider.

Administration / Configuration

Configuring the application deployment

A repository made up of a series of files with .ugc.xxi file extensions constitutes a data source (datasource).

The service.xml file located in %ugc%/conf defines the general configuration of the geocoding service provider. It notably contains a default configuration for datasources (default-datasource). You can also define a specific configuration to be assigned to a datasource.

To define a specific configuration for a datasource, it is advisable to use the ugc-admin webapp (see the section on DataSourcesConfiguration, and then configuration > create). It is also possible to edit the service.xml file directly.

If no specific configuration has been defined, then the default configuration is applied on deployment of this datasource.

In addition, certain parameters can be set at call time (via FindAddressOptions). In this case, the effective value of a parameter (for example, discrepancy) is taken from the first of the following sources that defines it:

  • the call, if the parameter is assigned in FindAddressOptions (this assignment is optional),
  • the specific datasource configuration (if a specific datasource configuration has been defined),
  • the default datasource configuration (default-datasource).
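The order of precedence above can be sketched in a few lines of Java. The ParameterResolution class and its resolve method are hypothetical helpers written for illustration; they are not part of ugc-jee, which performs this resolution internally.

```java
import java.util.Optional;

// Hypothetical helper illustrating the precedence order; not a ugc-jee class.
final class ParameterResolution {
    private ParameterResolution() {}

    // Returns the first value that is defined, in order:
    // the call, then the specific datasource configuration, then the default.
    static double resolve(Optional<Double> fromCall,
                          Optional<Double> fromDatasourceConfig,
                          double fromDefaultDatasource) {
        if (fromCall.isPresent()) {
            return fromCall.get();
        }
        if (fromDatasourceConfig.isPresent()) {
            return fromDatasourceConfig.get();
        }
        return fromDefaultDatasource;
    }
}
```

For example, if the call leaves discrepancy unassigned and the specific datasource configuration sets it to 3.0, the value used is 3.0, whatever the default-datasource value is.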

Detailed configuration of a datasource

The deployment of a referential can be configured in great detail.

This detailed configuration addresses particular needs and is not essential, since the default configuration is suitable in the majority of cases.

Some parameters can be set at call time; others are fixed when the datasource instance is created and cannot be modified afterwards.

The parameters that can be set at call time are described in the FindAddressOptions section.

Datasource identification

This section enables identification of a datasource in the administration.

File: the geocoding referential used. It is located in %geoconcept%/ugc/reftables or in one of its sub-directories.

Name: an alias for the name of the datasource (optional)

Publication information

This section allows you to add information about the datasource.

Title: a title for the datasource.

Abstract: contains a short description of the datasource.

Online resource: HTTP link giving more information on the datasource.

Reference table settings

Version: version of the referential.

Zone meaning: meaning of the "zone" attribute for the address. Example: Post code.

UniqueID meaning: meaning of the uniqueId search identifier.

SecondaryZone meaning: meaning of the secondaryZone attribute for the address. For example, IRIS code.

StreetSectionId meaning: meaning of the streetSectionId attribute for the address. For example, PID navteq.

Coordinate system: identifier for the reference coordinates system for the publication (Cf. OpenGIS standard).

Country: the country concerned.

Bounds: rectangular footprint for the data in the referential

Run time settings

Cache: activation or not of the cache. This cache enables towns in the referential to be kept in memory.

Max Cache Mem Size: the size of the memory reserved for the cache, in kilobytes. Only used if the cache is active.

Min processors: the initial number of geocoding instances.

Max processors: the maximum number of geocoding instances.

Finder general settings

City scoring method: calculation method to use for the town score:

  • Standard: faster, but less precise. Recommended for batch use.
  • Levenshtein: slower, but more precise. Recommended for interactive use, where addresses are entered one at a time.

Street scoring method: calculation method to use for the street score:

  • Standard: faster, but less precise. Not recommended.
  • Levenshtein: slower, but more precise. Recommended both for interactive use, where addresses are entered one at a time, and for batch use.
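The Levenshtein method takes its name from the Levenshtein edit distance between two strings. The sketch below shows the classic dynamic-programming computation of that distance, together with one common way of normalising it into a similarity between 0 and 1. The class name and the normalisation are illustrative assumptions: the formula actually used by the geocoding engine is not described in this guide.

```java
final class LevenshteinSketch {
    private LevenshteinSketch() {}

    // Classic edit distance (insertions, deletions, substitutions),
    // computed with two rolling rows of the dynamic-programming table.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Assumed normalisation: 1.0 for identical strings,
    // decreasing as the edit distance grows relative to the longer string.
    static double similarity(String a, String b) {
        int max = Math.max(a.length(), b.length());
        return max == 0 ? 1.0 : 1.0 - (double) distance(a, b) / max;
    }
}
```

The cost of this computation grows with the product of the two string lengths, which is consistent with the method being described as slower than the Standard one.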

Min streets: minimum number of streets for a town to be considered as covered. Used for the tolerance at town level (FindAddressResults.TOLERATE_TYPE_CITY).

Search strategy:

For more detail about search strategies, refer to the Universal Geocoder user guide.

Finder request defaults settings

Max candidates: maximum number of results returned by a geocoding operation. See FindAddressOptions.candidateCount.

Score threshold: the score above which the geocoding results are retained. See FindAddressOptions.scoreThreshold.

Score threshold: threshold for the suggestion of "street" type candidates. See FindAddressOptions.scoreThresholdToTolerateStreetGeocodingType.

Find type: type of geocoding desired. See FindAddressOptions.geocodingType.

Tolerate geocoding type: geocoding tolerance required. See FindAddressOptions.tolerateGeocodingType.

Max meter error: maximum positioning error (in metres) for a geocoding at street level to be considered as a geocoding on street number. See FindAddressOptions.maxMeterError.

Discrepancy: lateral offset. See FindAddressOptions.discrepancy.

Discrepancy along street: longitudinal offset. See FindAddressOptions.discrepancyAlongStreet.

Favor city match element: steers the search for the town towards one descriptive element of the town. See FindAddressOptions.favorCityMatchElement.

Zone match digits: take into account the first n characters of the zone attribute of the address. See FindAddressOptions.zoneMatchDigits.
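As an illustration of the Zone match digits setting, the sketch below compares two zone codes on their first n characters only. ZoneMatchSketch is a hypothetical helper, and plain prefix comparison is an assumption about how the engine matches zones; it is shown only to make the idea concrete.

```java
final class ZoneMatchSketch {
    private ZoneMatchSketch() {}

    // Compares only the first `digits` characters of the two zone codes.
    // A code shorter than `digits` is taken in full. Assumes digits >= 0.
    static boolean zonesMatch(String referenceZone, String inputZone, int digits) {
        String ref = referenceZone.substring(0, Math.min(digits, referenceZone.length()));
        String in = inputZone.substring(0, Math.min(digits, inputZone.length()));
        return ref.equals(in);
    }
}
```

With two matching characters, the post codes 75013 and 75113 are treated as the same zone; with five, they are not.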

Tests / Troubleshooting for AXIS 1.4

Check that the resource adaptor has loaded correctly when Tomcat is started. You can use the webapp ugc-admin to validate the installation.

When using a Web Service, check that the AddressFinder service is present in axis:

Figure: Tests / Troubleshooting for AXIS 1.4

Java interface

When the provider is used directly inside the Java platform (via plain Java objects, or POJOs), the call sequence takes the following form:

  • obtaining a connection to the provider via JNDI
  • calling the findAddress function

The findAddress function has the following simplified form:

FindAddressResults findAddress(Address, FindAddressOptions);

This means in effect that the call takes as input an address and some options, and returns as output a certain number of candidates.

The precise breakdown is given below.

Package com.geoconcept.ugc.service

Class CodingProvider

getConnection method:

This method opens a connection to the geocoding provider. The connection must subsequently be closed using its close method.

Prototype:

Connection getConnection() throws ResourceException;

Value returned:

Connection to the geocoding service provider.

Class Connection

findAddress method:

This method enables an address to be geocoded with geocoding options if required.

Prototype:

FindAddressResults findAddress(Address address, FindAddressOptions options)
        throws InvalidDataSourceException, InternalErrorException;

Parameters:

address: address to geocode.

options: geocoding options.

Value returned:

Geocoding result list.

Package com.geoconcept.common.geo

Class Location

This class contains the coordinates found during a geocoding operation.

Members

x (double): X coordinate.

y (double): Y coordinate.

coordinateSystem (String): coordinate system identifier. This is the "SRS" ("Spatial Reference System") identifier used by the Web Map Service. See the OpenGIS standard.

Package com.geoconcept.ugc

Class Address

This class contains the description of an address.

Members

addressLine (String): concatenation of the street number, the street type and the street name. For example, 25 rue de Tolbiac.

zone (String): town code (for example, the post code 75013).

cityName (String): name of the town (for example, Paris).

uniqueId (String): unique code (town, zone) to search for (for example, 75113000).

secondaryZone (String): secondary code for the address (for example, the INSEE code, the IRIS code…).

streetSectionId (String): road section identifier (not yet utilised).

Class FindAddressOptions

This class contains the configuration for a geocoding operation.

All elements except dataSourceName are optional. If they are not assigned, they take their default value, which is optimal in the majority of cases.

Members

dataSourceName (String): name of the data source, as defined in the administration, to use for the geocoding.

candidateCount (short): maximum number of results to return when geocoding.

findType (String): the type of geocoding required. There are three types of geocoding: on towns (FindAddressResults.FIND_MATCH_TYPE_CITY), on streets (FindAddressResults.FIND_MATCH_TYPE_STREET) and on street numbers (FindAddressResults.FIND_MATCH_TYPE_STREET_NUMBER).

tolerateFindType (String): the cumulative value of the geocoding tolerances accepted. There are three tolerance levels:
- on the town (FindAddressResults.TOLERATE_TYPE_CITY), when the street could not be found and the town is not covered (the number of streets for this town in the reference table is insignificant),
- on the street (FindAddressResults.TOLERATE_TYPE_STREET), when the number requested could not be found,
- on the estimated number (FindAddressResults.TOLERATE_TYPE_STREET_ENHANCED), when the number could not be found and the positioning error does not exceed the value of maxMeterError.
The tolerances are flags that can be combined. For example, tolerating all three categories corresponds to a value of 7 (1 + 2 + 4), while a tolerance at town level only corresponds to a value of 1. The possible values are therefore 1 (town), 2 (street), 3 (town and street), 4 (estimated number), 5 (town and estimated number), 6 (street and estimated number) and 7 (all three).

maxMeterError (long): maximum positioning error (in metres) for a geocoding at street level to be considered as a geocoding on street number. The type of geocoding will therefore depend on the length of the street.

discrepancy (double): orthogonal offset (in metres) to apply relative to the street found. This avoids positioning the geocoded address in the middle of the street.

discrepancyAlongStreet (double): longitudinal offset (in metres) to apply. This avoids positioning the geocoded address on a crossroads.

favorCityMatchElement (long): improves the search for a town by specifying which element of the town (the name of the town, the town code, or the unique code for the town) is known to be reliable. The value to assign therefore depends on the address to geocode. Three values are possible: the name of the town (FindAddressResults.FAVOR_CITY_NAME), the code for the town (FindAddressResults.FAVOR_ZONE), the unique code for the town (FindAddressResults.FAVOR_UNIQUE_ID).

zoneMatchDigits (long): sets the number of characters to use when matching the town code stored in the reference table against the one passed as a parameter for geocoding. A value can only be specified if the favorCityMatchElement member is different from FindAddressResults.FAVOR_ZONE, and you need to use the whole of the post code.

scoreThreshold (int): minimum score that a candidate proposed by the geocoder must reach to be retained.

scoreThresholdToTolerateStreetGeocodingType (int): minimum score required for candidates of the "street" type to be returned. If no street reaches this threshold, the town is returned as a candidate.

coordinateSystem (String): reference system for the coordinates to return.
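The effect of the discrepancy and discrepancyAlongStreet members can be pictured geometrically. The sketch below is a hypothetical illustration, not the engine's code: it assumes a projected coordinate system expressed in metres, a straight street segment, and a positive lateral offset applied to the left of the segment direction (the side actually chosen by the engine is not described in this guide).

```java
final class OffsetSketch {
    private OffsetSketch() {}

    // Moves the point (px, py), assumed to lie on the segment (x1, y1)-(x2, y2),
    // by `along` metres in the segment direction and by `lateral` metres
    // perpendicular to it (positive = left of the direction, an assumption).
    static double[] applyOffsets(double x1, double y1, double x2, double y2,
                                 double px, double py,
                                 double lateral, double along) {
        double dx = x2 - x1;
        double dy = y2 - y1;
        double length = Math.hypot(dx, dy);
        if (length == 0.0) {
            // Degenerate segment: no direction, so the point is left unchanged.
            return new double[] { px, py };
        }
        double ux = dx / length; // unit vector along the street
        double uy = dy / length;
        double nx = -uy;         // unit normal, rotated 90 degrees counter-clockwise
        double ny = ux;
        return new double[] {
            px + along * ux + lateral * nx,
            py + along * uy + lateral * ny
        };
    }
}
```

For a street running from (0, 0) to (10, 0), a point at (5, 0) with a lateral offset of 2 and a longitudinal offset of 1 ends up at (6, 2): off the centre line and shifted along the street.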

Class FindAddressResults

Class that contains the results of a geocoding operation, classified by score.

Members

results (FindAddressResult[]): table of candidates found.

matchType (int): type of geocoding applied.

Constants

FIND_MATCH_TYPE_CITY (int) = 1: type of geocoding requested: the address must be geocoded on the town.

FIND_MATCH_TYPE_STREET (int) = 2: type of geocoding requested: the address must be geocoded on the street.

FIND_MATCH_TYPE_STREET_NUMBER (int) = 3: type of geocoding requested: the address must be geocoded on the street number.

FOUND_MATCH_TYPE_CITY (int) = 1: type of geocoding result: the address has been geocoded on the town.

FOUND_MATCH_TYPE_STREET (int) = 2: type of geocoding result: the address has been geocoded on the street.

FOUND_MATCH_TYPE_STREET_ENHANCED (int) = 3: type of geocoding result: the address has been geocoded on the approximate street number.

FOUND_MATCH_TYPE_STREET_NUMBER (int) = 4: type of geocoding result: the address has been geocoded on the exact street number.

TOLERATE_TYPE_CITY (int) = 1: tolerance of the geocoding on the town.

TOLERATE_TYPE_STREET (int) = 2: tolerance of the geocoding on the street.

TOLERATE_TYPE_STREET_ENHANCED (int) = 4: tolerance of the geocoding on the estimated street number.

FAVOR_CITY_NAME (int) = 1: steers the search for the town to be geocoded by providing the name of the town.

FAVOR_ZONE (int) = 2: steers the search for the town to be geocoded by providing the code for the town.

FAVOR_UNIQUE_ID (int) = 3: steers the search for the town to be geocoded by providing the unique code of the town.
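Because the three TOLERATE_TYPE_* values are 1, 2 and 4, they behave as bit flags that can be combined. The sketch below illustrates this with a hypothetical ToleranceFlags class that simply copies the values from the constants above; how the combined value is then passed to FindAddressOptions is not shown here.

```java
final class ToleranceFlags {
    private ToleranceFlags() {}

    // Values copied from the FindAddressResults constants listed above.
    static final int TOLERATE_TYPE_CITY = 1;
    static final int TOLERATE_TYPE_STREET = 2;
    static final int TOLERATE_TYPE_STREET_ENHANCED = 4;

    // Combines any number of tolerance flags with a bitwise OR.
    static int combine(int... flags) {
        int mask = 0;
        for (int flag : flags) {
            mask |= flag;
        }
        return mask;
    }

    // Tests whether a given tolerance is set in a combined value.
    static boolean tolerates(int mask, int flag) {
        return (mask & flag) == flag;
    }
}
```

Combining the three flags gives 7, and a value of 1 tolerates the town level only.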

Class FindAddressResult

Class that contains the result of a geocoding operation.

Members

address (Address): address found.

location (Location): coordinates found.

score (double): resemblance score between the address to be geocoded and the address found. Varies between 0 (no resemblance) and 1 (the exact address).

type (int): type of geocoding found. The possible values are:
- FindAddressResults.FOUND_MATCH_TYPE_CITY
- FindAddressResults.FOUND_MATCH_TYPE_STREET
- FindAddressResults.FOUND_MATCH_TYPE_STREET_ENHANCED
- FindAddressResults.FOUND_MATCH_TYPE_STREET_NUMBER
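A consumer will often want to keep only the candidates that are precise enough. The sketch below shows one possible way to do this using the score and type members. Candidate is a hypothetical stand-in for FindAddressResult (so that the example compiles without the ugc-jee libraries), and the filter relies on the observation that the FOUND_MATCH_TYPE_* values listed earlier grow with precision, from 1 (town) to 4 (exact street number).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for FindAddressResult, reduced to the two members used here.
final class Candidate {
    final double score; // 0 (no resemblance) to 1 (exact address)
    final int type;     // one of the FOUND_MATCH_TYPE_* values

    Candidate(double score, int type) {
        this.score = score;
        this.type = type;
    }
}

final class ResultFilterSketch {
    private ResultFilterSketch() {}

    // Value copied from the FindAddressResults constants.
    static final int FOUND_MATCH_TYPE_STREET = 2;

    // Keeps the candidates whose match type is at least minType
    // and whose score is at least minScore, preserving their order.
    static List<Candidate> keep(List<Candidate> candidates, int minType, double minScore) {
        List<Candidate> kept = new ArrayList<>();
        for (Candidate c : candidates) {
            if (c.type >= minType && c.score >= minScore) {
                kept.add(c);
            }
        }
        return kept;
    }
}
```

With real results, the same loop would run over findAddressResults.results and read the type and score members of each FindAddressResult.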

Utilisation

Guiding principles

A geocoding operation consists of the following steps:

  • Connection to the geocoding service provider,
  • Construction of the address to geocode,
  • Construction of the geocoding options,
  • Execution of the geocoding,
  • Exploitation of the result,
  • Disconnection from the geocoding service provider.

Example

Connection to the geocoding service provider

import javax.naming.Context;
import javax.naming.InitialContext;

import com.geoconcept.ugc.service.CodingProvider;
import com.geoconcept.ugc.service.Connection;

/*
 * Open a connection to the geocoding server
 */
public Connection getConnection() throws Exception
{
        // get the JNDI environment context
        Context initCtx = new InitialContext();
        Context envCtx = (Context) initCtx.lookup("java:comp/env");

        // retrieve the geocoding provider from its logical name
        CodingProvider codingProvider = (CodingProvider) envCtx.lookup("geoconcept/ugc/default");

        // any failure is propagated to the caller instead of returning null
        return codingProvider.getConnection();
}

Construction of the address to geocode

import com.geoconcept.ugc.Address;

/*
 * Construct an address to geocode
 * @param addressLine street number + street type + street name. Sample: "25 rue de Tolbiac"
 * @param zone address zone. Sample: "75013"
 * @param cityName city name. Sample: "Paris"
 * @param uniqueId city unique identifier. Sample: "75113000"
 */
public Address getAddressToGeocode(String addressLine, String zone, String cityName, String uniqueId)
throws Exception
{
        Address address = new Address();
        address.addressLine = addressLine;
        address.zone = zone;
        address.cityName = cityName;
        address.uniqueId = uniqueId;
        return address;
}

Construction of the geocoding options

import com.geoconcept.ugc.FindAddressOptions;

/*
 * Build the geocoding options for a data source defined in administration
 * @param datasource Name of a data source
 */
public FindAddressOptions getOptions(String datasource) throws Exception
{
        // use the data source name received as a parameter
        FindAddressOptions options = new FindAddressOptions(datasource);
        return options;
}

Execution of the geocoding

import com.geoconcept.ugc.service.Connection;
import com.geoconcept.ugc.Address;
import com.geoconcept.ugc.FindAddressOptions;
import com.geoconcept.ugc.FindAddressResults;

/*
*Launch geocode process and retrieves results
*@param  connection Connection to the geocode server
*@param  address Address to geocode
*@param   options Geocoding options
*/
public FindAddressResults  findGeocode(Connection connection, Address address , FindAddressOptions options) throws Exception
{
        FindAddressResults findAddressResults  = connection.findAddress(address, options);
        return  findAddressResults;
}

Exploitation of the result

import com.geoconcept.ugc.FindAddressResults;
import com.geoconcept.ugc.FindAddressResult;

/*
 * Browse the geocoding results and display them
 * @param findAddressResults Results of the geocoding process
 */
public void displayResult(FindAddressResults findAddressResults) throws Exception
{
        // if at least one result was found
        if (findAddressResults.results.length > 0)
        {
                // Display the column names
                System.out.println("Found results : ");
                System.out.println("N°" + "\t"
                                + "Geocoding Type" + "\t"
                                + "Score" + "\t"
                                + "Address line" + "\t"
                                + "Zone" + "\t"
                                + "City Name" + "\t"
                                + "City Unique Identifier" + "\t"
                                + "Address Secondary Zone" + "\t"
                                + "Coordinates (Coordinates System)");

                // For each result found
                for (int i = 0; i < findAddressResults.results.length; i++)
                {
                        // get the next result
                        FindAddressResult findAddressResult = findAddressResults.results[i];

                        // fall back to a placeholder when no coordinate system is set
                        String coordinateSystem = "Unknown";
                        if (findAddressResult.location.coordinateSystem != null)
                        {
                                coordinateSystem = findAddressResult.location.coordinateSystem;
                        }

                        // Display the result; type and score are members of the result itself
                        System.out.println(i + "\t"
                                        + findAddressResult.type + "\t"
                                        + findAddressResult.score + "\t"
                                        + findAddressResult.address.addressLine + "\t"
                                        + findAddressResult.address.zone + "\t"
                                        + findAddressResult.address.cityName + "\t"
                                        + findAddressResult.address.uniqueId + "\t"
                                        + findAddressResult.address.secondaryZone + "\t"
                                        + findAddressResult.location.x + "," + findAddressResult.location.y
                                        + " (" + coordinateSystem + ")");
                }
        }
        else
        {
                System.out.println("No result found.");
        }
}

Disconnection from the geocoding service provider

import com.geoconcept.ugc.service.Connection;

/*
 * Disconnect from the geocoding server
 * @param connection Connection to the geocoding server
 */
public void closeConnection(Connection connection) throws Exception
{
         connection.close();
}

Full example

import com.geoconcept.ugc.service.Connection;
import com.geoconcept.ugc.Address;
import com.geoconcept.ugc.FindAddressOptions;
import com.geoconcept.ugc.FindAddressResults;

public void geocodingSample()
{
        Connection connection = null;
        try
        {
                // Open connection
                connection = getConnection();
                // Construct the address to geocode
                Address address = getAddressToGeocode("25 rue de Tolbiac", "75013", "Paris", "");
                // retrieve the geocoding options
                FindAddressOptions options = getOptions("myDataSource");
                // launch the geocoding process
                FindAddressResults findAddressResults = findGeocode(connection, address, options);
                // print the geocoding result
                displayResult(findAddressResults);
        }
        catch (Exception e)
        {
                // geocoding problem
                e.printStackTrace();
        }
        finally
        {
                // disconnection, even if the geocoding failed
                if (connection != null)
                {
                        try { closeConnection(connection); }
                        catch (Exception e) { e.printStackTrace(); }
                }
        }
}