Blog, Trixi

Multitenancy using Spring and PostgreSQL

Let’s talk about multitenancy in Java. There is indeed a lot of possible requirements and even more solutions. When you try googling word “multitenancy” you can find mostly articles meditating on general aspects of the subject. I decided to talk about multitenancy from more practical point of view.

Requirements

My requirements are:

  • One tenant corresponds to one paying customer (e.g. a company).
  • Need a web application that is able to serve multiple tenants. This application communicates through https and may utilize http sessions on the server side.
  • Every customer has unlimited number of users (e.g. employees of the company).
  • Every customer needs to be supplied any number of application databases he needs. On the client side, every user may switch between these databases as he wants, but he may use only one database at a time.
  • Every customer/tenant has exactly one shared database. This database is shared by all application databases. It contains list of application databases, list of application users and some more customer-wide data.
  • All tenants/customers are listed in a special root database. There may also be stored any other content that needs to be shared between tenants.
  • The multitenancy feature must be quite invisible for a common business logic. It must not require any changes in a business code, DAOs and SQL queries.
  • The application URL is tenant-specific. Users belonging to tenant42 must access the application through the URL that looks like https://<myapp>/tenant42/<request>. Hence on the server side it is always possible to determine the current tenant we are working for.

The easiest way to understand the requirements is to look at the picture.

Solution – database

Now I need to map the mentioned requirements into a concrete database server. I have chosen PostgreSQL which is both free and usable.

After some thinking, browsing and testing the final solution emerged:

  • Every tenant has its own dedicated pgsql database.
  • Application databases are represented by pgsql schemas named app-schema1, app-schema2, …
  • Shared database is represented by another pgsql schema named shared-schema

The solution brings these apparent implications:

  • Each tenant’s data are completely separated from the other tenants.
  • A foreign key may be created between application database and shared database since they are located in the same pgsql database (it is possible to create FK from one schema to another).
  • It is necessary to specify desired pgsql schema in all queries to the database. The server must know which table in which schema it is working with. It may be solved by setting an appropriate search_path before query execution.

Solution – datasources and connection pools

The search_path is always set per database connection. If I’d decided to share connections between tenants, I’d really have to switch search_path before every single query.

But that’s not what I want. Instead, I decided to implement a solution where every schema from every tenant has it’s own dedicated set of database connections. I implemented a “tenant and schema aware” datasource with a smart getConnection() method. It maintains a connection pool for every schema and when called it always returns a connection with an appropriate search_path already set.

Let’s see how the TenantAwareDataSource looks like:

  • It maps each database schema to its own pooled datasource (e.g. DBCP’s BasicDataSource).
  • It uses special TenantContext bean to detect current tenant and application database (schema). I will describe it in the next chapter.

TenantAwareDataSource.java


import javax.sql.DataSource;
import org.apache.commons.dbcp.BasicDataSource;
import org.springframework.jdbc.datasource.AbstractDataSource;

public class TenantAwareDataSource extends AbstractDataSource {

	private TenantRepository tenantRepository;

	private TenantContext tenantContext;

	// [tenant,schema] to [DataSource] mapping
	private final Map<String, DataSource> schemaToDataSource = new HashMap<String, DataSource>();

	/**
	 * @return Take current tenant+schema and convert it to string token
	 */
	protected String currentLookupKey() {
		String tenantId = tenantContext.getTenant().getKod();

		if (tenantContext.getSchemaId() == null)
			return tenantId + "/<noschema>";
		else
			return tenantId + "/" + tenantContext.getSchemaId();
	}

	/**
	 * @return DataSource for current tenant+schema
	 */
	protected DataSource determineTargetDataSource() {

		// lookupKey represent current tenant and schema
		String lookupKey = currentLookupKey();

		DataSource dataSource = schemaToDataSource.get(lookupKey);

		if (dataSource == null) {
			dataSource = createDataSourceForTenantAndSchema();
			schemaToDataSource.put(lookupKey, dataSource);
		}

		return dataSource;
	}

	/**
	 * Create new pooled datasource for current tenant and schema. It must be
	 * additionally configured to always return connections with a search path
	 * set to current schema.
	 */
	private DataSource createDataSourceForTenantAndSchema() {
		// tenantContext is a session-scoped spring bean
		String tenantId = tenantContext.getTenant().getId();
		String schemaId = tenantContext.getSchemaId();

		Tenant tenant = tenantRepository.getTenant( tenantId );

		// created simple pooled datasource (like Apache DBCP) based
		// on tenant database connection settings
		DataSource dataSource = createDataSource(tenant) ;

		String appSchema = schemaId != null ? "'" + schemaId + "'," : "";
		String searchPath = "SET search_path = " + appSchema + DbConst.PUBLIC_SCHEMA + ";";

		// set search path for the datasource
		// (for DBCP use BasicDataSource.setConnectionInitSqls(Collection))
		configureSearchPath(dataSource, searchPath);

		return dataSource;
	}

	public void destroy() {
		// dispose all the datasources
	}

	@Override
	public Connection getConnection() throws SQLException {
		DataSource determineTargetDataSource = determineTargetDataSource();
		return determineTargetDataSource.getConnection();
	}

}

Solution – tenant context

I have already mentioned a TenantContext bean. It shall provide information about current tenant and application schema (current means belonging to currently processed request).

TenantContext class is just a simple holder class of the required information.

public class TenantContext implements Serializable {

    private Tenant tenant;

    private String schemaId;

    // getters & setters
    // ...
}

TenantContext bean is set to be session scoped which means that its content is bound to current request’s HTTP session.

<bean id="tenantContext" class="TenantContext" scope="session">
    <aop:scoped-proxy />
</bean>

So now I have a session-bound bean that should know everything about current tenant. The last thing is how and when to initialize such a bean with an appropriate data. There is a requirement in the first chapter of this post that the application URL is tenant-specific. It means I can deduce what is the current tenant just from looking at the request’s URL.

Here I present a solution based on Spring Security and servlet filters.

Firstly, there is a simple TenantFilter that is called before every request. It manages tenant property of every session-bound TenantContext.

import javax.servlet.Filter;

public class TenantFilter implements Filter {

 private TenantContext tenantContext;

 private TenantRepository tenantRepository;

 @Override
 public void doFilter( ServletRequest request, ServletResponse response, FilterChain chain ) throws IOException, ServletException {
 final HttpServletRequest servletRequest = ( HttpServletRequest ) request;

 String urlTenantId = parseTenantFromUrl( servletRequest.getPathInfo() );

 if ( tenantContext.getTenant() == null ) {
 // tenant context is not initialized
 // it means this is the first request in the session
 // let's set the tenant context
 Tenant tenant = tenantRepository.getTenant( urlTenantId );

 if ( tenant == null )
 throw new TenantException( "Unknown tenant: " + urlTenantId );

 tenantContext.setTenant( tenant );
 }
 else {
 // tenant context already initialized
 // check whether it is as expected
 String sessionTenantId = tenantContext.getTenant().getKod();

 if ( !sessionTenantId.equals( urlTenantId ) )
 throw new TenantException( "Unexpected tenant [session=" + sessionTenantId + ", url=" + urlTenantId + "]" );
 }

 chain.doFilter( request, response );
 }

 /**
 * Parse tenant identifier from the request's URL.
 */
 private String parseTenantId( String requestRelativeUri ) {
 // ...
 }

 // setters and other uninteresting methods...
 // ...
}

This filter is properly injected into Spring Secutiry framework.

<bean id="tenantFilter" class="TenantFilter">
    <property name="tenantContext" ref="tenantContext" />
    <property name="tenantRepository" ref="tenantRepository" />
</bean>

<security:http auto-config='false' >
    <security:custom-filter before="FIRST" ref="tenantFilter" />
    <!-- ...more security stuff... -->
</security:http>

The schemaId property of the TenantContext is set a little bit later in the session when the logged-in user is forced to choose one of the existing schemas. There’s no special magic about it.

Conclusion

I briefly described how to implement multitenancy in Java. I started from the database layer where I presented a database layout using multiple PostgreSQL databases and schemas. Then I showed the implementation of the smart JDBC datasource and related tenant context.

I’m planning to write the second part of this blog post. It wil be focused mainly on implementing a business layer into the multitenant application described here.

Tags: , , ,

Posted in programming


Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>