About JCR queries

In the past few days two interesting blog posts have been written about the use of JCR queries: Dan Klco’s “9 JCR-SQL2 queries every AEM developer should know” and “CQ Queries demystified” by @ItGumby.

If you have already read my older articles about JCR queries (part 1 and part 2), you might get the impression that I am not a big fan of JCR queries. There are situations where that’s totally true.

When you come from the SQL world, queries are the only way to retrieve data; therefore many developers tend to use queries without ever thinking about the other way offered by JCR: the “walk the tree” approach.

@ItGumby gives two reasons why one should use JCR queries: efficiency and flexibility in structure. First, efficiency depends on many factors. In my second post I try to explain which kinds of queries are fast and which ones aren’t, simply because of the way the underlying index works (even with AEM 6.0 it’s still Lucene in 99.9% of cases). The custom indexes in AEM 6 might be a game changer here.
Regarding flexibility: yes, that’s a good reason. But there are cases where you have a fixed structure and are looking for hits only in a small area of the tree; there, walking the tree is often the better choice. If you need to search the complete tree, however, a query can be faster.
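
To make the “walk the tree” idea concrete, here is a minimal sketch in plain Java. It deliberately does not use the JCR API: SimpleNode, TreeWalk.resolve and TreeWalk.collect are hypothetical stand-ins for a JCR node and its child iteration, chosen so that the example runs without a repository. The only point is that the walk starts at a known subtree and never touches the rest of the tree.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a repository node: a name, one type property, named children. */
class SimpleNode {
    final String name;
    final String resourceType; // may be null
    final Map<String, SimpleNode> children = new LinkedHashMap<>();

    SimpleNode(String name, String resourceType) {
        this.name = name;
        this.resourceType = resourceType;
    }

    /** Adds a child and returns this node, so that small trees can be built fluently. */
    SimpleNode add(SimpleNode child) {
        children.put(child.name, child);
        return this;
    }
}

class TreeWalk {

    /** Descends child by child along a path such as "content/site/news"; returns null if a segment is missing. */
    static SimpleNode resolve(SimpleNode root, String relPath) {
        SimpleNode current = root;
        for (String segment : relPath.split("/")) {
            if (segment.isEmpty()) {
                continue; // tolerate leading or doubled slashes
            }
            current = current.children.get(segment);
            if (current == null) {
                return null;
            }
        }
        return current;
    }

    /** Collects the paths of all nodes at or below 'start' with the given type; only this subtree is visited. */
    static void collect(SimpleNode start, String path, String type, List<String> hits) {
        if (type.equals(start.resourceType)) {
            hits.add(path);
        }
        for (SimpleNode child : start.children.values()) {
            collect(child, path + "/" + child.name, type, hits);
        }
    }
}
```

When the structure is known, the cost of such a walk depends on the size of the subtree and not on the size of the whole repository, which is the situation where I would at least consider it before writing a query.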

Dan gives a number of good examples of JCR queries. And I wholeheartedly admit that the number of JCR SQL examples on the net is way too low. The JCR specification is quite readable for the most part, but I was never really good at writing code when all I have is the formal description of a language’s syntax. So big applause to Dan!
But please allow me the recommendation to test every query on production content first (not necessarily on your production system!), just to find out the timing and the number of results. I have already seen cases where an implementation was fast on development but painfully slow on production, just because of this tiny aspect.

Managing multiple instances of services: OSGI service factories

And now the third post in my little series of OSGI related postings. I already showed how you can easily manage a number of services implementing the same interface using a service tracker or by using the right SCR annotations.

Sometimes you need to implement services which differ only in their configuration. A nice example is logging, where you want several loggers, each writing to its own log file at its own log level. Somebody (normally the admin of the system) is then able to leverage this and configure the logging as she likes.

More formally speaking, you have zero to N instances of the same service, just with different configurations. Duplicating the code to create a logger1 with configurable values, then a logger2, a logger3 and so on doesn’t make sense; it’s just code duplication and inflexible (what happens when you need logger100?).

For this kind of problem OSGI offers the concept of ServiceFactories. As the name says, it’s a factory that creates services, just by configuration.

As an example, let’s assume that you need to send out emails via a configurable number of SMTP servers, because for internal emails you need to use a different mail server than for external users or partners. We will implement this as a service, and on the service we will configure the details of the SMTP server we intend to use.

So let’s start with the service interface:

public interface MailService {
  public void sendMail (String from, String to, String body);
}

And a dummy implementation could look like this; we want to focus only on the properties, and not really on the details how to send emails :-)

import org.apache.felix.scr.annotations.Activate;
import org.apache.felix.scr.annotations.Component;
import org.apache.felix.scr.annotations.Property;
import org.apache.felix.scr.annotations.Service;
import org.apache.sling.commons.osgi.PropertiesUtil;
import org.osgi.service.component.ComponentContext;

@Service
@Component(metatype=true,label="simple mailservice", description="simple mailservice")
public class MailServiceImpl implements MailService {

  private static final String DEFAULT_ADDRESS="localhost:25";
  @Property (description="address of the SMTP server including port",value=DEFAULT_ADDRESS)
  private static final String ADDRESS = "mailservice.address";
  private String address;

  private static final String DEFAULT_USERNAME="admin";
  @Property(description="username to login to the SMTP server",value=DEFAULT_USERNAME)
  private static final String USERNAME = "mailservice.username";
  private String username;

  private static final String DEFAULT_PASSWORD="admin";
  @Property(description="password to login to the SMTP server",value=DEFAULT_PASSWORD)
  private static final String PASSWORD = "mailservice.password";
  private String password;

  // read the configured values, falling back to the defaults
  @Activate
  protected void activate (ComponentContext ctx) {
    address = PropertiesUtil.toString(ctx.getProperties().get(ADDRESS),DEFAULT_ADDRESS);
    username = PropertiesUtil.toString(ctx.getProperties().get(USERNAME),DEFAULT_USERNAME);
    password = PropertiesUtil.toString(ctx.getProperties().get(PASSWORD),DEFAULT_PASSWORD);
  }

  public void sendMail(String from, String to, String body) {
    // login to the smtp server using address, username and password provided via OSGI 
    // properties and send the email
  }
}

But how can you extend this so that you can create configurations not just for one mail server, but for several? Easy: just add the property “configurationFactory=true” to the @Component annotation.

@Component(metatype=true,label="simple mailservice", description="simple mailservice",
  configurationFactory=true)

If you compile and deploy your service now, you can see in your Apache Felix Configuration Manager that there is a “plus” sign in front of the service; when you click it, you get a new instance of your service, which you can configure.

ConfigurationFactory in the Felix Console

But when you have 3 mail services configured, which one do you get when you have something like this:

@Reference
MailService mailService;

The answer: it’s not deterministic. You might get any of the configured mail services. If you need a specific one, the easiest way is to give them labels which make them unique. So add another property to your service:

@Property(description="Label for this SMTP service")
private static final String NAME = "mailservice.label";

And when you then create a configuration with the label “INTERNET”, you can reference exactly this service instance with this kind of reference:

@Reference(target="(mailservice.label=INTERNET)")
MailService mailService;

This will resolve correctly when you have a MailService configured with the label “INTERNET”. As long as you don’t have such a service, any service containing such a reference won’t start (unless you declare the reference as optional and dynamic …)

If you want to be more flexible and also implement some more logic in the lookup process (e.g. having a default mailservice or supporting a number of INTERNET mail services), you can use the whiteboard pattern to track all available MailService instances; based on their labels you can implement any logic you need.
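
A minimal sketch of such a lookup in plain Java, without any OSGI dependency: in a real bundle, bindMailService and unbindMailService would be the callbacks of a dynamic, multiple-cardinality @Reference, while here they are ordinary methods that the example calls itself. The class name MailServiceRegistry and the “DEFAULT” fallback label are assumptions made for this illustration; only the “mailservice.label” property comes from the text above.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Same shape as the MailService interface of this post. */
interface MailService {
    void sendMail(String from, String to, String body);
}

/** Keeps track of all known MailService instances by their label. */
class MailServiceRegistry {
    static final String LABEL_PROPERTY = "mailservice.label";
    static final String DEFAULT_LABEL = "DEFAULT"; // hypothetical convention for the fallback service

    // concurrent map, because services can come and go while lookups are running
    private final Map<String, MailService> byLabel = new ConcurrentHashMap<>();

    void bindMailService(MailService service, Map<String, Object> properties) {
        byLabel.put(labelOf(properties), service);
    }

    void unbindMailService(MailService service, Map<String, Object> properties) {
        // remove the entry only if this exact instance is still registered under the label
        byLabel.remove(labelOf(properties), service);
    }

    /** Returns the service with the given label, else the default one, else null. */
    MailService lookup(String label) {
        MailService exact = (label == null) ? null : byLabel.get(label);
        return exact != null ? exact : byLabel.get(DEFAULT_LABEL);
    }

    private static String labelOf(Map<String, Object> properties) {
        Object label = properties.get(LABEL_PROPERTY);
        return label != null ? label.toString() : DEFAULT_LABEL;
    }
}
```

A caller would then ask for registry.lookup("INTERNET") and still get a usable service when no instance with that label is configured, as long as an unlabeled one exists. Supporting several services per label would mean storing a list per key instead of a single instance.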

As you see, OSGI is quite powerful when it comes to looking up and connecting to services. Combined with the power of SCR you can easily create a lot of configurable services with very little effort. Managing these services and doing proper lookup is also just a few lines of code away.

Personally I really like the possibilities I have with the OSGI container inside of AEM; it gives me the flexibility to access lots of different parts of the system. And creating and injecting my own services is easier than ever.

The magic of OSGI: track the come and go of services

Have you ever asked yourself how it is possible that you just need to implement an interface, mark the implementation as a service, and (oh magic) your code is called at the right spot, without any registration?
For example, when you wrote your first Sling servlet (oh sure, you did!), it looked like this:

@Service
@Component
@Property (name="sling.servlet.paths", value="/bin/myservlet")
public class MyServlet extends SlingSafeMethodsServlet  {
  ...
}

And that’s it. How does Sling (or OSGI or whoever) know that there is a new servlet in the container, and call it when you visit $Host/bin/myservlet? And how can you build such a mechanism yourself?

Basically you just use the power of OSGI, which is capable of notifying you about any change in the status of a bundle or service.

If you want to keep track of all Sling servlets registered in the system you just need to write some more annotations:

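import java.util.List;
import java.util.Map;
import java.util.concurrent.CopyOnWriteArrayList;
import javax.servlet.Servlet;
import org.apache.felix.scr.annotations.Component;
import org.apache.felix.scr.annotations.Reference;
import org.apache.felix.scr.annotations.ReferenceCardinality;
import org.apache.felix.scr.annotations.ReferencePolicy;
import org.apache.felix.scr.annotations.Service;

@Service(value=ServletTracker.class)
@Component
@Reference (name="servlets", policy=ReferencePolicy.DYNAMIC, cardinality=ReferenceCardinality.OPTIONAL_MULTIPLE, referenceInterface=Servlet.class)
public class ServletTracker {

  // thread-safe list: bind/unbind can be called while doSomethingUseful() iterates
  private final List<Servlet> allServlets = new CopyOnWriteArrayList<Servlet>();

  // called for every Servlet service that gets registered
  protected void bindServlets (Servlet servlet, final Map<String, Object> properties) {
    allServlets.add (servlet);
  }

  // called for every Servlet service that gets unregistered
  protected void unbindServlets(Servlet servlet, final Map<String,Object> properties) {
    allServlets.remove(servlet);
  }

  public void doSomethingUseful() {
    for (Servlet servlet : allServlets) {
      // do something useful with them ...
    }
  }
}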

(Of course you can track any other interface through which services are offered. But be aware, that in many cases only a single instance of a service exists.)

The magic is mostly in the @Reference annotation, which declares an optional reference that accepts zero to an unlimited number of services implementing the interface “Servlet”. By default the names of the callback methods are derived from the “name” attribute, resulting in “bindServlets”, which is called when a new servlet is registered, and “unbindServlets”, which is called when the servlet is unregistered. You can use these methods for whatever you want to do, for example storing the references locally and calling them whenever appropriate. And that’s it.

If you use this approach, your code is called whenever a service implementing a certain interface is activated or deactivated. With the SCR annotations all of this is possible without much trouble, and best of all, nearly all of it just by declaration.
If you would like more control (or just want to write code), you can use a ServiceTracker to keep track of services manually (a nice small example of it is the Apache Aries JMX Whiteboard).

And as recommended reading right now: Learn all the other cool stuff at the Felix SCR page.

And if you would like to see more code which uses this approach, you might want to have a look at the SlingPostServlet, which is an excellent example of this pattern. Oh, and by the way: this pattern is called the OSGI whiteboard pattern.

OSGI: static and dynamic references

OSGI as a component model is one of the cores of AEM, as it allows you to dynamically register and consume services offered by other parts of the system. It’s the central registry you can ask for all kinds of services.

If you have some weeks of CQ experience as a developer, you probably already know the mechanics of accessing a product service. For example, one of the most frequently used statements in a service (or component) is:

@Reference
 SlingRepository repo;

which gives you access to the SlingRepository service, through which you can reach the JCR. Technically speaking, you build a static reference, so your service becomes active only when this reference can be resolved. This means you can rely on the repository being available whenever your service is running. In many cases this constraint is not a problem, because it wouldn’t make sense for your service to run without the repository, and it also frees you from constantly checking “repo” for null :-)

Sometimes you don’t want to wait for a reference to be resolved (maybe to break a dependency loop), or you can deliver additional value only if a certain (optional) service is available. In such cases you can make it a dynamic reference (and declare it optional, so that your service does not wait for it):

@Reference(policy=ReferencePolicy.DYNAMIC, cardinality=ReferenceCardinality.OPTIONAL_UNARY)
SlingRepository repo;

Now there’s no hard dependency on the SlingRepository service; your service might become active before the SlingRepository service is available, and therefore you need to handle the case that “repo” is null.
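
Handling the “repo is null” case can be sketched in plain Java as follows. The Repository interface is a hypothetical stand-in for the optional service, and bindRepo/unbindRepo are called by hand here; in a real component the container would call them whenever the service comes or goes. This is only meant to illustrate the null handling, not how SCR wires the field.

```java
/** Hypothetical stand-in for an optional service such as the repository. */
interface Repository {
    String describe();
}

class OptionalReferenceSketch {
    // volatile, because bind/unbind may run on a different thread than the callers of status()
    private volatile Repository repo;

    void bindRepo(Repository repository) {
        this.repo = repository;
    }

    void unbindRepo(Repository repository) {
        // clear the field only if the service going away is the one we hold
        if (this.repo == repository) {
            this.repo = null;
        }
    }

    String status() {
        // read the field once, so that the null check and the use see the same value
        Repository current = repo;
        if (current == null) {
            return "repository not available";
        }
        return "repository: " + current.describe();
    }
}
```

The local copy in status() matters: checking the field and then using the field again leaves a window in which the service can disappear in between.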

Per se this feature might have little importance to you, but combining it with other aspects makes it really powerful. More on that in the next post…

AEM 6.0 and Apache Oak: What has changed?

One of the key features of AEM 6.0 on the technical side is the use of Apache Oak as a much more scalable repository. It supports the full semantics of JCR 2.0, so all CQ 5.x applications should continue to work. And as an extension of this feature there is of course MongoDB, which you can use together with Oak.

But, as with every major reimplementation, something has changed. Things which worked well on Jackrabbit 2.x and CRX 2.x might behave differently. Or to put it in other words: Jackrabbit 2.x allowed you to do some things which are not mandated by the JCR 2.0 specification.

One of the most prominent examples of this is the visibility of changed nodes. In CRX 2.x, when you have an open JCR session A and some nodes are changed in a different session B, you will see these changes immediately in session A. That’s not mandated by the specification, but Jackrabbit supports it.

Oak introduced the concept of MVCC (multi-version concurrency control): each session sees only a view of the repository, namely the most recent state at the time the session was created, and this view is not updated on the fly with the changes performed by other sessions. So this is a static view. If you want to get the most recent view of the repository, you need to explicitly call “session.refresh()”.
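
The effect can be illustrated with a toy model in plain Java. This is not how Oak is implemented and it does not use the JCR API; ToyRepository and ToySession are made-up classes that only mimic the visible behaviour described above: a session keeps the snapshot it got at login until it is refreshed.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Toy repository: every commit publishes a new immutable snapshot. */
class ToyRepository {
    private volatile Map<String, String> head = Collections.emptyMap();

    synchronized void commit(String path, String value) {
        Map<String, String> next = new HashMap<>(head);
        next.put(path, value);
        head = Collections.unmodifiableMap(next);
    }

    Map<String, String> head() {
        return head;
    }

    ToySession login() {
        return new ToySession(this);
    }
}

/** Toy session: reads from the snapshot taken at login until refresh() is called. */
class ToySession {
    private final ToyRepository repository;
    private Map<String, String> snapshot;

    ToySession(ToyRepository repository) {
        this.repository = repository;
        this.snapshot = repository.head();
    }

    boolean nodeExists(String path) {
        return snapshot.containsKey(path);
    }

    /** Moves this session to the most recent state of the repository. */
    void refresh() {
        snapshot = repository.head();
    }
}
```

A session opened before a commit answers nodeExists with false for the newly committed path, and with true only after refresh(). That is exactly the kind of surprise a long-running session can run into.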

So, what’s the effect of this?
You may run into subtle inconsistencies, because you don’t see changes performed by others in your session. In most cases only long-running sessions are really affected, because they are often meant to react to changes from the outside, such as changes made by other threads (e.g. checking whether a certain node has already been created by another session). So if you have followed the best practices established in the last 1-2 years, you should be fine, as long-running sessions have been discouraged. I have also already shown how such a long-running session might affect performance when used in a service context.

Oak supports you with some more “features” to spot such problems more easily. First, it prints a warning to the log when a session has been open for more than 1 minute. You can check the log and review the use of these sessions. A session being open for more than 1 minute is normally a clear sign that something’s wrong and that you should think about creating sessions with a shorter lifespan. On the other hand, you can also imagine cases where a session that stays open for longer is the right solution. So you need to evaluate each warning carefully.
As a second “feature”, Oak is able to propagate changes between sessions if these changes are performed by a single thread (and only by a single thread).
But consider these features (especially the change propagation) as temporary features, which won’t be supported forever.

This is one of the interesting changes in Apache Oak compared to Jackrabbit 2.x; you can find some more in the Jackrabbit/Oak wiki. It’s really worth a look when you start your first AEM 6.0 project.

AEM 6.0: Admin sessions

With AEM 6.0 comes a small feature which should make you reconsider your usage of sessions, especially the use of admin sessions in your OSGI services.

The feature is: “ResourceResolverFactory.getAdministrativeResourceResolver” is going to be deprecated!

Oh wait, that’s supposed to be a feature, you might ask? Yes, it is. Because it is being replaced by a feature which allows you to easily replace the sessions previously owned by admin (for the sake of developer laziness …) with sessions owned by regular users: users which don’t have the super-powers of admin and have to obey the ACLs like any other regular user.

A nice description how it works can be found on the Apache Sling website.

But how do you use it?

First, define which user should be used by your service. Specify this in the form “symbolic-bundle-name:subservice-name=user” in the config of the ServiceUserMapper service.
Then there are two extensions to existing services which leverage this setting:

ResourceResolverFactory.getServiceResourceResolver(authenticationInfo) returns a ResourceResolver created for the user defined in the ServiceUserMapper for the containing bundle (you can specify the subservice in the authenticationInfo if required).

And the SlingRepository service has an additional method loginService(subserviceName, workspace), which returns a session for this user.
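
To make the mapping format a bit more tangible, here is a small plain-Java sketch that splits such an entry into its parts. It is not the parser used by the ServiceUserMapper; the class ServiceUserMapping is made up for this illustration, and the bundle, subservice and user names in the usage note are invented. I also assume here that the subservice part is optional.

```java
/** Parsed form of one mapping entry "symbolic-bundle-name[:subservice-name]=user". */
class ServiceUserMapping {
    final String bundleSymbolicName;
    final String subServiceName; // null when the entry has no subservice part
    final String userName;

    ServiceUserMapping(String bundleSymbolicName, String subServiceName, String userName) {
        this.bundleSymbolicName = bundleSymbolicName;
        this.subServiceName = subServiceName;
        this.userName = userName;
    }

    static ServiceUserMapping parse(String entry) {
        int eq = entry.indexOf('=');
        // there must be something on both sides of the '='
        if (eq <= 0 || eq == entry.length() - 1) {
            throw new IllegalArgumentException("expected name=user but got: " + entry);
        }
        String service = entry.substring(0, eq).trim();
        String user = entry.substring(eq + 1).trim();

        // the part before '=' is either "bundle" or "bundle:subservice"
        int colon = service.indexOf(':');
        String bundle = colon < 0 ? service : service.substring(0, colon);
        String subService = colon < 0 ? null : service.substring(colon + 1);
        return new ServiceUserMapping(bundle, subService, user);
    }
}
```

With an invented entry like “com.example.mail:smtp=mail-service-user”, parse returns the bundle “com.example.mail”, the subservice “smtp” and the user “mail-service-user”; the subservice is the name you would pass to loginService or put into the authenticationInfo.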

But then this leaves you with the really big task: What permissions should my service user have then? read/create/modify/delete on the root node? But that’s something you can delegate to the people who are doing the user management …

Update 1: Sune asked if you need to specify a password. Of course not :-) Such a requirement would render the complete approach redundant.

Meta: AEM 6.0

I am a bit behind the official announcement of AEM 6.0 (Adobe TV, docs), and some of my colleagues have taken the lead and already started posting about the major new technical features, Apache Oak (including MongoDB) and Sightly. My colleague Jayna Kandathil offers a nice overview of the technical news.

I will focus on the smaller changes in the stack, and there’s a vast number of them. So stay tuned, I hope to find some quiet moments to blog in the next 2 weeks.