Since I started working with FileNet, I have seen my share of code using the Java API, most of it written by someone else. One thing I noticed is how often people use null or 0 as the recursion argument of the FilterElement. These values have positive and negative effects, but I am not sure the people who write this code always know what it does. So I thought I would explain, even though the P8 Knowledge Center documentation on this topic is quite good, I have to say. It might still give you another point of view.
Understand the importance of type
I’ll start with this because it is really important for working properly with Property Filters: you have to know the type of the property you want to fetch. We can distinguish primary types and object types. Why? Because the behavior is completely different between the two. Primary types are integer, string, date and so on. Object types are references to other P8 objects.
Please use constants when coding
I’ve seen too many times a FilterElement built using hard-coded Strings, for example:
new FilterElement(0, null, null, "FolderName", null);
With system properties, you should always use the PropertyNames constant class. It has everything you need, and you will be so relieved you did on the next P8 migration. It could save you a lot of refactoring time (well, it is unlikely that IBM will change system property names, but you never know). The same applies to ClassNames and any other constants from the constants package, by the way.
Understand the recursion
Here is the main topic of this post. This is what makes a PropertyFilter a good one, and it can improve performance a lot. First, read the IBM documentation on the subject, because it is really good. There is just one point I will focus on once you have read it: what happens with an object-valued property when the current recursion level is the same as the FilterElement recursion level, which is the case when you use 0 or null (the default value is 0).
In this case the documentation is clear, but I never really noticed what it does until recently. It creates an unevaluated object, or “placeholder”, and as soon as you request the property, it does a round trip to the server to fetch the object, without any filter!
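To make the placeholder behavior concrete, here is a minimal toy model in plain Java. It is only a sketch: the Placeholder class and the roundTrips counter are hypothetical names invented for illustration and are not part of the FileNet API. It also assumes that, once evaluated, the value stays in the local property cache.

```java
import java.util.function.Supplier;

// Toy model only: none of these types belong to the FileNet API.
class PlaceholderDemo {
    // Counts simulated server round trips.
    static int roundTrips = 0;

    // A lazily evaluated reference: it holds nothing until get() is first called.
    static final class Placeholder<T> {
        private final Supplier<T> fetcher;
        private T value;
        private boolean evaluated = false;

        Placeholder(Supplier<T> fetcher) {
            this.fetcher = fetcher;
        }

        boolean isEvaluated() {
            return evaluated;
        }

        T get() {
            if (!evaluated) {
                roundTrips++;          // simulated trip to the server
                value = fetcher.get(); // stands for a full, unfiltered fetch
                evaluated = true;
            }
            return value;
        }
    }

    public static void main(String[] args) {
        Placeholder<String> subFolders =
                new Placeholder<>(() -> "all subfolders, with all their properties");
        System.out.println("evaluated before access: " + subFolders.isEvaluated());
        subFolders.get();
        subFolders.get(); // the second access is served from the local copy
        System.out.println("round trips: " + roundTrips);
    }
}
```

The point of the sketch is that the cost is not paid at fetch time but on first access, and that the access does not know about any filter used earlier.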
Without filter
Let’s use an example, it is always easier to understand that way. Let’s say you want to write a command listing the names of all subfolders of a folder. That’s quite easy; without any filter it looks like this:
Folder f = Factory.Folder.fetchInstance(os, "/TestFolder", null);
FolderSet sfs = f.get_SubFolders();
Iterator<Folder> it = sfs.iterator();
Folder sf;
while (it.hasNext()) {
    sf = it.next();
    System.out.println(sf.get_FolderName());
}
Before trying to improve this, let’s understand what is really happening here. On the first line, the object is retrieved without any filter, so everything is retrieved with the default recursion level (0): all primary-type properties are fetched, and all object-valued properties get placeholders. It looks like this (yes, that’s a lot!):
Let’s focus on the SubFolders property. As promised, this is an unevaluated object, not the whole list of folders, which is good (especially if you don’t want to use them, although in our case we will):
Now what happens on the second line:
As you can see from the red color, get_SubFolders() actually changed the value of the property: it did a round trip to the server to evaluate it. It is interesting to look at what FileNet retrieved for each subfolder. Well… everything…
Imagine this with 10,000 subfolders. If you just want to display the name of the subfolder, it would be too bad to fetch everything.
Reminder: the first fetch is actually quite fast, because it only sets a placeholder for every object-valued property. The slow part is get_SubFolders() (and then every 500 subfolders, since that is the default page size; see my other post about page size), which does the same thing as the first line, but for each subfolder.
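As a rough back-of-the-envelope estimate of the paging cost, the sketch below counts how many pages are needed to iterate over a collection. It assumes one server round trip per page and uses the default page size of 500 mentioned above; the class and method names are hypothetical, and the real number of calls may differ.

```java
// Rough estimate only: assumes one server round trip per page of results.
class PageEstimate {
    // Number of pages needed to iterate over `items` elements, `pageSize` per page.
    static long pages(long items, long pageSize) {
        if (items < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("items must be >= 0 and pageSize > 0");
        }
        return (items + pageSize - 1) / pageSize; // ceiling division
    }

    public static void main(String[] args) {
        // 10,000 subfolders with the default page size of 500
        System.out.println(pages(10_000, 500)); // prints 20
    }
}
```

So for 10,000 subfolders, the unfiltered content of every subfolder travels in about 20 pages, which is why the size of each subfolder matters so much.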
With filter and recursion 0/null
The next exercise is to do exactly the same thing, but using a property filter to include only SubFolders and FolderName, the two properties we want. Let’s do it with the recursion level at 0, since this is what I usually see in existing code.
The code looks like this:
PropertyFilter pf = new PropertyFilter();
pf.addIncludeProperty(new FilterElement(null, null, null, PropertyNames.SUB_FOLDERS, null));
pf.addIncludeProperty(new FilterElement(null, null, null, PropertyNames.FOLDER_NAME, null));
Folder f = Factory.Folder.fetchInstance(os, "/MyFolder", pf);
FolderSet sfs = f.get_SubFolders();
Iterator<Folder> it = sfs.iterator();
Folder sf;
while (it.hasNext()) {
    sf = it.next();
    System.out.println(sf.get_FolderName());
}
Let’s take a look at the properties the same way we did without filter. After the fourth line:
This is a lot better! We have just our two properties and that’s it. We could think this is enough and stop there, but let’s look at what happens right after get_SubFolders().
This is normal, but still, note that we had to retrieve them from the server, meaning get_SubFolders() did a round trip to the server. Now let’s take a look inside these two folders:
This is really interesting. You can see that get_SubFolders() actually retrieved all subfolders with all their properties, without using the filter you set on the first fetch. Since the longest part of this code is get_SubFolders(), not the first fetch, using a filter without thinking has the same effect as not using a filter at all. If you have thousands of subfolders, this solution won’t be faster than the first one… Even if you put a recursion level of 1 or more on the FolderName property, it won’t have any effect, because when FileNet fetches an object from a placeholder, it simply doesn’t use the original property filter.
Understand and properly use the Property Filter
Now, how do we improve this? For some time now, I have tried to be really strict with myself and always write down which properties I need, what the recursion schema is, and how many server round trips I want (it usually depends on how much memory I am willing to let the app use). So let’s do this together for this example:
I want my folder (recursion level 0) with no properties but its subfolders, so I want the property PropertyNames.SUB_FOLDERS. This is a multi-valued, object-valued property, so what do I want for these objects? I only want the name property (PropertyNames.FOLDER_NAME). This is a string property, so a primary type, and I don’t need to think more about this one. And that’s it.
I usually draw a schema before any fetch (here it is a really simple one; it is often more complicated):
Level :  #0                        #1
Folder:  SubFolders (Objects)  --> FolderName (String)
Now what does it mean in terms of FilterElement? I said we want the SubFolders property only for the first level (#0). You could be tempted to do this:
new FilterElement(0, null, null, PropertyNames.SUB_FOLDERS, null);
But it would be a really bad idea. If you got the first part of my post, you now know why; if you didn’t, I’ll explain again. Since the current recursion level is the same as the FilterElement’s, SubFolders gets a placeholder, and all the subfolders will be fetched when calling get_SubFolders(), without any property filter. So if you have thousands of folders with a lot of custom properties, imagine the round trip it will generate… (The object-valued properties of the subfolders would still get placeholders though, don’t worry. Also, the fetch would be split into several calls depending on the page size.)
So now I think you understand that we don’t want a placeholder: it would be nice to get everything we need on the first fetch, when getting the root folder. However, if we do that, we need to be really careful about what we fetch on these subfolders, because we might end up fetching their subfolders, their parents, and so on, and that would be a lot worse than just fetching all subfolders.
The trick is to use the recursion level you want plus one. Here we want the subfolders at level 0, but we want to fetch them, not get a placeholder, so let’s use 1.
new FilterElement(1, null, null, PropertyNames.SUB_FOLDERS, null);
Doing this means we need to take care of what we fetch for these subfolders. Here, we said we want only their name. We will be at level #1 of recursion. This is a primary type, so we don’t need the same trick; let’s just use 1, which makes sense.
new FilterElement(1, null, null, PropertyNames.FOLDER_NAME, null);
By doing this, we make sure to get a root folder with only the SubFolders property, where all folders are fetched (plus FolderName as a side effect; we can’t really avoid that, but it is not too bad), and the subfolders fetched with only one property, FolderName. All this in only one server round trip (actually a little more than one, depending on the page size). Displaying all folder names will then be super fast. And the memory impact is fine, since we store just one string per folder. It would have been too bad to fetch all properties of each subfolder just for this, and that is often what’s done!
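The “level plus one” rule can be written down as a small decision function. The sketch below is a toy model in plain Java, not FileNet code: the Outcome enum and the resolve method are hypothetical names, and the EXCLUDED branch (a filter element not applying deeper than its own recursion value) is an assumption based on my reading of the rule rather than something shown in this post.

```java
// Toy model only: it imitates the recursion rule described above, not the FileNet API.
class RecursionRuleDemo {
    enum Outcome { EXCLUDED, FETCHED, PLACEHOLDER }

    // What happens to a property met at `level` when its filter element uses `maxRecursion`.
    static Outcome resolve(int level, int maxRecursion, boolean objectValued) {
        if (level > maxRecursion) {
            return Outcome.EXCLUDED;    // assumption: the element no longer applies this deep
        }
        if (objectValued && level == maxRecursion) {
            return Outcome.PLACEHOLDER; // same level: unevaluated reference
        }
        return Outcome.FETCHED;         // primary types, or objects above the limit
    }

    public static void main(String[] args) {
        // SubFolders with recursion 0: placeholder on the root folder
        System.out.println("SubFolders, max 0, level 0: " + resolve(0, 0, true));
        // SubFolders with recursion 1: fetched on the root, placeholder on each subfolder
        System.out.println("SubFolders, max 1, level 0: " + resolve(0, 1, true));
        System.out.println("SubFolders, max 1, level 1: " + resolve(1, 1, true));
        // FolderName with recursion 1: fetched on the root (side effect) and on each subfolder
        System.out.println("FolderName, max 1, level 0: " + resolve(0, 1, false));
        System.out.println("FolderName, max 1, level 1: " + resolve(1, 1, false));
    }
}
```

Walking the schema through such a function before writing the real FilterElement values is a cheap way to check where placeholders will appear.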
Here is the result in a debug run:
This is right after the fetch of the initial folder. You can see that this time the SubFolders property is already evaluated, which means that get_SubFolders() won’t do any server round trip (at least not before the end of the first page).
Now let’s take a look at which properties have been fetched for the subfolders. This is all very nice, but if we fetched everything, we would just have moved the problem from get_SubFolders() to the fetch method 😀 So here is the property cache of one of the two subfolders. You can see that FolderName is already fetched, as we need, and that SubFolders is unevaluated (since the recursion level is 1 and we used 1). It doesn’t matter, since we will never call get_SubFolders() on the subfolders, so there won’t be any other server round trip! This is exactly what we wanted: a single light call to the server with everything we need. (Actually it is not really one call, since we are still under the page size rule :) There will be a light call every page-size subfolders when iterating. See my post on page size for a better understanding.)
And since numbers sometimes speak louder than words, here are some benchmarks (measured over the Internet and over LAN).
The test was run on 5,000 subfolders, with 11 executions, excluding the first one (so that the data is cached on the server; in real life it often won’t be cached, and the difference is even bigger then!):
Now a more complicated example for those who are interested.
The delete tree example
We want to delete a whole folder tree, and we would like to do it with only one fetch, meaning one round trip (we are not talking about the deletes, which will be done in an UpdatingBatch).
Basically, it means we want all subfolders and all ContainedDocuments, recursively for each subfolder. We need to choose a limit for the depth of the hierarchy. I would say 99 levels is huge and would be more than enough (I mean, if someone goes deeper than that, I wouldn’t like to be the one using this repository!).
Let’s draw the schema, as always:
Level :  #0                        #1             #2             #3             #4
Folder:  SubFolders (Objects)  --> SubFolders --> SubFolders --> SubFolders --> ... until 99
                                                                 ContainedDoc --> Nothing
                                                  ContainedDoc --> Nothing
                                   ContainedDoc --> Nothing
         ContainedDoc (Objects) --> Nothing, we just want to call a delete()
This would be translated in FileNet as follows:
new FilterElement(99, null, null, PropertyNames.SUB_FOLDERS, null);
new FilterElement(99, null, null, PropertyNames.CONTAINED_DOCUMENTS, null);
Again, I too often see 0 or null as the recursion level. People think they are doing the right thing, but it will actually fetch placeholders for the first level, and then each call to get_SubFolders() and get_ContainedDocuments() will fetch all objects with all their properties. So yes, doing this or not using any PropertyFilter is the same…