Overview
This is the second part of our series on OpenSearch performance improvements. After reviewing benchmarks of the achieved improvements in
part one, we will now dive a bit deeper into the technical details: How were the improvements achieved? What special data structures are used?
We will cover the privilege evaluation process for basic action requests in OpenSearch. Another interesting area is the rule evaluation for document level security, field level security and field anonymization; we will cover that in a follow-up article.
Note: In this article, we will sometimes use the
Big O notation to reason about computational complexity. If you are not familiar with it: It is actually pretty simple. On a high level, one can say:
O(1) is amazing,
O(log n) is nice,
O(n) is okay-ish,
O(n²) is bad, and anything beyond is horrible.
A look inside: The new algorithms and data structures for action privilege evaluation
Action privilege evaluation is the most important part of the work. Every API request issued to OpenSearch boils down to at least one action request, which needs to be checked for sufficient authorization. Some requests - like bulk index requests - are internally decomposed into several action requests which are distributed over the cluster. Each of these requests then requires a separate authorization check. Thus, it is clear that an efficient access control implementation is crucial here.
To achieve optimal performance, we designed a new data structure which forms the core of the optimized privilege evaluation. When designing it, we applied a number of fundamental strategies to ensure optimal results:
- Use of fast data structures: hash tables give optimal performance with O(1) computational complexity
- Use of denormalized data structures: The role configuration allows very compact specification of privileges by using pattern expressions on indices and actions and by using grouping. However, doing privilege evaluations directly on this data structure is not too efficient. In such cases, denormalization is used to achieve maximum performance. We expand the role configuration into a big denormalized data structure by resolving index and action patterns to concrete actions and “multiplying out” the index/action grouping.
- Use of as few loops as possible: The best performance would be achieved if the privilege evaluation could be done without any loops at all - that would give true O(1) complexity. In reality, it is rarely possible to avoid loops altogether, but minimizing them is always worthwhile. If a loop cannot be avoided, the total number of iterations spent inside it should be kept as small as possible.
- This leads us directly to the next strategy: Reduce the problem space as quickly as possible. Identify the most selective criteria and apply these first. This helps with reducing the number of necessary iterations and also with the performance of hash tables.
- Finally: One downside of denormalization is that it can create quite a lot of data. In order to keep the amount of data manageable, we again had to turn to techniques which reduce it to reasonable sizes.
In the OpenSearch code, the data structure can be found in the class ActionPrivileges. If you want, you can read along in the code while reading this article; it contains many detailed comments.
Action Types
In OpenSearch, privileges for actions can be divided into several major categories:
- Index actions: Privileges for these actions can be assigned on an index-specific level; example: user A has privileges for indices:data/read/search on the indices index_a1 and index_a2.
- Cluster actions: Privileges for actions that can only be configured on a cluster-global level; example: user B has privileges for cluster:monitor/health.
- There are further categories like tenant actions, but these are not important here, so we skip them.
Cluster Actions
Let’s start with cluster actions, as they are the simpler case. When evaluating privileges for a cluster action, you basically have to answer the question: “I have action type A and a user with the effective roles [x, y, z]. With the present role configuration, is that user authorized to execute the action?”
A role configuration for cluster actions can look like this:
example_role_1:
  cluster_permissions:
    - cluster:admin/tasks/*
    - cluster:admin/settings/update
example_role_2:
  cluster_permissions:
    - indices:data/write/bulk
    - indices:data/read/scroll*
example_role_3:
  cluster_permissions:
    - cluster:admin/tasks/cancel
    - cluster:monitor/*
Do not get confused by the names of the permissions in example_role_2: actions whose names start with indices: can also be cluster permissions. Sometimes these actions are simply independent of concrete indices; sometimes index privileges are checked on sub-actions spawned by that particular action.
Now, let’s suppose we have to check an incoming action request cluster:monitor/main from a user who has the effective roles example_role_1, example_role_2 and example_role_3.
A naive algorithm could work similar to this:
- Loop through all effective roles of the user:
  - Retrieve example_role_1 from the role configuration
    - Loop through all cluster_permissions from the example_role_1 configuration:
      - Perform a pattern match of cluster:admin/tasks/* against cluster:monitor/main
      - Compare cluster:admin/settings/update against cluster:monitor/main
  - Retrieve example_role_2 from the role configuration
    - Loop through all cluster_permissions from the example_role_2 configuration:
      - Compare indices:data/write/bulk against cluster:monitor/main
      - Perform a pattern match of indices:data/read/scroll* against cluster:monitor/main
  - Retrieve example_role_3 from the role configuration
    - Loop through all cluster_permissions from the example_role_3 configuration:
      - Compare cluster:admin/tasks/cancel against cluster:monitor/main
      - Perform a pattern match of cluster:monitor/* against cluster:monitor/main
        - Yes, we have a match! Allow the incoming action request to proceed
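To make this concrete, here is a minimal sketch of such a naive check in Java. All names are hypothetical, and the pattern matching is reduced to trailing wildcards for brevity; the actual OpenSearch code looks different:

import java.util.List;
import java.util.Map;

public class NaiveClusterPrivilegeCheck {

    // role name -> configured cluster permission patterns
    private final Map<String, List<String>> roleToClusterPermissions;

    public NaiveClusterPrivilegeCheck(Map<String, List<String>> roleToClusterPermissions) {
        this.roleToClusterPermissions = roleToClusterPermissions;
    }

    /** Loops over all roles and all their permission patterns: O(roles * patterns). */
    public boolean isAllowed(String action, List<String> effectiveRoles) {
        for (String role : effectiveRoles) {
            for (String pattern : roleToClusterPermissions.getOrDefault(role, List.of())) {
                if (matches(pattern, action)) {
                    return true; // a single match is sufficient
                }
            }
        }
        return false;
    }

    /** Simplified: only constant names and trailing '*' wildcards are handled here. */
    private static boolean matches(String pattern, String action) {
        return pattern.endsWith("*")
                ? action.startsWith(pattern.substring(0, pattern.length() - 1))
                : pattern.equals(action);
    }
}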
You can see that there is quite a lot of looping and processing going on. One of the optimization strategies we named above was to minimize looping as much as possible. So, how can we do that?
We have two parameters to consider for the privilege evaluation: the action and the effective roles of the user. The effective roles are a set of strings - to evaluate those, we need to loop. The action, on the other hand, is just a single value. That might be promising. So, is there a way to use the action name that avoids loops?
Hash tables are an amazing data structure. The concept is incredibly simple - yet they have very good performance characteristics: they promise O(1) computational complexity for looking up values contained in the table. Could we use a hash table for a lookup based on the requested action?
We could if we put concrete action names into the hash table. However, our role configuration contains patterns like cluster:monitor/*. These cannot be used for a direct lookup.
One data structure which supports look-ups on string patterns - at least to a certain extent - is the trie. However, a trie only works for very simple patterns. Unfortunately, OpenSearch also supports more complex patterns using regular expressions, which cannot be represented in a trie.
So, let’s go back to hash tables: Is there a way to use them? In an OpenSearch cluster, all possible actions are known up-front. With this knowledge, we can apply denormalization on the action patterns in the role configuration to “multiply out” the patterns into actual action names.
Thus, we can now build a hash table to do a lookup based on an action name. The lookup then yields another piece of data. We can use this to store the names of the roles which provide the respective privilege. The resulting data structure would look similar to this:
On the left, we see a hash table which has actions as keys. These actions are then mapped to hash sets of role names which provide these privileges.
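As a rough sketch of how such a denormalized table could be built - assuming, as described above, that the set of all action names is known up-front (hypothetical names, simplified pattern matching):

import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class DenormalizedClusterPrivileges {

    // action name -> names of the roles granting that action
    final Map<String, Set<String>> actionToRoles = new HashMap<>();

    DenormalizedClusterPrivileges(Map<String, List<String>> roleToClusterPermissions, Set<String> allActionNames) {
        for (Map.Entry<String, List<String>> role : roleToClusterPermissions.entrySet()) {
            for (String pattern : role.getValue()) {
                // Denormalization: "multiply out" each pattern against all known action names
                for (String action : allActionNames) {
                    if (matches(pattern, action)) {
                        actionToRoles.computeIfAbsent(action, k -> new HashSet<>()).add(role.getKey());
                    }
                }
            }
        }
    }

    /** Simplified: only constant names and trailing '*' wildcards are handled here. */
    private static boolean matches(String pattern, String action) {
        return pattern.endsWith("*")
                ? action.startsWith(pattern.substring(0, pattern.length() - 1))
                : pattern.equals(action);
    }
}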
With the new data structure, the privilege check algorithm would now look like this:
- Look up the action in the hash table and retrieve the set of roles.
- Check whether the intersection of the user’s effective roles and the set of roles providing the privilege is non-empty.
For the latter check, some looping might still be necessary. However, in the concrete example, this can be checked with a single iteration.
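A sketch of the corresponding check, based on the structure above (hypothetical names):

import java.util.Map;
import java.util.Set;

public class FastClusterPrivilegeCheck {

    // action name -> names of the roles granting that action (built as shown above)
    private final Map<String, Set<String>> actionToRoles;

    public FastClusterPrivilegeCheck(Map<String, Set<String>> actionToRoles) {
        this.actionToRoles = actionToRoles;
    }

    public boolean isAllowed(String action, Set<String> effectiveRoles) {
        Set<String> rolesGrantingAction = actionToRoles.get(action); // O(1) hash lookup
        if (rolesGrantingAction == null) {
            return false; // no role grants this action at all
        }
        // Non-empty intersection test; loops only over the usually small set of effective roles
        for (String role : effectiveRoles) {
            if (rolesGrantingAction.contains(role)) { // O(1) per role
                return true;
            }
        }
        return false;
    }
}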
It is obvious that the new algorithm is much simpler and thus faster than the naive one. By using the action as the first criterion in the privilege evaluation algorithm, we greatly reduce the problem space and thus minimize the necessary iterations.
Index Actions
For cluster actions, the new concept is quite simple. When looking at index actions, we have to consider a couple more requirements and cases. Index actions are actually the area where the new data structures unfold their full potential and bring the greatest performance improvements.
When evaluating privileges for index actions, you have to answer the question “I have a set of action types, a set of indices and a set of effective roles. With the present role configuration, do I have sufficient privileges? If I do not have sufficient privileges, is there a subset of indices for which I do have sufficient privileges?”
A role configuration for index actions can look like this:
example_role_1:
  index_permissions:
    - index_patterns:
        - index_a*
        - index_b*
      allowed_actions:
        - indices:data/read/*
        - indices:admin/resolve/index
example_role_2:
  index_permissions:
    - index_patterns:
        - index_b*
      allowed_actions:
        - indices:data/write/*
example_role_3:
  index_permissions:
    - index_patterns:
        - alias_ab
      allowed_actions:
        - indices:data/read/*
Compared to the cluster privileges, we have gained an additional dimension: the index patterns. Index patterns cannot only refer to individual indices, but - as we see in example_role_3 - also to aliases, which in turn refer to further indices. It is also possible to specify data streams here. If aliases or data streams are specified, they pass their privileges down to the indices they contain; e.g., considering the above role configuration example, if alias_ab contains the indices index_a1, index_a2 and index_b1, example_role_3 will effectively yield privileges for these indices.
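To illustrate this with a minimal, hypothetical sketch: during denormalization, a configured alias name is replaced by its member indices:

import java.util.List;
import java.util.Map;

public class AliasResolutionExample {
    public static void main(String[] args) {
        // Hypothetical cluster state: which indices does each alias contain?
        Map<String, List<String>> aliasToIndices =
                Map.of("alias_ab", List.of("index_a1", "index_a2", "index_b1"));

        // example_role_3 grants indices:data/read/* on alias_ab ...
        String configuredName = "alias_ab";

        // ... which denormalization expands to the member indices
        List<String> effectiveIndices =
                aliasToIndices.getOrDefault(configuredName, List.of(configuredName));
        System.out.println(effectiveIndices); // [index_a1, index_a2, index_b1]
    }
}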
A naive implementation of a privilege evaluation algorithm for indices would be similar to the one we discussed in the cluster privilege section - with one or even two further levels of nested loops dedicated to matching indices. One particular challenge is that we have index patterns on both sides: the user can send an action request which specifies a set of indices using index patterns, and the privilege configuration defines sets of indices using index patterns. Additionally, back-and-forth resolution of aliases and data streams is necessary. Implemented naively, this can have a complexity of O(n²).
Let’s now try to extend the data structure we developed for the cluster privileges to also support index information. It is obvious that we need an additional mapping level. Here we have two options: we could use an index → action → roles mapping, or an action → index → roles mapping.
To decide which mapping order is best, let’s look at the selectivity of the action and index parameters, and recall the question to be answered: “I have a set of action types, a set of indices and a set of effective roles. With the present role configuration, do I have sufficient privileges?” So, we have both a set of indices and a set of action types.
- The set of indices is defined by the user who triggers the action request. Action requests - such as search requests - can refer to a set of indices. These may refer to a single index just as well as to hundreds of indices - by using index patterns.
- The set of action types is different. Actually, it is a bit surprising that we have a set of action types and not just a single one: the action type should be emergent from the action request, and an action request has exactly one type. However, there are a couple of special cases. Some action requests carry a surprising amount of functionality: alias actions, for example, also allow indices to be deleted. For such cases, the OpenSearch security implementation requires the user to have additional action privileges to perform the action. That means: usually, we have exactly one action type; in some cases, we might have two or more action types to consider in a privilege evaluation.
So, we can say that the set of action types is always very small - usually it will consist of only one action type. Meanwhile, the set of indices can be quite big.
Thus, applying the strategy of reducing the problem space as quickly as possible, we keep the actions on the first level of the data structure. On the second level, we then have an additional layer of hash tables which map indices to sets of roles that provide privileges for the particular combination of action and index. The indices are also denormalized - that means index patterns are resolved to concrete indices, and aliases are resolved to their member indices.
The resulting data structure looks like this:
As you can see, one effect of denormalization is quite a lot of duplication. This is certainly a downside of the concept. We will talk about this more in detail later.
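In Java terms, the shape of the structure is roughly a nested hash table: action name → index name → role names. The following sketch populates it by hand for a fragment of the example configuration above (hypothetical names); note how denormalization duplicates the role sets across indices:

import java.util.Map;
import java.util.Set;

public class IndexPrivilegeTableExample {
    public static void main(String[] args) {
        // action name -> index name -> roles granting that action on that index
        Map<String, Map<String, Set<String>>> actionToIndexToRoles = Map.of(
                "indices:data/read/search", Map.of(
                        "index_a1", Set.of("example_role_1", "example_role_3"),
                        "index_a2", Set.of("example_role_1", "example_role_3"),
                        "index_b1", Set.of("example_role_1", "example_role_3")),
                "indices:data/write/index", Map.of(
                        "index_b1", Set.of("example_role_2")));

        // Prints the roles granting the search action on index_a1
        System.out.println(actionToIndexToRoles.get("indices:data/read/search").get("index_a1"));
    }
}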
Still, the new data structure enables very fast privilege evaluation. The new algorithm now looks like this:
- Look up the action in the hash table and retrieve the secondary hash table which maps indices to roles.
- Look up the requested index in the secondary hash table and retrieve the set of roles.
- Check whether the intersection of the user’s effective roles and the set of roles providing the privilege is non-empty.
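A sketch of this algorithm (hypothetical names), including the partial result mentioned above - i.e., determining the subset of requested indices for which privileges exist:

import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class FastIndexPrivilegeCheck {

    // action name -> index name -> roles granting that action on that index
    private final Map<String, Map<String, Set<String>>> actionToIndexToRoles;

    public FastIndexPrivilegeCheck(Map<String, Map<String, Set<String>>> actionToIndexToRoles) {
        this.actionToIndexToRoles = actionToIndexToRoles;
    }

    /** Returns the subset of the requested indices for which the action is allowed. */
    public Set<String> allowedIndices(String action, Set<String> requestedIndices, Set<String> effectiveRoles) {
        Map<String, Set<String>> indexToRoles = actionToIndexToRoles.get(action); // O(1) lookup
        if (indexToRoles == null) {
            return Set.of();
        }
        Set<String> allowed = new HashSet<>();
        for (String index : requestedIndices) {
            Set<String> rolesGranting = indexToRoles.get(index); // O(1) lookup
            // Non-empty intersection with the user's effective roles?
            if (rolesGranting != null && !Collections.disjoint(rolesGranting, effectiveRoles)) {
                allowed.add(index);
            }
        }
        return allowed;
    }
}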
Limits of Denormalization
The application of denormalization also has a couple of drawbacks and limits. The most important issues are the following:
- Explosion of space instead of explosion of computation time: You can see in the above diagram that the resulting data structure for indices grows quite big - even if only a few indices and roles are involved. We performed some tests to learn about the RAM potentially required by the data structures. An example configuration with 100 roles applied to 100 indices would consume about 20 MB of RAM - using the hash table implementations provided by the core JDK. 1000 roles on 1000 indices would need about 2 GB of RAM - which is prohibitively large. We need strategies to reduce the space requirements.
- Increased upfront computation time: Building such a big data structure can also take a substantial amount of time. Thus, we also need strategies to reduce the computation time. Additionally, it would be good if the cluster could continue its normal operations even while the data structure is being updated - it needs to be updated whenever an index is created or the role configuration is changed.
- Incompatibility of certain role configurations with the pre-computed data structure: OpenSearch allows you to use user-specific attributes as part of the index patterns used in the roles configuration. These attributes are usually not known at the time the denormalized data structure is built; they only become available as soon as the user is authenticated. Thus, we cannot do a pre-computation for such role configurations.
- Operations on non-existing indices are also an issue: Of course, we can only create the pre-computed data structure based on existing indices. At first glance, an operation on a non-existing index seems weird. But there is a very straightforward action which operates on non-existing indices and still needs proper access controls: the create index operation. Access controls for this action cannot be implemented using the pre-computed data structure.
So, in short, we need strategies to cut down space requirements and to handle cases in which the denormalized data structure cannot be used.
Reducing Space Requirements
We have applied a number of different strategies in parallel to reduce the space requirements of the denormalized data structure.
- Deduplication of the hash sets containing role names: If you look at the diagram of the data structure above, you will see quite a bit of duplicated information. This is just the consequence of denormalization. However, to cut down the space requirements, it is possible to recognize the duplicated sets and re-use them. This changes the data structure from being tree-like to being shaped like a directed acyclic graph. The algorithm that evaluates the data structure is not affected and can be used without changes. The deduplicated version is shown in the following diagram:
- Compression of the set data structure: We will possibly have a large number of sets of role names - all of which are sub-sets of one fixed super-set: the set of role names present in the configuration. Additionally, we need a fast O(1) contains operation. Of course, classical hash tables provide that runtime complexity. However, using the knowledge that we have a large number of sub-sets of a fixed super-set, we can construct another data structure with the same runtime complexity: We use a hash table which maps the super-set of role names to ordinal numbers from 0 to n-1 (where n is the number of distinct role names). With this hash table, we can represent the sub-sets of role names as bit fields: we just use variable bit-length words of length n, where a set bit x means that the role with ordinal number x is contained in the set (see the sketch after this list). The resulting sets will usually require only a few bytes in typical cases, whereas standard hash sets take at least 160 bytes even for very small sets. Both deduplication and compression are achieved using a dedicated library.
- Reduction to essential actions: OpenSearch defines quite a lot of action types; about 50 of them are considered index actions. Many, however, are used only rarely, for maintenance operations. Out of these 50 index actions, we have identified 16 which are really performance critical, because they are essential to either ingestion or data retrieval. So, we create the denormalized data structure only for these 16 index actions. The performance-critical action types are listed in the class WellKnownActions. As an additional benefit, this reduction also helps with cutting down the time required to build the denormalized data structure. If you have an additional action which requires maximum performance, you can file a PR to extend this list.
- Reduction to essential indices: The data structure gives backing indices of data streams a special treatment. Due to the nature of data streams, they can produce quite a lot of backing indices. However, these are usually not addressed directly - all operations should go through the data stream itself. Thus, we do not store backing indices of data streams in the data structure; rather, we store the data stream itself. In case backing indices are requested directly, we can use their metadata to recognize them as such and check the privileges on the data stream instead.
These measures help us to cut down the heap size drastically. The scenario of 1000 indices and 1000 roles - which would have required 2 GB with the basic denormalized data structure - now only requires about 112 kB.
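As a minimal sketch of the bit-field idea using the JDK's BitSet (the actual implementation uses the dedicated library mentioned above):

import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CompressedRoleSetExample {
    public static void main(String[] args) {
        // The fixed super-set: all role names in the configuration, mapped to ordinals 0..n-1
        List<String> allRoles = List.of("example_role_1", "example_role_2", "example_role_3");
        Map<String, Integer> roleToOrdinal = new HashMap<>();
        for (int i = 0; i < allRoles.size(); i++) {
            roleToOrdinal.put(allRoles.get(i), i);
        }

        // A sub-set is a bit field: bit x set = role with ordinal x is a member
        BitSet subSet = new BitSet(allRoles.size());
        subSet.set(roleToOrdinal.get("example_role_1"));
        subSet.set(roleToOrdinal.get("example_role_3"));

        // O(1) contains check: one hash lookup plus one bit test
        Integer ordinal = roleToOrdinal.get("example_role_2");
        System.out.println(ordinal != null && subSet.get(ordinal)); // false
    }
}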
Secondary Privilege Evaluation Algorithm
We saw that the denormalized data structure cannot be used in a couple of cases - for example, when the requested index does not exist yet. For these cases, we have a second privilege evaluation algorithm, which operates on an optimized, but not denormalized data structure. This algorithm thus still needs to evaluate index patterns while processing action requests. That is of course not as fast as the main privilege evaluation algorithm, but it covers all possible cases.
The secondary privilege evaluation algorithm will be used in these cases:
- The index is not known to the denormalized data structure. This is either because the index does not exist (yet) or because the denormalized data structure is outdated. It might be outdated due to asynchronous updates; see the upcoming section for details.
- The action is not performance critical and thus not contained in the pre-defined well-known actions.
- The role configuration contains user attributes, and a previous check of the denormalized data structure could not find roles which grant access to the particular index.
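Put together, the overall evaluation could be structured roughly like this (a simplified sketch with hypothetical names and stubbed-out details):

import java.util.Set;

public class PrivilegeEvaluator {

    /**
     * Fast path: the pre-computed, denormalized data structure. Returns null if it
     * cannot decide (unknown index, not a well-known action, user attributes involved).
     */
    private Boolean checkDenormalized(String action, String index, Set<String> effectiveRoles) {
        return null; // placeholder: O(1) hash table lookups as sketched above
    }

    /** Slow path: pattern matching against the raw role configuration. Covers all cases. */
    private boolean checkWithPatternMatching(String action, String index, Set<String> effectiveRoles) {
        return false; // placeholder: evaluates index and action patterns at request time
    }

    public boolean isAllowed(String action, String index, Set<String> effectiveRoles) {
        Boolean fastPathResult = checkDenormalized(action, index, effectiveRoles);
        if (fastPathResult != null) {
            return fastPathResult; // decided by the fast path
        }
        return checkWithPatternMatching(action, index, effectiveRoles); // fall back
    }
}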
Asynchronous Updates
The denormalized data structure needs to be rebuilt whenever a new index is created or the role configuration changes. This process can take a couple of seconds on clusters with large numbers of indices or complicated role configurations. It would not be feasible to block all operations until the data structure is rebuilt.
Instead, the new implementation builds the denormalized data structure asynchronously - as long as the new version is not ready, the previous version is used. As explained above, the implementation can handle the case that the denormalized data structure does not know an index: it will automatically fall back to the secondary privilege evaluation algorithm.
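One way to sketch such an asynchronous swap in Java is an AtomicReference holding the current table, so that readers always see either the complete old or the complete new structure (hypothetical names; not the actual OpenSearch code):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicReference;

public class AsyncPrivilegeTableHolder {

    // Readers always see a complete structure: either the previous or the new one
    private final AtomicReference<DenormalizedTable> current = new AtomicReference<>(DenormalizedTable.EMPTY);
    private final ExecutorService rebuildExecutor = Executors.newSingleThreadExecutor();

    public DenormalizedTable get() {
        return current.get();
    }

    /** Called when an index is created or the role configuration changes. */
    public void scheduleRebuild(ClusterStateSnapshot snapshot) {
        rebuildExecutor.submit(() -> {
            DenormalizedTable rebuilt = DenormalizedTable.build(snapshot); // may take seconds
            current.set(rebuilt); // atomic swap; in-flight checks keep using the old table
        });
    }

    // Placeholder types, just for this sketch
    record ClusterStateSnapshot() {}

    record DenormalizedTable() {
        static final DenormalizedTable EMPTY = new DenormalizedTable();
        static DenormalizedTable build(ClusterStateSnapshot snapshot) {
            return new DenormalizedTable();
        }
    }
}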
What Else?
In this article, we have presented an overview of the strategies and techniques we used to achieve huge performance gains for OpenSearch action processing. By the way: from the outside, users should experience no changes at all - except, hopefully, a snappier OpenSearch.
There’s one detail which can be seen in the OpenSearch logs: Whenever a user issues a request for which they have insufficient privileges, the new implementation writes a nicely formatted matrix into its internal logs showing which indices are missing which privileges. It looks like this:
          | indices:data/read/search |
index_a11 | ok                       |
index_a12 | MISSING                  |
index_a13 | MISSING                  |
index_a14 | MISSING                  |
We have not covered the improvements for DLS/FLS yet. There will be a follow-up article on this.
But this is still not the end of the story. We have identified further optimization potential and already have initial concepts to exploit it. These include:
- An optimized index resolution algorithm: Action requests which contain index patterns must be resolved against the existing indices before privilege evaluation. At the moment, this algorithm requires many iterations and sometimes even performs unnecessary resolutions. This can be optimized.
- Optimization potential through the use of an immutable user object: Immutability is a great concept to simplify many programming problems. Immutable objects can be used concurrently without locks or other synchronization mechanisms. Additionally, an immutable user object also has an immutable serialized representation - a fact which can be used to achieve performance gains in inter-node communication. A small sketch follows below.
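For example, a Java record gives an immutable user object almost for free (a sketch, not the actual OpenSearch class):

import java.util.Set;

// Immutable: no setters, contents fixed at construction time. Safe to share between
// threads without locks; the serialized representation can be computed once and cached.
public record User(String name, Set<String> effectiveRoles) {
    public User {
        effectiveRoles = Set.copyOf(effectiveRoles); // defensive, immutable copy
    }
}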
So much for the optimized privilege evaluation. If you have any thoughts or questions, do not hesitate to contact us!