Efficiently Computing Permissions at Scale—Our Engineering Approach

Eugene Nelou

Eugene is a staff engineer at GitGuardian. He likes to build tools that benefit everyone and work to improve the quality of GitGuardian products.

A few weeks ago, we introduced a new role-based access control (RBAC) feature in the GitGuardian Internal Monitoring platform. This release is the result of several months of hard work during which we had to thoroughly revise our data model and implement a very resource-efficient permission calculation mechanism. I thought this was the perfect opportunity to offer a deep dive into the research, the issues, and the dead ends we encountered on this journey.

Disclaimer: I will use Django in my code examples, but the ideas can be generalized. A relational database, however, is a much stronger requirement.

I. Define the problem

In a nutshell, the RBAC functionality creates the notion of “teams”, a perimeter where each member can see and act on a limited number of incidents. In our domain, an incident is a logical unit corresponding to a unique leaked secret. Since a secret can leak in multiple repositories, we call occurrences the different locations of this secret in one or more repositories. A set of repositories defines a team, so a user belonging to a team can act on any secret detected one or more times on any of these repositories.

Since an incident can have two occurrences held by two different teams, our first conceptual problem was: how should incidents be distributed between teams?

💡

Note: Attaching repositories directly to a team is by no means the only possibility. We might decide, for example, to assign an entire GitHub organization to a team so that repositories subsequently created in that organization are automatically added to the team scope. But this implementation is outside the scope of this article, so we’ll assume we have a direct link between Teams and Repositories.

But that’s not all. We also needed to make it possible to grant a user or a team access to a particular incident. A user therefore has their own perimeter: the union of the perimeters of their teams and of the incidents they have been given access to individually.

Here is a visualization to help you grasp the relationships between these concepts:

Class diagram for our models

Finally, knowing which incidents fall within the user’s scope was only half the story; what we ultimately wanted to know was what the user could do with them. Here are the three permission levels:

  • READ: the user can see the incident
  • WRITE: the user can act on the incident: ignore, assign, solve, etc.
  • ADMIN: the user can share the incident with other users and teams, adding it to their scope.

These permissions can, again, be inherited from the team the user is a member of or directly assigned. For a given incident, the user’s permission is therefore the maximum level of authorization granted by these two means. And this maximum permission must be calculated dynamically (on the fly).

Why we didn’t go for the simple solution

A simple solution would be to persist a table of per-user permissions. But it would be very difficult to maintain. Why? Let’s say a user is removed from a team. Incidents for which they had inherited permissions through that team may no longer be in their scope. Therefore, the permissions on all of the team’s incidents would have to be recalculated to check whether the user has lost access or had their permission reduced on each of them.

Going with a per-user permission table would mean an order of magnitude more operations to keep all user permissions up to date.

Since we wanted to keep table operations as synchronous as possible, we added the permission fields across three relationships to spread the workload:

  • the User-Incident relationship
  • the Team-Incident relationship
  • the User-Team relationship

After doing some research, we decided to calculate these permissions in SQL. Not relying on per-user permissions also meant we couldn’t rely on common Django permission libraries (including django.contrib.auth), all of which are object-based.

In the table below, we count the number of rows impacted by each new event (new incident, new repository added to a team’s scope, etc.). We can see that the per-user, per-object solution scales linearly with the number of users in a team, and we don’t want the size of our teams to be limited:

Event                             | # of UserIncident rows affected             | # of TeamIncident rows affected
new incident                      | # of teams × # of team users                | # of teams
new repository in the team        | # of repository incidents × # of team users | # of repository incidents
new user in the team              | # of team incidents                         | 0
new team incident (direct access) | # of team users                             | 1
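
To get a feel for the orders of magnitude in the table above, here is a small arithmetic sketch. The function name and the sizes (3 teams, 200 users per team, and so on) are hypothetical and chosen only for illustration; they are not GitGuardian figures.

```python
def rows_written(teams: int, team_users: int,
                 team_incidents: int, repo_incidents: int) -> dict:
    """Rows written per event, as (per-user table, per-team table)."""
    return {
        "new incident": (teams * team_users, teams),
        "new repository in the team": (repo_incidents * team_users, repo_incidents),
        "new user in the team": (team_incidents, 0),
        "new team incident (direct access)": (team_users, 1),
    }


# Hypothetical sizes, chosen only to illustrate the difference in scale.
costs = rows_written(teams=3, team_users=200, team_incidents=1000, repo_incidents=50)
for event, (per_user, per_team) in costs.items():
    print(f"{event}: {per_user} per-user rows vs {per_team} per-team rows")
```

With these made-up numbers, adding one repository to a team would touch 10,000 rows in a per-user table against 50 rows in a per-team table.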

Although we dismissed the User-Incident relationship early on as the ultimate source of truth, we had to use per-object permissions for the Team-Incident relationship. This choice was motivated by performance reasons: the read operation through the Repository and Occurrence tables was too slow, and we assumed that the number of teams would be lower than the number of users.

II. How our model works

A simple trick: use binary masks

Once we had defined the permission specifications, we needed to figure out how to store them in our database. I mentioned three permission levels, but it was obvious that we would need to add many more in the future to allow for more granularity in business domain roles. To avoid having many Boolean fields and to simplify the permission checking logic, we preferred to store the permissions in their binary representation. Through the use of binary masks, we can store all permissions in a single Integer field.

💡 How to check permissions stored as a bitmask
Let’s say we have two resources, A and B, and READ and WRITE permissions.
We will store this in four bits, two per resource. Let’s assume for simplicity that WRITE implies READ.
Case for A:

  • 0b0011 is the WRITE: A permission
  • 0b0001 is the READ: A permission

Case for B:

  • 0b1100 is the WRITE: B permission
  • 0b0100 is the READ: B permission

and, obviously, 0b0000 means no permission at all.

With a bitwise OR we can combine permissions: for example, 0b0111 encodes the WRITE: A and READ: B permissions. Conversely, to check a permission, just apply a bitwise AND between the permission mask and the binary value of the field.

So to check whether a user has the WRITE: A permission, we compute 0b0011 & user_permission. The result is equal to the mask only if the user has the permission:

  • 0b1111 & 0b0011 = 0b0011 → OK
  • 0b0111 & 0b0011 = 0b0011 → OK
  • 0b1101 & 0b0011 = 0b0001 → denied
  • 0b0000 & 0b0011 = 0b0000 → denied
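
The checks listed above can be reproduced in a few lines of plain Python. This is only a sketch: the constant names and the has_permission helper are made up for the example and do not come from our codebase.

```python
# Illustrative masks for two resources, A and B (two bits each).
READ_A = 0b0001
WRITE_A = 0b0011  # WRITE implies READ, so the READ bit is set too
READ_B = 0b0100
WRITE_B = 0b1100


def has_permission(granted: int, mask: int) -> bool:
    """Return True only if every bit of `mask` is set in `granted`."""
    return (granted & mask) == mask


# Permissions are combined with a bitwise OR.
granted = WRITE_A | READ_B  # 0b0111

print(has_permission(granted, WRITE_A))  # True
print(has_permission(granted, READ_B))   # True
print(has_permission(granted, WRITE_B))  # False: 0b0111 & 0b1100 == 0b0100
```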

To implement this in Django, we used an IntegerChoices class, along with a simple helper to check permissions in our Python code.

from django.db import models

class Permission(models.IntegerChoices):
    READ = 0b001
    WRITE = 0b011
    ADMIN = 0b111

    @classmethod
    def is_authorized(
        cls, mask: "Permission", scope: "int | Permission"
    ) -> bool:
        """
        Given a required mask and a granted scope, return True
        only if every bit of the mask is set in the scope.
        ex: scope 0b100, mask 0b110: 0b100 & 0b110 = 0b100 != 0b110 -> False
        """
        return bool((scope & mask) == mask)
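
To see how the helper above behaves, here is a standalone sketch that runs without Django. It assumes a plain enum.IntEnum as a stand-in for models.IntegerChoices (which is itself built on IntEnum) and a module-level function instead of the classmethod.

```python
from enum import IntEnum


class Permission(IntEnum):
    # Stand-in for the IntegerChoices class above (assumption: plain
    # IntEnum, so that the sketch runs without Django installed).
    READ = 0b001
    WRITE = 0b011
    ADMIN = 0b111


def is_authorized(mask: Permission, scope: int) -> bool:
    """Same check as the classmethod: all bits of mask are set in scope."""
    return (scope & mask) == mask


print(is_authorized(Permission.READ, Permission.WRITE))  # True
print(is_authorized(Permission.WRITE, Permission.READ))  # False
print(is_authorized(Permission.ADMIN, 0b000))            # False
```

Because each level contains all the bits of the level below it, the check behaves like a "greater than or equal" comparison for these three values; that would no longer hold once unrelated permission bits are added.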

Django models

Now that we know the relationships between our objects and where to store the permissions we need, we can implement it with Django models.

Assuming we are using the default Django User model, here are our models:

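from django.contrib.auth.models import User
from django.db.models import (
    ForeignKey, ManyToManyField, PositiveSmallIntegerField, TextField,
)

class TeamUser(models.Model):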
    team = ForeignKey("Team", ...)
    user = ForeignKey("User", ...)
    permission = PositiveSmallIntegerField(default=Permission.READ)

class Team(models.Model):
    name = TextField(...)
    users = ManyToManyField("User", through="TeamUser", ...)

class TeamIncident(models.Model):
    team = ForeignKey("Team", ...)
    incident = ForeignKey("Incident", ...)
    permission = PositiveSmallIntegerField(default=Permission.READ)

class UserIncident(models.Model):
    user = ForeignKey("User", ...)
    incident = ForeignKey("Incident", ...)
    permission = PositiveSmallIntegerField(default=Permission.READ)

class Incident(models.Model):
    name = TextField(...)
    teams = ManyToManyField(Team, through="TeamIncident", ...)
    users = ManyToManyField(User, through="UserIncident", ...)

Simple enough, let’s get to the use cases.

Filter incidents for a user

First of all, getting all the incidents of a user, or all the users with access to an incident, is simple: the mere existence of a relationship row implies READ permission, so we don’t have to check the permissions. We can do the following:

from django.db.models import Q

# list the incidents of a user
Incident.objects.filter(Q(users=user) | Q(teams__users=user)).distinct()

# list the users having access to an incident
User.objects.filter(Q(incidents=incident) | Q(teams__incidents=incident)).distinct()

💡

Distinct is necessary because a user may have access to an incident through several rows (for example, through two teams, or through a team and a direct grant).

The query could be done via subqueries instead. In practice, we take advantage of the fact that we already have access to the user’s teams to simplify it.
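
Here is a rough in-memory illustration of why the same incident can come back several times. The tuples below are hypothetical stand-ins for the TeamUser, TeamIncident and UserIncident rows, and the list concatenation is a simplification of what the SQL joins actually produce.

```python
# Hypothetical rows: (user, team), (team, incident id), (user, incident id).
team_users = [("alice", "backend"), ("alice", "security")]
team_incidents = [("backend", 1), ("security", 1), ("security", 2)]
user_incidents = [("alice", 3), ("alice", 1)]


def incidents_of(user: str) -> list:
    """Collect incident ids reachable through teams, then directly."""
    via_teams = [
        incident
        for (member, team) in team_users if member == user
        for (owner, incident) in team_incidents if owner == team
    ]
    direct = [incident for (grantee, incident) in user_incidents if grantee == user]
    return via_teams + direct


raw = incidents_of("alice")
print(raw)               # [1, 1, 2, 3, 1]
print(sorted(set(raw)))  # [1, 2, 3]
```

Incident 1 is reachable through both teams and through a direct grant, so it shows up three times until duplicates are removed.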

After checking which incidents to display to a user, we want to know what permissions they have on those incidents to know what actions they are allowed to do.

Let’s stay with three levels of authorization:

  • 0b001 is READ, which allows the user to see the incident
  • 0b011 is WRITE (implying READ), which allows the user to act on the incident
  • 0b111 is ADMIN (implying READ + WRITE), which allows the user to grant access to the incident to other users and teams.

And of course, 0b000 means no permission at all.

Let’s write the Django query for this, constructing the user_permission annotation that will contain the user’s aggregate permission on each incident.

A user’s permission within a team is the lower of two permissions (computed with a bitwise AND): the team’s permission on the incident and the user’s permission in the team:

F("team_incident__permission").bitand(F("team_incident__team__team_user__permission"))

# and filter the relation by
queryset.filter(team_incident__team__team_user__user=user)

And a user’s permission across multiple teams is the highest permission (computed with a bitwise OR) over all teams:

BitOr(
    F("team_incident__permission").bitand(F("team_incident__team__team_user__permission")),
    output_field=PositiveSmallIntegerField(),
)

But the user can also access incidents individually, so we will use Coalesce(..., 0), which replaces null values with 0, our null permission, when the user does not have access through teams or individually. Otherwise, we couldn’t apply our bitwise operation (NULL is not a binary value, and in SQL a bitwise operation with NULL yields NULL).

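from django.contrib.postgres.aggregates import BitOr  # PostgreSQL-specific aggregate
from django.db.models import F, PositiveSmallIntegerField
from django.db.models.functions import Coalesce

user_permission_expression = Coalesce(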
    BitOr(
        F("team_incident__permission").bitand(F("team_incident__team__team_user__permission")),
        output_field=PositiveSmallIntegerField(),
    ),
    0,
).bitor(Coalesce(F("user_incident__permission"), 0))

Finally, we filter the queryset for our user:

queryset = Incident.objects.filter(
    Q(user_incident__user=user) | Q(team_incident__team__team_user__user=user)
).annotate(user_permission=user_permission_expression).distinct()
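
To make the AND/OR combination above concrete, here is a plain-Python reference of the same computation for a single incident. It is only a sketch of the logic, not the code we run: the function name and its arguments are hypothetical.

```python
from functools import reduce
from operator import or_

READ, WRITE, ADMIN = 0b001, 0b011, 0b111


def effective_permission(team_grants, direct_grant=None) -> int:
    """Combine team-inherited and direct permissions on one incident.

    team_grants: (team's permission on the incident, user's permission
                 in that team) pairs, one per team the user belongs to.
    direct_grant: the user's individual permission, or None if absent.
    """
    # AND inside each team, OR across teams; 0 plays the role of Coalesce(..., 0).
    via_teams = reduce(or_, (ti & tu for ti, tu in team_grants), 0)
    return via_teams | (direct_grant or 0)


# Team 1 has ADMIN on the incident but the user is only READ in it;
# team 2 has WRITE on the incident and the user is ADMIN in it.
print(bin(effective_permission([(ADMIN, READ), (WRITE, ADMIN)])))  # 0b11
print(bin(effective_permission([], direct_grant=READ)))            # 0b1
print(bin(effective_permission([])))                               # 0b0
```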

Filtering a queryset by permission

We have everything we need, but with our binary logic it’s not yet practical to fetch all of a user’s objects on which they have a given level of permission.

We could create a custom queryset filter, but let’s make something more reusable: let’s define a custom lookup that implements the Permission.is_authorized method directly in SQL:

from django.db.models import Field, Lookup

class IsAuthorized(Lookup):
    """
    GIVEN a mask and a scope
    Return true if the scope matches the mask
    ex: 0b100 & 0b110 = 0b100 != 0b110
    """

    lookup_name = "isauthorized"

    def as_sql(self, compiler, connection):
        lhs, lhs_params = self.process_lhs(compiler, connection)
        rhs, rhs_params = self.process_rhs(compiler, connection)
        params = lhs_params + rhs_params + rhs_params

        # The binary operation happens here
        return "%s & %s = %s" % (lhs, rhs, rhs), params

Field.register_lookup(IsAuthorized)

# usage, assuming the of_user queryset method annotates the user_permission
Model.objects.of_user().filter(user_permission__isauthorized=Permission.WRITE)
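
The SQL predicate produced by this lookup can be tried outside Django. The sketch below uses the standard library sqlite3 module and a hypothetical incident table where user_permission is a plain column (in the real query it is an annotation); note that the mask is passed twice, just like rhs_params in as_sql.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE incident (name TEXT, user_permission INTEGER)")
conn.executemany(
    "INSERT INTO incident VALUES (?, ?)",
    [("read-only", 0b001), ("writable", 0b011), ("admin", 0b111), ("hidden", 0b000)],
)

WRITE = 0b011
# Same shape as the lookup's output: lhs & rhs = rhs (parentheses added for clarity).
rows = conn.execute(
    "SELECT name FROM incident WHERE (user_permission & ?) = ? ORDER BY name",
    (WRITE, WRITE),
).fetchall()
names = [name for (name,) in rows]
print(names)  # ['admin', 'writable']
```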

It’s important to note that while our calculation of incident permissions works in all cases, we shouldn’t forget about shortcuts.

For example, the Manager role grants access to all incidents, so it doesn’t make sense to calculate the permissions for it. Similarly, the “All Incidents Team” provides access to all incidents in the organization, allowing us to skip the scope calculation.

Also, in paginated endpoints, we only have to calculate the permissions on the page we want to return!

Wrapping up!

Setting up the Teams feature was far from straightforward, and I know we’re not the first engineering team to face this kind of challenge. This required careful thought about the data models we use and how to implement the feature with the least possible impact on both performance and the rest of the application. In the end, I think it was a very good exercise and we learned a lot of things that we can apply to other parts of our code.

It’s time for our next challenge!

*** This is a syndicated blog from the Security Bloggers Network of the GitGuardian Blog – Automated Secret Detection written by Guardians. Read the original post at: https://blog.gitguardian.com/effectively-computing-permissions-at-scale-our-engineering-approach/
