I've been working on rewriting some of the queries in Max's brain to work with the new database structure, and it's looking like it's going to require a rewrite of most of the queries. At this point, I'm trying to figure out how best to submit the changes here.
Couple of options I'm seeing:
- Drop support for OG bloodhound, only support CE
- Add some manner of specifying (or detecting?) which version of BH the database is based on
  - This has the side effect of making max even bigger and harder to maintain, since it's basically two code bases in one
- Maintain a separate branch in the same repo (i.e. main for legacy and main-ce for bloodhound-ce support)
- Hard fork to a separate repo for CE support
- Complete rewrite to try and make better use of the new database schemas / features (rather than just replacing the existing queries with as close to like-for-like as possible)
- Try to implement some kind of "make BH-CE compatible with Max.py" function, and keep the rest of the existing code base.
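To make the "specify which version" option a bit more concrete, here is a minimal sketch of what keeping both query sets side by side could look like. The `QUERIES` table, the `get_query` helper, and the `"legacy"` / `"ce"` labels are all hypothetical names for illustration; none of this is how Max is currently structured.

```python
# Hypothetical layout: every query exists once per schema flavour.
QUERIES = {
    "owned_users": {
        "legacy": "MATCH (u:User {owned: true}) RETURN u",
        "ce": (
            "MATCH (u:User) "
            "WHERE COALESCE(u.system_tags, '') CONTAINS 'owned' "
            "RETURN u"
        ),
    },
}

SCHEMAS = ("legacy", "ce")


def get_query(name, schema):
    """Pick the Cypher text for `name` that matches the chosen schema."""
    if schema not in SCHEMAS:
        raise ValueError(f"unknown schema {schema!r}, expected one of {SCHEMAS}")
    return QUERIES[name][schema]
```

Every entry needs two variants, which is the maintenance cost mentioned in that option.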
Looking around, doesn't seem to be a super clear "standard" approach. Posing this issue here to see what the feel is @knavesec
Technical details
Ok, so for some (I'm sure quite smart and not annoying at all) reason, BH no longer uses simple things like .high_value or .owned as clear attributes on a given node. In other words, queries like MATCH (u:User {owned:True}) no longer work. Instead, those attributes are shoved into a single space-delimited string field called system_tags. Basically, it maps like this:
n.high_value = True -> n.system_tags = "admin_tier_0"
n.owned = True -> n.system_tags = "owned"
n.owned = True, n.high_value = True -> n.system_tags = "admin_tier_0 owned"
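The mapping can be written out as a small Python helper. This is only a sketch of the three cases listed; the function name is made up, and the tag order (admin_tier_0 before owned) is just what the example above shows, not something confirmed to be guaranteed by BH-CE.

```python
def legacy_flags_to_system_tags(high_value=False, owned=False):
    """Build the space-delimited system_tags string for the legacy booleans."""
    tags = []
    if high_value:
        tags.append("admin_tier_0")
    if owned:
        tags.append("owned")
    # With neither flag set this returns "", whereas in the database the
    # property may simply be missing.
    return " ".join(tags)
```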
This, of course, makes queries a lot harder. Using the built-in queries in BH-CE as a reference, it looks like n.system_tags isn't even guaranteed to exist, and since it's a string attribute, not a boolean attribute, just calling CONTAINS isn't enough. Therefore, the basic structure of a simple query changes. (Note the addition of COALESCE to ensure that there's an empty string if n.system_tags doesn't exist.)
Old:
MATCH (u:User {owned: true}) RETURN u
New:
MATCH (u:User) WHERE COALESCE(u.system_tags, '') CONTAINS 'owned' RETURN u
This, of course, makes queries gross to write, and gets gross fast.
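One way to keep that from spreading everywhere is to generate the predicate in one place. The sketch below assumes the queries are built as Python strings; `tag_filter` and `owned_users_query` are hypothetical helpers, not existing Max functions, and the tag is interpolated directly, so it should only ever be a trusted constant.

```python
def tag_filter(var, tag):
    """Cypher predicate: does <var>.system_tags contain <tag>?

    COALESCE covers nodes where system_tags does not exist. Note that
    CONTAINS is a substring match, so the tag is not matched as a whole word.
    """
    return f"COALESCE({var}.system_tags, '') CONTAINS '{tag}'"


def owned_users_query():
    """The 'New' owned-users query, assembled from the shared predicate."""
    return f"MATCH (u:User) WHERE {tag_filter('u', 'owned')} RETURN u"
```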
This gets even grosser when we want to start modifying things. For example, to mark something as owned (as with --mark-owned), we can no longer just do
MATCH (u:User {username: "JOHN.DOE@EXAMPLE.COM"}) SET u.owned = true
Instead, we have to do something like
MATCH (u:User {username: "JOHN.DOE@EXAMPLE.COM"})
SET u.system_tags = (
  CASE WHEN COALESCE(u.system_tags, "") CONTAINS "owned" THEN u.system_tags ELSE trim(COALESCE(u.system_tags, "") + " owned")
  END
)
To break that down:
- Match the user
- Use a CASE to check if the user is already owned
  - If so, leave as is
  - Otherwise:
    - COALESCE with an empty string so we have a non-null value
    - Append owned with a leading space, in case something already exists
      - This prevents `u.system_tags = "admin_tier_0owned"`
    - TRIM the string, in case it doesn't already have some tag
      - This prevents `u.system_tags = " owned"`
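The same steps can be mirrored in plain Python, which makes the edge cases easy to check without a database. This is a sketch of the logic only, not code from Max; `add_tag` is a hypothetical name, and `None` stands in for a missing system_tags property.

```python
def add_tag(system_tags, tag="owned"):
    """Mirror of the CASE / COALESCE / trim expression used to mark a node."""
    current = system_tags if system_tags is not None else ""  # COALESCE(..., "")
    if tag in current:  # CONTAINS: substring match, same as the Cypher
        return system_tags  # already tagged, leave as is
    # Append with a leading space, then strip it again if nothing was there.
    return (current + " " + tag).strip()
```

Running it twice on the same value gives the same result, which is the property the CASE check is there to provide.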
The point
The point of these examples is that it's not quite as simple as pointing max at a new DB. I'm still working on getting all the queries rewritten and optimized, but it feels like we'll have to make some change here to keep max useful, since bloodhound legacy isn't getting updates anymore.
Open to ideas, thoughts, whatever. Max is a good boy, I don't want to put him down because someone changed the DB schema
Just to give some personal feedback: I've almost stopped using bloodhound legacy, because the CE version is mostly on par with the legacy version feature-wise and is better suited to multi-project use (using https://github.com/Tanguy-Boisset/bloodhound-automation).
I would say that dropping support for legacy bloodhound is fine. Maintaining the two branches seems like a lot of (mostly useless) effort.