Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cachefilter plugin needs improvements #1072

Closed
Namoshek opened this issue Nov 9, 2016 · 12 comments
Closed

cachefilter plugin needs improvements #1072

Namoshek opened this issue Nov 9, 2016 · 12 comments

Comments

@Namoshek
Copy link
Contributor

Namoshek commented Nov 9, 2016

As discussed in #829, the cachefilter plugin needs some changes.

Expected Behavior

The cachefilter plugin should always ensure that:

  • kdbGet() calls deliver what the user expects (i.e. the keys for the parentKey), no matter how often and on what kdbGet() was called before.
  • kdbSet() adds all cached keys to the keyset before allowing the backends to write them, so that no keys get lost.

@markus2330 is this also how you would think it should work?

Questions

  • How to deal with kdbGet() calls that fetch keys from multiple mountpoints? Use one cache or multiple (how to know what to put where)?
    E.g. mountpoints dir/test/mp1 and dir/test/mp2 and two successive calls kdbGet(ks1, "dir/test/mp1/something") and kdbGet(ks2, "dir/test/mp2/something").
  • What to add to the keyset in kdbSet() if keys of two mountpoints have been requested before?
    Ad. previous example: The cache will at this point contain all keys of the mountpoints that are not in the /something directory. When using kdbSet(ks1, "dir/test/mp1/something"), do we also need to (or even may) add the keys of dir/test/mp2 to the keyset (without breaking anything)?
  • How to allow for deletion of keys?
    • If the cache contains everything not returned by kdbGet(), successive kdbGet() calls on the same or child-keys won't return anything.
    • If the cache contains everything that all kdbGet() calls so far received from kdb, updating the cache is not possible in kdbSet() without knowning what previous kdbGet() calls returned and to what of those calls the kdbSet() call belongs to.
@Namoshek
Copy link
Contributor Author

Namoshek commented Nov 9, 2016

I'm testing my implementation with my REST service. The problem I face is that I'm using two basic mountpoints, namely dir/users and dir/configs. If I use the same KDB connection for both, it will delete the whole dir/users mountpoint content whenever I store a snippet or the other way round, even though everything is added to kdbSet(). I guess I can't fix this behavior for multiple mountpoints?

Example

Keys in use

dir/users/Namoshek
dir/configs/some/example/config/path
dir/configs/some/other/config/path

On start of service

the two mountpoints

dir/users
dir/configs

are read with kdbGet(), which will place all above keys in the cache.

During run-time

dir/configs/some/example/config/path

is successfully fetched (+ updated) and merged with the cache during kdbSet(), but then dir/users gets deleted although the entire cache gets returned from kdbSet().

@Namoshek
Copy link
Contributor Author

I somehow have the feeling the plugin doesn't get called at all for successive kdbGet() calls. Can this have to do with the placement of postgetstorage?

@markus2330
Copy link
Contributor

Expected Behavior

I am afraid that more informal descriptions do not really bring us further. It looks good what you write, but it is vague. The next step are test cases which describe the desired behavior.

How to deal with kdbGet() calls that fetch keys from multiple mountpoints? Use one cache or multiple (how to know what to put where)?

I do not so a reason why you cannot be agnostic to the actual mountpoints. Did something not work with one cache and always adding all keys?

What to add to the keyset in kdbSet() if keys of two mountpoints have been requested before?

It should be safe to add more keys in unrelated backends (see splitBuildup: backends outside of parentKey are not in Split and thus irrelevant for further logic).

it will delete the whole dir/users mountpoint

Cannot reproduce, please open a separate issue for this with commands/test case to reproduce it. I tried:

$ kdb mount a.dump /a
$ kdb mount b.dump /b
$ kdb set /a/valueable_data
$ kdb set /b/valueable_data
$ kdb shell
> kdbGet /a
return value: 1
> kdbGet /b
return value: 1
> keySetName user/b/newkey
> ksAppendKey
> kdbSet /b  
return value: 1
$ kdb ls /a
user/a/valueable_data
$ kdb ls /b
user/b/newkey
user/b/valueable_data

I somehow have the feeling the plugin doesn't get called at all for successive kdbGet() calls. Can this have to do with the placement of postgetstorage?

Yes, postgetstorage is only called if kdbGet() has something to do. You need postgetstorage/deinit (which should be always called). @mpranj Is postgetstorage/deinit already available in your repo? Can you create a PR?

The problem is in src/libs/elektra/kdb.c line 760: If no update is needed, kdbGet immediately exits without calling POSTGETSTORAGE/POSTGETCLEANUP hooks.
@Namoshek As workaround you can copy the lines 828-831 before the line 760. So something like (but ask @mpranj before adding to a PR, it might create conflicts):

diff --git a/src/libs/elektra/kdb.c b/src/libs/elektra/kdb.c
index 852f97c..81c5f48 100644
--- a/src/libs/elektra/kdb.c
+++ b/src/libs/elektra/kdb.c
@@ -752,6 +752,11 @@ int kdbGet (KDB * handle, KeySet * ks, Key * parentKey)
        switch (elektraGetCheckUpdateNeeded (split, parentKey))
        {
        case 0: // We don't need an update so let's do nothing
+               if (handle && handle->globalPlugins[POSTGETSTORAGE])
+               {
+                       handle->globalPlugins[POSTGETSTORAGE]->kdbGet (handle->globalPlugins[POSTGETSTORAGE], ks, parentKey);
+               }
+
                keySetName (parentKey, keyName (initialParent));
                splitUpdateFileName (split, handle, parentKey);
                keyDel (initialParent);

@Namoshek
Copy link
Contributor Author

Did something not work with one cache and always adding all keys?

Obivously nothing works because of the wrong placement, but as I had this idea pretty late I tried a lot of things that did not work before already. But I guess using one cache should work if I use following strategy:

  • kdbGet:
    • append the returned keyset to the cache (brings cache up to date)
    • set returned to ksCut (cache, parentKey) but also re-add the cut keys to the cache for successive kdbGet() calls
  • kdbSet:
    • cut the parentKey out of the cache and add the returned keyset to the cache (allows for deletes)
    • put the whole cache into the returned keyset

Cannot reproduce, please open a separate issue for this with commands/test case to reproduce it. I tried:

Forget about it; has also to do with the not correctly working cache.

@mpranj Is postgetstorage/deinit already available in your repo? Can you create a PR?

That would be great, yes.

@Namoshek
Copy link
Contributor Author

I tried it with your suggestion in kdb.c, but it doesn't seem to work either. The cachefilter plugin is listed below system/elektra/globalplugins/postcommit/user/plugins/#1, which seems to be wrong? But the placement in the README is correct...

@markus2330
Copy link
Contributor

With this information it is hard to tell what the cause is, can you open an PR demonstrating the issue?

It might be:

  1. plugin not added correctly (did you try kdb global-mount)?
  2. README.md updated without recompiling (cmake is not called automatically when README.md are changed). I added cmake: changes README.md do not trigger rebuild #1075 for this.
  3. You are in some error state.

Did you try to run with logger/debugger?

@Namoshek
Copy link
Contributor Author

Namoshek commented Nov 10, 2016

Yes, I tried everything mentioned above as far as I can tell. I can create a demo PR tomorrow.

README.md updated without recompiling

Can't be the case actually as I've never changed it and it is already part of the master branch.

@mpranj
Copy link
Member

mpranj commented Nov 11, 2016

I could create a PR with postgetstorage/deinit, yes.

@Namoshek
Copy link
Contributor Author

Will you make a PR in near future @mpranj or shall I try to change the hooks for my purpose on my own? Personally I don't really need the change, but it would improve performance and I could probably get rid of my own cache implementation. But this change can be done later as well.

@mpranj
Copy link
Member

mpranj commented Nov 14, 2016

I'm still writing a test plugin to ensure that it works properly.

If you're in a hurry, I could make a PR without the tests.

@Namoshek
Copy link
Contributor Author

Namoshek commented Nov 15, 2016

I don't know what exactly you want to test, but maybe my plugin could be (partially) used for tests as well?

@markus2330
Copy link
Contributor

Closed due to removal of cachefilter plugin

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants