New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
High level C-API error handling #1359
Comments
Thank you for your detailed analysis! A short answer for now, longer later:
Unfortunately that is not really the case because possible errors are not only dependent on the API functions. If the user has build-in spec keys, he/she has the guarantee to not get an error for getters like
Can you elaborate on which advantage this has? (Except of the kdbhlOpen case: but actually we should never fail to open except in completely broken installations, and maybe not even there, see below).
Of course the API needs to be extensible, but we need to decide about which paradigm we want to have now. I like the idea to collect everything that happened within the KDBHL handle. Then even sloppy API users would not miss any error.
Yes, why not do that? With build-in spec-keys an application will work correctly even if Btw. we should also consider better names than @tom-wa Can you also share your thoughts? |
Great if it can be guaranteed that getters won't fail, then we can just remove the error parameter and return the int instead. But what if a plugin fails (can a plugin be removed/changed at runtime)? What if the passed in key, e.g. "/myKey1" does not exist and there is no default value defined in the spec? What would be the return value then?
Separation of concerns and thread safety. If there is a KDBHL handle, as a user I would expect it to be a valid one. Why would the handle store an error? With a separate error object, the KDBHL handle could potentially be used on different threads (in future versions) which would be impossible if the handle stored one error only (for all threads).
Honestly, I think extensibility is one of the things that make an API great. ;-) So we should think about what would be possible and what would not be possible if we implemented version 1.0 of the API in a certain way.
What do you mean by "build-in spec keys"? If it is impossible that kdbOpen fails then we can remove the error parameter from kdbOpen and always return a handle of course. But is it really impossible? In doc/decisions/high_level_api.md you stated as comment to kdbOpen: "// might fail, you need to check for error afterwards!".
What if we just rename KDB to KDBCore (or similar...) and KDBHL to KDB? I think this might make sense, since as you mentioned, KDBHL should be the default API that people will use (instead of the current KDB). As a user I would expect the default API to be named like that. |
Obviously the spec should contain a
If KDBHL is not thread safe, there is no way to use the API concurrently and the separated error does not yield a benefit.
In some sense it is a valid one, we could even try to reopen on the next
We should define concurrency paradigm/semantics now. If we want to do it in a thread-safe way, there is no reason to not do it now. Then we do not need a handle at all and could use global variables. (I would not recommend to do that, but it is like it is currently done in lcdproc) Btw. it is not impossible to have a concurrent KDBHL even if the error is stored within KDBHL: It could be stored in a thread-local variable. But again: Such semantics cannot be changed later, we need to fixate them now (or till release).
Yes, absolutely, I fully agree! (Thus I clarified my statement later) But changing threading semantics/paradigms is not about extensibility. Either we make it multi-thread safe or not. (Reason: Unfortunately, it is in general not possible to go from a sequential model to a concurrent one and many decisions depend on if we make the API concurrent.)
If it is impossible or not needs to be defined by us; it is not something given. The idea of this comment is to have a spec-checker integrated within
For me it is yet unclear if there is such a thing as a default API or if we need to provide multiple high-level APIs for different use cases. (like easy-to-use vs. multi-core-efficient; and key/value vs. structured data access). |
Thanks for the discussion! I think for a first version it is best if we try to detect all issues at open, then we can assume that the other get-API calls do not yield errors. |
How do we handle the "key doesn't exit" case? |
Do you mean "exist"? In the specification? Or not at all (ksLookup fails)? In languages without exceptions it is always difficult what to do if someone uses the API wrongly. You are right that there is some danger that some module nobody knows about uses some unspecified configuration settings. We should deal with such issues, but there is no straightforward way:
Code generation is best, it does not have this issue. |
As a user I would like to be able to use three types of keys:
Some thoughts on them (please comment): Type 1For 1 we could use code generation. Additionally to the get and set functions we could also generate a reset function for each key that sets the value back to the default and a isDefault function for each key to check if the current value for the key is the default value. This way we could avoid error handling in those functions altogether. This has the drawback that the user must specify a default value for each key or at least a default value for each type. Alternatively we could pre-define a (language independent?) default value for each type. But let's suppose an application wants to lazily calculate an initial (default) value: if (someSettingIsDefault ())
{
char * initialValue = ... // Calculate the initial value (could also be user-defined).
setSomeSetting (initialValue);
}
char * currentValue = getSomeSetting (); If the calculated initialValue is equal to the default value, the application would calculate the initialValue again and again on each run. Additionally we could generate a function someSettingIsSet (or similar) which could be used like this: if (!someSettingIsSet ())
{
char * initialValue = ... // Calculate the initial value (could also be user-defined).
setSomeSetting (initialValue);
}
char * currentValue = getSomeSetting (); The advantage would be, that it is guaranteed that someSettingIsSet is only true until once setSomeSetting has been called. But what if getSomeSetting is called before setSomeSetting? Type 2Is 2 possible with Elektra in the current version (with contextual values?)? To generate proper code, the keys scheme would have to be known at compile time (and should not be changeable afterwards?). The code generator could then generate functions like "int getSomeSetting(char * name)" and "void setSomeSetting(int value)". Type 3For 3 we can provide typed getters and setters with a parameter for the keys' name. In this case we cannot guarantee that a key with the given name and the correct type (return type of the getter) exists, so there must be a way to handle that error. TLDRIn summary, we cannot use code generation for all three types and even for type 1 there is an edge case. And we still have to think about how we handle errors for arbitrary keys. |
If we know for sure that no error can happen inside the getter (unloaded or failing plugin, etc.), then there is another alternative: int kdbhlHasBoolean (KDBHL * kdbhl, const char * name);
// Unsafe, aborts if key not found.
kdb_boolean_t kdbhlGetBoolean (KDBHL * kdbhl, const char * name); This leave it up to the developer. If he knows for sure that the key exists, then he can omit kdbhlHasBoolean and use the return value of kdbhlGetBoolean directly. |
I did not read/think into all details of your proposal, but nevertheless a short answer for today: I discarded the idea to do calculations or validations within the generated code. So the code generation is exclusively for being sure that the code base and the specification does not co-evolute. (Avoiding names as strings, avoiding calling the wrong API,...) So for being sure that the specification is always available, we would simply generate the C/C++ code for the KeySet of the specification and use this KeySet if we are unable to do
No, we do not know for sure. But it does not matter. On any error we simply use the built-in |
Btw. I am sorry for only giving half answers, but I am really busy the next few days. |
We could use code generated functions for "type 1" keys (keys for which we have a specification) and "generic" getters/setters with an error parameter as proposed for keys which are generated at run-time. In most of the cases applications will use pre-defined keys and therefore code generated functions anyway, I guess. For "type 3" keys we then have a safe API, in the sense that the developer is encouraged by the function header to handle possible errors or explicitly ignore them (by passing NULL as error pointer).
Would this imply a default value that is returned as default value is also returned when the actual value cannot be read because of a malfunction? |
Yes, sounds good to have a generated and a generic API!
Yes, exactly! Of course projects can also decide to abort at program start. |
As promised a longer answer:
We already have been there, it was not a good choice.
Its quite expensive and I do not know if it is reliable, e.g. for out-of-memory.
We could fix the places currently using
It is not really handy to use.
Yes, that is certainly a good choice.
Yes, but that is minor. More problematic is if someone does not check errors where it should have been done. So ideally only a single function will return errors, and all others not. (see below)
If the handle is not thread safe, this option will not be more thread safe.
Yes, that is very nice.
This might be a misfeature and encourage users to have latent errors.
Yes, but are there any? Any idea where we want to have asynchronous functions?
Here are some more details on the duality that the same API function sometimes throws errors, but sometimes is guaranteed not to: Ideally, the user wants to get config values without having to care about errors. And this is possible like getenv demonstrates. It has, however, some assumptions:
Most of these assumptions can be guaranteed using code generation:
The only thing the user can do wrong (as long as we do not exit the program ourselves, which we should not), is to ignore errors at startup.
We can reduce it to (1), because:in (2) the placeholders can moved within the API: the app requests
We always have some cases of API misuse, but yes, I agree the getters should be usable without error handling.
We could simply default to 0, 0.0 and false if no other
Applications should not try to do that. They should use the specification for computation of configuration settings. When they do something internally, they break introspectiveness.
Alternative name: elektraIsDefault I am not convinced if this is needed/a good idea. We should rather focus to always give applications valid values.
Alternative names: elektraIsPresent, elektraExist? Yes, we likely need that for some rare situations where absence of configuration has special meaning (and no default was given on purpose). Afaik we do not need it for lcdproc.
Unless we return a variant (maybe even including the "nothing is there" situation).
Actually only the developer knows for sure. We never know if the API was used correctly. So I would argue for:
(2)-(4) can be guaranteed by code generation, and maybe even by careful programmers. And obviously also our API needs to be very carefully programmed: in the default case we cannot call anything that might fail (including memory allocations!)
Yes, then (2)-(4) is fulfilled.
Yes, except when (1)-(4) is fulfilled, then error checking can be omitted. This is the reason why I would prefer no extra error parameter. There is a similar situation for arrays. When the user carefully iterates from 0 to arrayLength-1, no error can happen. But called with another index we obviously have an error.
Yes, many applications are very similar to lcdproc: nearly everything standard and some small extras.
I do not like the possibility to ignore errors. This could lead to confusing bug reports, thus the real or first source of an error is not caught:
In conclusion:
I updated doc/decisions/high_level_api.md. Some minor points are still open. |
|
Here are some possible ways of error handling in the high level C-API with some pros and cons.
Error handling variants
printf ("number /myKey is %d\n", kdbhlGetInt (kdbhl, "/myKey"));
if (foo()) { // ok } else { // error }
whether the function succeeded. However, he doesn't get information (apart from a possible error code) about what exactly went wrong (maybe a certain plugin failed to transform the data, maybe the key was wrong, etc.).For the reasons stated below each variant, the variants 1, 2 and 3 in my opinion do not qualify for a high level API. Variant 4 might meet the basic requirements, but from a usability point of view it is not very satisfying (due to the lack of information about what exactly caused the error).
Here are two usage examples for 5 and 6 from the users' perspective:
Example (variant 5)
Example (variant 6)
There are several advantages of the latter example (variant 6) over the former (variant 5).
Hence, I propose to implement variant 6 as error handling mechanism in the high level C-API. One drawback could be that the user can leak memory by forgetting to call kdbhlErrorDel(error) after an error occurs. We could try to mitigate this in a future API version by keeping a list of thrown errors within the KDBHL handle (implemented as linked list) and calling kdbhlErrorDel on each of them in the course of kdbhlClose. However, this would require kdbhlOpen to always return a KDBHL handle (as in variant 5) which could be seen as a slight inconsistency but would make kdbhlErrorDel obsolete while still being compatible to older API versions that required kdbhlErrorDel.
The text was updated successfully, but these errors were encountered: