Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

High level C-API error handling #1359

Closed
domhof opened this issue Feb 20, 2017 · 14 comments
Closed

High level C-API error handling #1359

domhof opened this issue Feb 20, 2017 · 14 comments
Projects

Comments

@domhof
Copy link
Contributor

domhof commented Feb 20, 2017

Here are some possible ways of error handling in the high level C-API with some pros and cons.

Error handling variants

  1. Standard errno-Pattern / global error buffer
    • Not thread safe (in C11 though, errno is thread local).
    • The user has to reset errno to zero before each function call.
    • Only documentation can state whether the function can set an errno or not (it is not obvious from the function header alone).
  2. Non-local Jumps (setjmp) to simulate try-catch behavior
    • Rather obscure way of implementing exceptions for C.
    • Not very common and also hard to get it right (bug free, possibly platform independent) implementation wise.
  3. Return a special value indicating an error, return actual value otherwise.
    • For certain types (e.g. int) we loose one valid value.
    • The value indicating the error is return type dependent for each function.
    • It is easy to forget to handle the special return value and instead use the returned value directly. E.g. printf ("number /myKey is %d\n", kdbhlGetInt (kdbhl, "/myKey"));
  4. Return error code as function result, store actual function result in variable passed as argument.
    • The user can easily check with if (foo()) { // ok } else { // error } whether the function succeeded. However, he doesn't get information (apart from a possible error code) about what exactly went wrong (maybe a certain plugin failed to transform the data, maybe the key was wrong, etc.).
  5. Store error-struct in the KDBHL handle. Functions return a boolean value indicating if the function succeeded. If the function fails, it sets the error in the KDBHL handle and returns -1. Otherwise the result is stored in a variable passed as argument.
    • To be thread safe, the handle must not be shared across threads.
    • The user doesn't know from the function header whether a function sets the error variable in the handler or not. This information can only be provided as documentation.
    • As a consequence one might check the wrong error if he expects a function to set the error variable when in fact it doesn't.
    • As with the errno-pattern, the user must reset the error-struct before each function call otherwise he might mistakenly interpret a function as failed while it actually succeeded.
  6. Like 5. but the error is not stored in the KDBHL handle, but is stored in a variable passed as argument as well.
    • Thread safe since the error variable is local.
    • The user has to declare the local error variable before calling the function.
    • The function header explicitly states that an error can be thrown. Therefor it is clear to the user without having to look into the documentation.
    • The user can reuse the error variable if he wants to, or he can store it for later use and use a separate one for subsequent function calls.
    • The user doesn't have to handle the error immediately.

For the reasons stated below each variant, the variants 1, 2 and 3 in my opinion do not qualify for a high level API. Variant 4 might meet the basic requirements, but from a usability point of view it is not very satisfying (due to the lack of information about what exactly caused the error).

Here are two usage examples for 5 and 6 from the users' perspective:

Example (variant 5)

KDBHL * kdbhl = kdbhlOpen ("/sw/elektra/kdb/#0/current");
if (kdbhlHasError (kdbhl))
{
    const char * errorMessage  = kdbhlGetErrorMessage (kdbhl);
    printf ("An error occurred: %s\n", errorMessage);
    kdbhlClose (kdbhl); // Required to free memory allocated by the internal error struct.
    return;
}

int value;

// Get int value for myKey1.
if (kdbhlGetInt (kdbhl, "/myKey1", &value)) 
{
    printf ("number /myKey1 is %d\n", value);
} 
else
{
    const char * errorMessage = kdbhlGetErrorMessage (kdbhl);
    printf ("An error occurred: %s\n", errorMessage);
    kdbhlClear (kdbhl);
}

// Get int value for myKey2.
if (kdbhlGetInt (kdbhl, "/myKey2", &value)) 
{
    printf ("number /myKey2 is %d\n", value);
} 
else
{
    const char * errorMessage = kdbhlGetErrorMessage (kdbhl);
    printf ("An error occurred: %s\n", errorMessage);
    kdbhlClear (kdbhl);
}

kdbhlClose (kdbhl);

Example (variant 6)

KDBHLError * error = NULL;

KDBHL * kdbhl;
if (!kdbhlOpen ("/sw/elektra/kdb/#0/current", &kdbhl, &error))
{
    printf ("An error occurred: %s\n", error->msg);
    kdbhlErrorDel (error);
    return;
}

int value;

// Get int value for myKey1.
if (kdbhlGetInt (kdbhl, "/myKey1", &value, &error))
{
    printf ("number /myKey1 is %d\n", value);
} 
else 
{
    printf ("An error occurred: %s\n", error->msg);
    kdbhlErrorDel (error);
}

// Get int value for myKey2.
if (kdbhlGetInt (kdbhl, "/myKey2", &value, &error))
{
    printf ("number /myKey2 is %d\n", value);
} 
else 
{
    printf ("An error occurred: %s\n", error->msg);
    kdbhlErrorDel (error);
}

kdbhlClose (kdbhl);

There are several advantages of the latter example (variant 6) over the former (variant 5).

  • The function headers clearly state whether a function can throw an error or not.
  • The error handling is done consistently throughout the API.
  • If kdbhlOpen fails, no KDBHL handler has to be created, therefore it also doesn't have to be cleared in the error case (like it has to be in variant 4).
  • The error object is independent from the KDBHL handle.
  • If there are asynchronous functions in the API, the user can simply pass different KDBHLError objects to the functions and gets thread safe error handling (which would not be possible with one central error object per KDBHL handle).
  • The user can explicitly ignore errors by passing NULL as error address to the function (which is not possible if a central error object is used).

Hence, I propose to implement variant 6 as error handling mechanism in the high level C-API. One drawback could be that the user can leak memory by forgetting to call kdbhlErrorDel(error) after an error occurs. We could try to mitigate this in a future API version by keeping a list of thrown errors within the KDBHL handle (implemented as linked list) and calling kdbhlErrorDel on each of them in the course of kdbhlClose. However, this would require kdbhlOpen to always return a KDBHL handle (as in variant 5) which could be seen as a slight inconsistency but would make kdbhlErrorDel obsolete while still being compatible to older API versions that required kdbhlErrorDel.

@markus2330
Copy link
Contributor

markus2330 commented Feb 20, 2017

Thank you for your detailed analysis!

A short answer for now, longer later:

The function headers clearly state whether a function can throw an error or not.

Unfortunately that is not really the case because possible errors are not only dependent on the API functions. If the user has build-in spec keys, he/she has the guarantee to not get an error for getters like kdbhlGetInt. And I would suggest that applications use this guarantee! Thus I suggested to have the int directly as return value.

The error object is independent from the KDBHL handle.

Can you elaborate on which advantage this has? (Except of the kdbhlOpen case: but actually we should never fail to open except in completely broken installations, and maybe not even there, see below).

We could try to mitigate this in a future API version by keeping a list of thrown errors within the KDBHL handle

Of course the API needs to be extensible, but we need to decide about which paradigm we want to have now. I like the idea to collect everything that happened within the KDBHL handle. Then even sloppy API users would not miss any error.

However, this would require kdbhlOpen to always return a KDBHL handle

Yes, why not do that? With build-in spec-keys an application will work correctly even if KDB is completely corrupted.

Btw. we should also consider better names than KDBHL before this name spreads over everywhere. Any suggestions?

@tom-wa Can you also share your thoughts?

@domhof
Copy link
Contributor Author

domhof commented Feb 20, 2017

Unfortunately that is not really the case because possible errors are not only dependent on the API functions. If the user has build-in spec keys, he/she has the guarantee to not get an error for getters like kdbhlGetInt. And I would suggest that applications use this guarantee! Thus I suggested to have the int directly as return value.

Great if it can be guaranteed that getters won't fail, then we can just remove the error parameter and return the int instead. But what if a plugin fails (can a plugin be removed/changed at runtime)? What if the passed in key, e.g. "/myKey1" does not exist and there is no default value defined in the spec? What would be the return value then?

Can you elaborate on which advantage this has? (Except of the kdbhlOpen case: but actually we should never fail to open except in completely broken installations, and maybe not even there, see below).

Separation of concerns and thread safety. If there is a KDBHL handle, as a user I would expect it to be a valid one. Why would the handle store an error? With a separate error object, the KDBHL handle could potentially be used on different threads (in future versions) which would be impossible if the handle stored one error only (for all threads).

We should not think of future APIs when designing an API now ;) It is a good idea to collect everything that happened within the KDBHL handle. Then even sloppy API users would not miss any error.

Honestly, I think extensibility is one of the things that make an API great. ;-) So we should think about what would be possible and what would not be possible if we implemented version 1.0 of the API in a certain way.

Yes, why not do that? With build-in spec-keys an application will work correctly even if KDB is completely corrupted.

What do you mean by "build-in spec keys"? If it is impossible that kdbOpen fails then we can remove the error parameter from kdbOpen and always return a handle of course. But is it really impossible? In doc/decisions/high_level_api.md you stated as comment to kdbOpen: "// might fail, you need to check for error afterwards!".

Btw. we should also consider better names than KDBHL before this name spreads over everywhere. Any suggestions?

What if we just rename KDB to KDBCore (or similar...) and KDBHL to KDB? I think this might make sense, since as you mentioned, KDBHL should be the default API that people will use (instead of the current KDB). As a user I would expect the default API to be named like that.

@markus2330
Copy link
Contributor

But what if a plugin fails (can a plugin be removed/changed at runtime)?

kdbGet should also not fail under normal circumstances, but of course it can happen when the plugins do things they fail in.

What if the passed in key, e.g. "/myKey1" does not exist and there is no default value defined in the spec? What would be the return value then?

Obviously the spec should contain a default or have some calculation rules. We should provide a tool that checks if specs are correct.

Separation of concerns and thread safety.

If KDBHL is not thread safe, there is no way to use the API concurrently and the separated error does not yield a benefit.

If there is a KDBHL handle, as a user I would expect it to be a valid one.

In some sense it is a valid one, we could even try to reopen on the next kdbGet. (But I cannot really think of issues that have a good chance of repairing itself.)

Why would the handle store an error? With a separate error object, the KDBHL handle could potentially be used on different threads (in future versions) which would be impossible if the handle stored one error only (for all threads).

We should define concurrency paradigm/semantics now. If we want to do it in a thread-safe way, there is no reason to not do it now. Then we do not need a handle at all and could use global variables. (I would not recommend to do that, but it is like it is currently done in lcdproc)

Btw. it is not impossible to have a concurrent KDBHL even if the error is stored within KDBHL: It could be stored in a thread-local variable. But again: Such semantics cannot be changed later, we need to fixate them now (or till release).

Honestly, I think extensibility is one of the things that make an API great. ;-) So we should think about what would be possible and what would not be possible if we implemented version 1.0 of the API in a certain way.

Yes, absolutely, I fully agree! (Thus I clarified my statement later) But changing threading semantics/paradigms is not about extensibility. Either we make it multi-thread safe or not. (Reason: Unfortunately, it is in general not possible to go from a sequential model to a concurrent one and many decisions depend on if we make the API concurrent.)

But is it really impossible? In doc/decisions/high_level_api.md you stated as comment to kdbOpen: "// might fail, you need to check for error afterwards!".

If it is impossible or not needs to be defined by us; it is not something given.

The idea of this comment is to have a spec-checker integrated within kdbOpen. So if kdbGet of spec fails, or the config is invalid, the user will get an error. It is a good idea to fail early, which is a very strong argument to also have kdbOpen fail on invalid configurations/specifications.

default API

For me it is yet unclear if there is such a thing as a default API or if we need to provide multiple high-level APIs for different use cases. (like easy-to-use vs. multi-core-efficient; and key/value vs. structured data access).

@markus2330
Copy link
Contributor

Thanks for the discussion!

I think for a first version it is best if we try to detect all issues at open, then we can assume that the other get-API calls do not yield errors.

@domhof
Copy link
Contributor Author

domhof commented Feb 26, 2017

I think for a first version it is best if we try to detect all issues at open, then we can assume that the other get-API calls do not yield errors.

How do we handle the "key doesn't exit" case?

@markus2330
Copy link
Contributor

Do you mean "exist"? In the specification? Or not at all (ksLookup fails)?

In languages without exceptions it is always difficult what to do if someone uses the API wrongly.

You are right that there is some danger that some module nobody knows about uses some unspecified configuration settings. We should deal with such issues, but there is no straightforward way:

  • having a default parameter to be used in this case (+a checker if a key exists), problem: duplicated info in spec+code
  • returning an error in get. Problem: all 500 getters would need an if to check for the error. We would simply move the problem to the application -> not so nice.
  • aborting the process from a library is also not nice
  • remember that there was an error and continue to the next error checkpoint? quite risky
  • sending a signal to the process could work for lcdproc?

Code generation is best, it does not have this issue.

@domhof
Copy link
Contributor Author

domhof commented Feb 26, 2017

As a user I would like to be able to use three types of keys:

  1. A key with a predefined name where the name and the value-type are known at compile time.
  2. A key with a name calculated according to a scheme, e.g. "/sw/org/myApp/user/%username%/someSetting" where the type of the value can be specified in the spec and is known at compile time.
  3. A key with a user-defined (or otherwise calculated) name and type (both are not known at compile time).

Some thoughts on them (please comment):

Type 1

For 1 we could use code generation. Additionally to the get and set functions we could also generate a reset function for each key that sets the value back to the default and a isDefault function for each key to check if the current value for the key is the default value. This way we could avoid error handling in those functions altogether. This has the drawback that the user must specify a default value for each key or at least a default value for each type. Alternatively we could pre-define a (language independent?) default value for each type.

But let's suppose an application wants to lazily calculate an initial (default) value:

if (someSettingIsDefault ()) 
{
  char * initialValue = ... // Calculate the initial value (could also be user-defined).
  setSomeSetting (initialValue);
}

char * currentValue = getSomeSetting ();

If the calculated initialValue is equal to the default value, the application would calculate the initialValue again and again on each run.

Additionally we could generate a function someSettingIsSet (or similar) which could be used like this:

if (!someSettingIsSet ()) 
{
  char * initialValue = ... // Calculate the initial value (could also be user-defined).
  setSomeSetting (initialValue);
}

char * currentValue = getSomeSetting ();

The advantage would be, that it is guaranteed that someSettingIsSet is only true until once setSomeSetting has been called. But what if getSomeSetting is called before setSomeSetting?

Type 2

Is 2 possible with Elektra in the current version (with contextual values?)? To generate proper code, the keys scheme would have to be known at compile time (and should not be changeable afterwards?). The code generator could then generate functions like "int getSomeSetting(char * name)" and "void setSomeSetting(int value)".

Type 3

For 3 we can provide typed getters and setters with a parameter for the keys' name. In this case we cannot guarantee that a key with the given name and the correct type (return type of the getter) exists, so there must be a way to handle that error.

TLDR

In summary, we cannot use code generation for all three types and even for type 1 there is an edge case. And we still have to think about how we handle errors for arbitrary keys.

@domhof
Copy link
Contributor Author

domhof commented Feb 26, 2017

If we know for sure that no error can happen inside the getter (unloaded or failing plugin, etc.), then there is another alternative:

int kdbhlHasBoolean (KDBHL * kdbhl, const char * name);

// Unsafe, aborts if key not found.
kdb_boolean_t kdbhlGetBoolean (KDBHL * kdbhl, const char * name);

This leave it up to the developer. If he knows for sure that the key exists, then he can omit kdbhlHasBoolean and use the return value of kdbhlGetBoolean directly.

@markus2330
Copy link
Contributor

I did not read/think into all details of your proposal, but nevertheless a short answer for today:

I discarded the idea to do calculations or validations within the generated code.
This would make the behavior of applications using the code generator different to applications not using the code generator (such as the kdb-tool). And many benchmarks showed the same result: these calculations/validation at run-time hardly have any influence.

So the code generation is exclusively for being sure that the code base and the specification does not co-evolute. (Avoiding names as strings, avoiding calling the wrong API,...)

So for being sure that the specification is always available, we would simply generate the C/C++ code for the KeySet of the specification and use this KeySet if we are unable to do kdbGet or kdbOpen. This approach also works for applications that use the simple (not code-generated API).

If we know for sure that no error can happen inside the getter

No, we do not know for sure. But it does not matter. On any error we simply use the built-in KeySet.

@markus2330
Copy link
Contributor

Btw. I am sorry for only giving half answers, but I am really busy the next few days.

@domhof
Copy link
Contributor Author

domhof commented Feb 26, 2017

So the code generation is exclusively for being sure that the code base and the specification does not co-evolute. (Avoiding names as strings, avoiding calling the wrong API,...)

We could use code generated functions for "type 1" keys (keys for which we have a specification) and "generic" getters/setters with an error parameter as proposed for keys which are generated at run-time. In most of the cases applications will use pre-defined keys and therefore code generated functions anyway, I guess. For "type 3" keys we then have a safe API, in the sense that the developer is encouraged by the function header to handle possible errors or explicitly ignore them (by passing NULL as error pointer).

No, we do not know for sure. But it does not matter. On any error we simply use the built-in KeySet.

Would this imply a default value that is returned as default value is also returned when the actual value cannot be read because of a malfunction?

@markus2330
Copy link
Contributor

Yes, sounds good to have a generated and a generic API!

Would this imply a default value that is returned as default value is also returned when the actual value cannot be read because of a malfunction?

Yes, exactly! Of course projects can also decide to abort at program start.

@markus2330 markus2330 added this to to discuss in lcdproc Mar 1, 2017
markus2330 pushed a commit that referenced this issue Mar 5, 2017
@markus2330
Copy link
Contributor

As promised a longer answer:

errno-Pattern / global error buffer

We already have been there, it was not a good choice.

Non-local Jumps (setjmp) to simulate try-catch behavior

Its quite expensive and I do not know if it is reliable, e.g. for out-of-memory.

For certain types (e.g. int) we loose one valid value.

We could fix the places currently using ssize_t to only use one value and not a bit.

Return error code as function result

It is not really handy to use.

Store error-struct in the KDBHL handle.

Yes, that is certainly a good choice.

This information can only be provided as documentation. As a consequence one might check the wrong error if he expects a function to set the error variable when in fact it doesn't.

Yes, but that is minor. More problematic is if someone does not check errors where it should have been done. So ideally only a single function will return errors, and all others not. (see below)

stored in a variable passed as argument: Thread safe since the error variable is local.

If the handle is not thread safe, this option will not be more thread safe.

The function header explicitly states that an error can be thrown. Therefor it is clear to the user without having to look into the documentation.

Yes, that is very nice.

The user doesn't have to handle the error immediately.

This might be a misfeature and encourage users to have latent errors.

If there are asynchronous functions in the API

Yes, but are there any? Any idea where we want to have asynchronous functions?

How do we handle the "key doesn't exit" case?

Here are some more details on the duality that the same API function sometimes throws errors, but sometimes is guaranteed not to:

Ideally, the user wants to get config values without having to care about errors. And this is possible like getenv demonstrates. It has, however, some assumptions:

  • We setup everything at program start and on errors we already fail then
  • We either can return null, or are guaranteed to have a default value

Most of these assumptions can be guaranteed using code generation:

  • If we are unable to open/get keys, we use the ones built-in
  • If no key is present, we use the default one (from the specification)

The only thing the user can do wrong (as long as we do not exit the program ourselves, which we should not), is to ignore errors at startup.

As a user I would like to be able to use three types of keys.

We can reduce it to (1), because:in (2) the placeholders can moved within the API: the app requests someSetting and the %username% is already substituted internally: The application does not need to care about it. In (3) we use variants as return type.

This way we could avoid error handling in those functions altogether.

We always have some cases of API misuse, but yes, I agree the getters should be usable without error handling.

This has the drawback that the user must specify a default value for each key or at least a default value for each type. Alternatively we could pre-define a (language independent?) default value for each type.

We could simply default to 0, 0.0 and false if no other default is given. But I would prefer to have defaults explicitly written. We could give an error during code generation if no default was given.

But let's suppose an application wants to lazily calculate an initial (default) value

Applications should not try to do that. They should use the specification for computation of configuration settings. When they do something internally, they break introspectiveness. kdb get should always return what the application would get, and this obviously won't work if the application internally calculates or modifies values.

someSettingIsDefault

Alternative name: elektraIsDefault

I am not convinced if this is needed/a good idea. We should rather focus to always give applications valid values.

someSettingIsSet

Alternative names: elektraIsPresent, elektraExist?

Yes, we likely need that for some rare situations where absence of configuration has special meaning (and no default was given on purpose). Afaik we do not need it for lcdproc.

For 3 we can provide typed getters and setters with a parameter for the keys' name. In this case we cannot guarantee that a key with the given name and the correct type (return type of the getter) exists, so there must be a way to handle that error.

Unless we return a variant (maybe even including the "nothing is there" situation).

If we know for sure that no error can happen inside the getter

Actually only the developer knows for sure. We never know if the API was used correctly. So I would argue for:

Elektra * handle = elektraOpen ("myappname");
if (elektraHasError(handle))
{
   // user should exit here!
}
elektraGetLong(handle, "name");
// if user knows that (1) he exited before, (2) "name" is part of the specification, (3) it has type long, and (4) it has an default, then no error handling is needed

(2)-(4) can be guaranteed by code generation, and maybe even by careful programmers.
For (1) we need careful programmers.

And obviously also our API needs to be very carefully programmed: in the default case we cannot call anything that might fail (including memory allocations!)

We could use code generated functions for "type 1" keys (keys for which we have a specification)

Yes, then (2)-(4) is fulfilled.

and "generic" getters/setters with an error parameter as proposed for keys which are generated at run-time.

Yes, except when (1)-(4) is fulfilled, then error checking can be omitted.

This is the reason why I would prefer no extra error parameter.

There is a similar situation for arrays. When the user carefully iterates from 0 to arrayLength-1, no error can happen. But called with another index we obviously have an error.

In most of the cases applications will use pre-defined keys and therefore code generated functions anyway, I guess.

Yes, many applications are very similar to lcdproc: nearly everything standard and some small extras.

For "type 3" keys we then have a safe API, in the sense that the developer is encouraged by the function header to handle possible errors or explicitly ignore them (by passing NULL as error pointer).

I do not like the possibility to ignore errors. This could lead to confusing bug reports, thus the real or first source of an error is not caught:

for (int i=0; i<=elektraArraySize(handle, "name", error); ++i)
{
   elektraArrayLong (handle, "name", i, 0); // no error field passed
}
if (hasError(error))
// ..

In conclusion:

  • I would avoid an extra parameter, let us pass errors via handle
  • Lets give the users a variety of guarantees for not yielding errors if the user did everything correctly (or used code generation)

I updated doc/decisions/high_level_api.md. Some minor points are still open.

@markus2330
Copy link
Contributor

markus2330 commented Mar 16, 2017

  • we might add hasKey hasDefault (exists) method (if needed)
  • goal: no errors in specified keys with default
  • keys without default: fail if key does not exist at startup
  • keys always need type explicitly (type reconstruction maybe later)
  • in case of getting invalid types or non-existing keys, we call exit() (calling other function pointer maybe later)
  • extra error parameter

@markus2330 markus2330 moved this from to discuss to to implement in lcdproc Mar 16, 2017
markus2330 pushed a commit that referenced this issue Mar 17, 2017
@markus2330 markus2330 moved this from to implement to implemented in lcdproc Jul 6, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
lcdproc
implemented
Development

No branches or pull requests

2 participants