Skip to content

EVENTLOOP: ensure exclusive Gambit thread exists. #418

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 23 commits into
base: master
Choose a base branch
from

Conversation

0-8-15
Copy link
Contributor

@0-8-15 0-8-15 commented Apr 29, 2021

At 2020-08-07 Marc Feeley wrote that the situation how lambdanative
uses Gambit (at least on Android) is not supported by Gambit, he
wrote:

Any number of Scheme threads can call C functions, but only one
Scheme thread at a time can do a Scheme to C call that calls back to
Scheme. This is because the C stack is shared by all the Scheme to
C calls... if the Scheme threads T1 and T2 call C (in the order T1
then T2) and if T1 returns to Scheme while T2's call is still in
progress, then T2's C stack frame will be removed when T1's C stack
frame is removed (because T2's frame was added after T1's frame).

Under Android both the UI thread and the OpenGL thread call into
Gambit; all bets are off. Race conditions are not avoidable. This
seems to explain random crashes I observed eventually to the point
that I could reproduce them by tapping the screen fast enough.

The only reasonable escape I could conceive was to use Gambit as Marc
intented: one single thread running contineously.

The upshot is: for Android and X11 based systems running Gambit from a
dedicated thread avoids jitter and event loop related delays for
Gambit threads. In practice this work much smoother.

Status

This change will certainly break some applications:

  1. At least those, where race conditions are exacerbated to sure
    conflicts.
  2. A batch of untested modifications required for compatible with this
    patch is half prepared already. Notably the 'serial' module is
    unlikely to work out of the box.
  3. I assume that those other X11 based system should mostly work. Non
    tested, non available here.

This change avoids many C-to-Scheme calls, which where not properly
protected and runs the event loop in a pthread of its own for X11 and
Android. Under win32 it still calls the previos way, but used
EVENT_IDLE to better wait with the Gambit thread system than block
threads in Windows.

For Android the change includes a new NativeGLSurfaceView and
supporting GLState class. These work so far more reliable than the
version before. But the are premature. Frequently the android app
will come up first time after installation but not again. Forced stop
of the app and restart helps. Further restarts magically work.

Tested are so far on Android, Linux, Windows.

0-8-15 added 15 commits April 29, 2021 15:11
should not copy seconds to nanoseconds and vice versa
Also:
- add another wrapper for glTranslatef

  This one allows to feed it an f32vector, no checks.  Useful for
  upcomming, though unrelated, improvements.

- add wrapper and macros around texture handling
TBD: rename to something like `copy-to-clipboard!` for consistency
with general Scheme naming conventions.
At 2020-08-07 Marc Feeley wrote that the situation how lambdanative
uses Gambit (at least on Android) is not supported by Gambit, he
wrote:

    Any number of Scheme threads can call C functions, but only one
    Scheme thread at a time can do a Scheme to C call that calls back
    to Scheme.  This is because the C stack is shared by all the
    Scheme to C calls... if the Scheme threads T1 and T2 call C (in
    the order T1 then T2) and if T1 returns to Scheme while T2's call
    is still in progress, then T2's C stack frame will be removed when
    T1's C stack frame is removed (because T2's frame was added after
    T1's frame).

Under Android both the UI thread and the OpenGL thread call into
Gambit, all bets are off.  Race conditions are not avoidable.  This
seems to explain random crashes I observed eventually to the point
that I could reproduce them by tapping the screen fast enough.

The only reasonable escape I could conceive was to use Gambit as Marc
intented: one single thread running contineously.

The upshot is: for Android and X11 based systems running Gambit from a
dedicated thread avoids jitter and event loop related delays for
Gambit threads.  In practice this work much smoother.

--- ## Status

This change will certainly break some applications:

1. At least those, where race conditions are exacerbated to sure
   conflicts.
2. A batch of untested modifications required for compatible with this
   patch is half prepared already.  Notably the 'serial' module is
   unlikely to work out of the box.
3. I assume that those other X11 based system should mostly work.  Non
   tested, non available here.

This change avoids many C-to-Scheme calls, which where not properly
protected and runs the event loop in a pthread of its own for X11 and
Android.  Under win32 it still calls the previos way, but used
EVENT_IDLE to better wait with the Gambit thread system than block
threads in Windows.

=======================================================================
For Android the change includes a new NativeGLSurfaceView and
supporting GLState class.  These work so far more reliable than the
version before.  But the are premature.  Frequently the android app
will come up first time after installation but not again.  Forced stop
of the app and restart helps.  Further restarts magically work.
=======================================================================

Tested are so far Android, Linux, Windows.
@mgorges
Copy link
Contributor

mgorges commented Apr 29, 2021

'EVENTLOOP: ensure exclusive Gambit thread exists.' is where the actual changes are, correct - everything else are duplicates of things in #417 ? Breaking serial will be a big problem for us as we rely on this. Also, I am personally concerned about breaking clinical data collection systems which use Linux (without GUI) and which will cause irrecoverable data loss if things go bad.

@0-8-15
Copy link
Contributor Author

0-8-15 commented Apr 30, 2021 via email

@0-8-15 0-8-15 mentioned this pull request Apr 30, 2021
@mgorges
Copy link
Contributor

mgorges commented May 27, 2021

Can't say that I even understand 20% of this new code but I have a few questions/comments:

  1. You disable glgui-timings-at-10msec!, which we use in a lot of apps, including all apps (including non-gui) that use the modules/scheduler/scheduler.scm module and need proper timing? The minimum as per https://github.com/part-cw/lambdanative/pull/418/files#diff-b3b574bd3e4985858199598c697225c69645bbe5420cc7a4402f8e62f5625b04R268 seems to be 0.05 sec sleeps now?
  2. As for your question where glCoreTextureReset is used - it is in glCoreInit
  3. You disable the EVENT_IDLE in event_timer_callback on win32, why? - I think win32 has this annoying behaviour if a GUI app is active but not in focus the main loop will stop running after ~30 seconds (or maybe a minute, can't recall), and when you return to it it throws lots of events and tries to catch up which messes up the output plugins in the scheduler, but this mitigated as introduced in fdc3626 ?
  4. The wrong event code (127 instead of 128) in ANDROID_JAVA_ONDESTROY in https://github.com/part-cw/lambdanative/pull/418/files#diff-2098455a1eec986f28edb431eaec9868112f3f3dc606d46435c07066363a5e11R287 might explain some strange app closing behaviour on Android. Nice catch!
  5. We still need a way to ensure all the other Android things like camera, serial, audio, etc work and I am not sure how to do that short of building this branch of LN and waiting for things to break?

@0-8-15
Copy link
Contributor Author

0-8-15 commented May 28, 2021 via email

0-8-15 added 8 commits July 25, 2021 14:53
Bad local modifications.
new procedure `startup_subst` reads subsitutions from modules and app
expanded into the C `main` procedure.

- MAIN_c_additions ::
  expanded in toplevel file

- MAIN_subcommand_defines

  expanded in initializer of subcommand structure:

      typedef struct subcommand_def {
        const char* name;
        void (*main)(int, const char*argv[]);
      } subcommand_def_t;

  example: for the `fossil` module `MAIN_subcommand_defines` contains:

      { "fossil", fossil_main },
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants