Skip to content

Add baseline glutton#243

Open
Max Smythe (maxsmythe) wants to merge 2 commits into
agent-substrate:mainfrom
maxsmythe:glutton
Open

Add baseline glutton#243
Max Smythe (maxsmythe) wants to merge 2 commits into
agent-substrate:mainfrom
maxsmythe:glutton

Conversation

@maxsmythe

@maxsmythe Max Smythe (maxsmythe) commented Jun 13, 2026

Copy link
Copy Markdown
Collaborator

Adds the glutton image to support easier load testing contemplated in #98

This is a rough implementation to help bootstrap more repeatable testing and is expected to change over time as we learn more about performance/testing requirements.

@EItanya

Copy link
Copy Markdown
Collaborator

Adds the glutton image

What does this mean?

Can you please add a longer PR description so it's clearer to potential reviewers what's happening here?

@maxsmythe

Copy link
Copy Markdown
Collaborator Author

Sorry, linked the issue to the PR, but not the PR to the issue. Updated description.

@EItanya

Copy link
Copy Markdown
Collaborator

Sorry, linked the issue to the PR, but not the PR to the issue. Updated description.

Thanks :)

@bowei Bowei Du (bowei) left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally lgtm. You might want to document in the RPC defs what each glutton action intends to do.

Comment thread cmd/testing/glutton/main.go Outdated
"sync"
"time"

"github.com/agent-substrate/substrate/internal/proto/glutton"

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: put Substrate imports on their own block at the end

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack, thanks. Fixed (once commit is pushed)

}
}

func (s *gluttonService) WriteRAM(ctx context.Context, req *glutton.WriteRAMRequest) (*glutton.WriteRAMResponse, error) {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this makes a huge difference, but this seems to do more than what I think of for writeRAM as it allocates 2x the RAM (if the argument Size is like 32 Gb or something).

Overwrite should just touch bytes but do not extra allocation?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good point. Writing out a larger array will still require swapping the buffer, but we can at least deal with writing to subsets of the buffer.

My main hope with this one is to give us the ability to more easily see the effect of different snapshotting systems.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be helpful to summarize behavior in a 1-2 sentences for the WriteRAM docstring.

continue
}
if resp.GetMessage() != msg {
slog.WarnContext(ctx, "Gossip ping returned unexpected message",

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we have a metric that we bump if glutton encounters errors?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a "outcome" tag to align with extproc.go

@bowei Bowei Du (bowei) self-assigned this Jun 17, 2026
@maxsmythe

Copy link
Copy Markdown
Collaborator Author

Thanks for the review! addressed comments and rebased.

@maxsmythe

Copy link
Copy Markdown
Collaborator Author

Missed the documentation in the protobuf comment, will integrate into automation PR

@bowei Bowei Du (bowei) left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to add typing to the Python code? I know Python types are optional, but it does quickly become a maintenance issue.

def on_start(self):
update_user_count(1, self.__class__.__name__)

target = self.api_host.replace("http://", "").replace("https://", "")

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this probably deserves a comment on what this is for

self.api_channel.close()

def _teardown_actor(self):
try:

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to track any of these teardown failures?

glutton_pb2.PingRequest(message=msg),
metadata=metadata,
)
duration = (time.time() - start_time) * 1000

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

durationMs?

spec:
runsc:
amd64:
url: "gs://gvisor/releases/nightly/2026-05-19/x86_64/runsc"

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might want a comment here as to where this reference comes from?

defer serverboot.ShutdownProvider("MeterProvider", mp.Shutdown)

if err := os.MkdirAll(*dataDir, 0o700); err != nil {
serverboot.Fatal(ctx, "Failed to create data directory", err)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add the dataDir to the error


lis, err := net.Listen("tcp", *listenAddr)
if err != nil {
serverboot.Fatal(ctx, "Failed to start listener", err)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the listenAddr in the error (for debugging)

fds []*os.File
peers map[string]*peerGossip

ramWriteBytes metric.Int64Counter

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think it would be easier to maintain if we should split off each of these kind of operations into different structs with a abstract interface? then we wouldn't have to throw all of the operation code into this one struct.

something like

type action interface {
  register()
  do()
  stop()
}

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

personal preference to also split this into different files to avoid this become a mega file.

the other thing you will want to do is to keep the package main to a minimum and put the real logic into a testable package. You can't add unit tests to the main package.

Move the code to cmd/testing/glutton/internal...

}
}

func (s *gluttonService) WriteRAM(ctx context.Context, req *glutton.WriteRAMRequest) (*glutton.WriteRAMResponse, error) {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be helpful to summarize behavior in a 1-2 sentences for the WriteRAM docstring.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants