Skip to content

Feature/tiered storage setup#1

Merged
HoJacob merged 15 commits into
feature/tiered_storagefrom
feature/tiered_storage_setup
Jun 25, 2026
Merged

Feature/tiered storage setup#1
HoJacob merged 15 commits into
feature/tiered_storagefrom
feature/tiered_storage_setup

Conversation

@HoJacob

@HoJacob HoJacob commented Jun 4, 2026

Copy link
Copy Markdown
Owner

What type of Pull Request is this? (check all applicable)

  • Refactor
  • Feature
  • Bug Fix
  • Optimization
  • Documentation Update

Describe your changes in brief

I setup the initial data structures/variables for tier storage and its test file. Worked on OpenFile and its two helper functions, also did the first test for OpenFile

Checklist

  • Tested locally
  • Added new dependencies
  • Updated documentation
  • Added tests

Related Issues

  • Related Issue #
  • Closes #

@jfantinhardesty jfantinhardesty left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work! This is a great start to OpenFile and the storage tier component. There are a few more cases to consider and some fixes to work on as well as a few tests to add.

)

var home_dir, _ = os.UserHomeDir()

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current tests are not actually running, you need to put the following in the test file:

func TestTieredStorageTestSuite(t *testing.T) {
	suite.Run(t, new(tieredStorageTestSuite))
}


// Structure defining your config parameters
type TieredStorageOptions struct {
// e.g. var1 uint32 `config:"var1"`

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing we need to add is the local path that we will cache files. Something like this:

	TmpPath      string  `config:"path"        yaml:"path,omitempty"`

return fmt.Errorf("TieredStorage: config error [invalid config attributes]")
}
// Extract values from 'conf' and store them as you wish here

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then in configure we need to setup that local path that we will save files so. So use tmpPath := conf.TmpPath and then make that directory if it doesn't already exist.

info, err := c.GetAttr(internal.GetAttrOptions{Name: options.Name})
if err != nil {
// file does not exist in cloud, return error
return nil, fmt.Errorf("file not found in cloud")

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great in terms of thinking about things we want to log for errors, and in most go code this would be correct. But here since we are a filesystem we only want to return certain error types as we will pass these errors down to the OS. So here you would want to log that "file is not found in the cloud" including the name of the file. But the error we would want to return would be the error that GetAttr returned.

Offset: 0,
Count: 0,
File: localFileHandle,
})

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here it may be easier to just defer localFileHandle.Close(), rather than track all of the if cases you need to close the file in.

})
if err != nil {
localFileHandle.Close()
os.Remove(localPath)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, os.Remove could return an error. But if that happens we won't do anything with the error so go convention tells us we should do this:

_ = os.Remove(localPath)

To indicate that we know this function returns something, and we are purposefully not doing anything with the return.

return nil
}

// First Function to work on!!

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's remove this extra comment.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sad

flock.Inc()
return handle, nil
}
//2. Check if File exists in Disk, if not check cloud

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One case that should be handled is if OpenFile is opened with O_CREATE flag. In that case we should create the file locally and open it. You can also test this by creating a new test where you just call openFile with the O_CREATE flag and then, like the test you have, check if it is in the cache.

… with no file attatched, also added some config path

@jfantinhardesty jfantinhardesty left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! Have a few more things to cleanup but the work and testing is great!

if handleCount == 0 {
//is file cloudbacked
c.mu.Lock()
node, _ := c.fileMap[options.Handle.Path]

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the file is not in the map? In this case it would cause the program to panic. It is not an expected case, but probably should handle it gracefully.

},
)
suite.assert.NoError(openErr)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To make the file handle dirty, you should do a suite.tieredStorage.WriteFile() call or something which then should set the dirty handle.

_, exists := c.fileMap[options.Name]
c.mu.Unlock()
if exists {
//skip to opening file since it should already be in local cache

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stylistically, I'm not a huge fan of an if case that does nothing. I find that it hurts readability more than it helps it.

addedFileSize := newFileSize - existingSize

//if we didn't modify the size of the file then
if addedFileSize <= 0 {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could underflow if the newFileSize is smaller.

@foodprocessor foodprocessor left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is excellent work. I flagged a few minor issues, but from a data structure perspective, this is very good.
As you solidify the design and implement the LRU, I think this will tighten up really well. And then you can work on creating distinct concurrency domains.

Comment on lines +274 to +275
if exists {
} else {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this case is handled below, you could probably remove this empty logical branch.

if !exists {
//2. Check if File exists in Disk, if not check cloud
info, err := os.Stat(filepath.Join(c.tmpPath, options.Name))
if err == nil {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the file isn't in fileMap but it exists, that's unexpected, right?
Maybe it would be good to log a warning here.

}

// openFileHelper : function to download copy from cloud and add to local cache
func (c *TieredStorage) openFileHelper(options internal.OpenFileOptions) error {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this function downloads the file, can you change the name to fit?

File: localFileHandle,
})
if err != nil {
localFileHandle.Close()

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Close() is already deferred above, so it's not needed here.

}

//2. Check if exceeds limits
newSize := options.Offset + int64(len(options.Data))

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This math might need some work (what if this write overwrites a small segment inside a large file? what if we're in append mode and the offset is meaningless?)

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm using your own code/math :)

Comment on lines +226 to +228
c.mu.Lock()
c.fileMap[options.Name] = node
c.mu.Unlock()

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This semaphore protects fileMap effectively, but having one lock for the whole component creates performance issues. Not as part of this PR, but at some point (after eviction is implemented), it would be good to evaluate what distinct concurrency domains there are in the component, and how best to protect them without causing unnecessary contention.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants