Add Datalake Files endpoints by dtkav · Pull Request #3 · fleetspace/missioncontrol_client

dtkav · 2019-03-18T06:00:15Z

This adds client-side functionality for interfacing with the missioncontrol files endpoints.
Note this requires us to land the changes into missioncontrol first.

Psykar · 2019-03-18T06:28:05Z

+        l.append(self._compressor.flush())
+        return b''.join(l)
+
+    def _calculate_hash(self):


This is used in super().__init__()
I've added the comment below to explain why we need to override it (to hash the raw file, not the gzip stream).
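
For readers following the thread, here is a minimal sketch of hashing the raw bytes while a gzip stream is built alongside. It is not the code in this PR: the function name `gzip_and_hash` is hypothetical, and it uses the standard-library `hashlib.blake2b` rather than `pyblake2`.

```python
import hashlib
import zlib


def gzip_and_hash(chunks, digest_size=16):
    """Compress an iterable of byte chunks and hash the uncompressed bytes.

    Hypothetical helper for illustration; not part of this PR's diff.
    """
    hasher = hashlib.blake2b(digest_size=digest_size)
    # wbits=31 asks zlib to emit a gzip container (header and trailer).
    compressor = zlib.compressobj(wbits=31)
    compressed = []
    for chunk in chunks:
        hasher.update(chunk)  # hash the raw bytes, before compression
        compressed.append(compressor.compress(chunk))
    compressed.append(compressor.flush())
    return b"".join(compressed), hasher.hexdigest()
```

Because the hasher only ever sees the input chunks, the digest does not depend on the compression level or on gzip header fields.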

Psykar · 2019-03-18T06:28:52Z

+
+    def _calculate_hash(self):
+        '''ensure the hash is over the raw file, not the gzip stream'''
+        b2 = blake2b(digest_size=16)


Is digest_size required here? Or just to ensure we're consistent?

Yeah, we need both a hashing algorithm and a standard way to call it; otherwise we might end up with the same file twice under different hash lengths.
Actually, I'll probably encode the hash type and length with something like pymultihash.
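
A small sketch of why the digest size has to be pinned: in BLAKE2 the digest length is a parameter of the hash itself, so a 16-byte digest is not a prefix of the default 64-byte one. The `tagged_digest` helper and its `blake2b-16:` prefix are hypothetical stand-ins for the idea of encoding the hash type and length. They are not the multihash format and not pymultihash's API.

```python
import hashlib

data = b"same file contents"

short = hashlib.blake2b(data, digest_size=16).hexdigest()
full = hashlib.blake2b(data).hexdigest()  # default digest_size is 64 bytes


def tagged_digest(payload, digest_size=16):
    """Prefix the hex digest with a hypothetical 'algorithm-size:' tag."""
    digest = hashlib.blake2b(payload, digest_size=digest_size).hexdigest()
    return "blake2b-{}:{}".format(digest_size, digest)


print(len(short), len(full))   # hex lengths differ: 32 vs 128 characters
print(full.startswith(short))  # the short digest is not a truncation
print(tagged_digest(data))
```

With a tag like this, two identifiers for the same content can at least be recognised as using different parameters instead of looking like two different files.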

Psykar · 2019-03-18T06:29:33Z

+import zlib
+import socket
+import datetime
+from pyblake2 import blake2b


Why not builtin? https://docs.python.org/3/library/hashlib.html#hashlib.blake2b

I'm relying on planetlabs/datalake, which uses pyblake2

I guess I could vendor this functionality actually and make the modifications directly.
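
If the functionality were vendored, one possible shape for the import is sketched below. It assumes `pyblake2` exposes the same `blake2b` constructor as `hashlib`, which has included it since Python 3.6.

```python
try:
    from hashlib import blake2b  # standard library on Python 3.6+
except ImportError:
    from pyblake2 import blake2b  # third-party package for older interpreters

b2 = blake2b(digest_size=16)
b2.update(b"raw file bytes")
print(b2.hexdigest())
```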

Psykar · 2019-03-18T06:32:39Z

+            start = UTC("now").iso
+
+        if where is None:
+            where = socket.gethostname()


Does this do FQDN if it can?

good point - I'll change to getfqdn
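
A sketch of the proposed change, using a hypothetical helper name `resolve_where` for the default handling shown in the diff. `socket.getfqdn()` falls back to the value of `socket.gethostname()` when no fully qualified name can be found.

```python
import socket


def resolve_where(where=None):
    """Return the caller's value, or this host's FQDN when none is given."""
    if where is None:
        # Falls back to the plain hostname if no FQDN is available.
        where = socket.getfqdn()
    return where
```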

Psykar · 2019-03-18T06:32:57Z

+
+        cid = f.metadata["hash"]
+
+        fleetmeta = {


Fleet specific?

I'm inheriting from planetlabs/datalake File class, but we've changed our metadata structure, so this converts to the "fleet" version of metadata.

Ideally we'd fork/enhance this library with metadata 2.0 and distribute it, but using it as is provides a lot of value without much effort in the near term.

Ah right.
At this point I don't actually see what this File class adds for us. Can you explain a bit?

Yeah - I'm trying to leverage as much of the datalake infrastructure as possible, as it includes lessons learned and features developed over several years.
For example, datalake files have a tar bundle format and an inotify-based auto-upload service. If we can re-use a lot of this work, we won't have to rewrite it from scratch.

Except currently we're just doing a POST directly anyway?
I'm wondering what specifically this PR uses from the File class of datalake.

Maybe the answer is 'nothing yet'?

I think the mismatch is that I was overriding the methods used in this diff to get streaming gzip working.
I've since moved that into a datalake fork. I think it's worth factoring out the metadata and tooling as it's a bit more complex than just normal REST api stuff.
Ideally this library would be very lean and not doing too much magic.

dtkav requested a review from Psykar March 18, 2019 06:00

Psykar requested changes Mar 18, 2019

dtkav force-pushed the files_api branch 4 times, most recently from 36689b5 to c19b4db March 19, 2019 06:42

dtkav changed the title ~~wip: files endpoints~~ Add Datalake Files endpoints Mar 20, 2019

dtkav force-pushed the files_api branch from 0af4b5c to 921bea4 March 20, 2019 07:08

Add methods for the /api/v0/files/ endpoints

ad10087

dtkav force-pushed the files_api branch from 921bea4 to ad10087 March 20, 2019 07:09

dtkav requested a review from Psykar March 20, 2019 07:13
