...
Rationale: Storing instructions on the file system as static JSON files moves state information away from the central repository of information about a job, confuses users, and does not work well with dynamic instances or state queries.
Furthermore, the file system is an unencapsulated, stateful location with stateful permissions and strong uniqueness constraints. Together, these create a volatile environment that imposes strong restrictions on the logic surrounding instructions.
In practice, this means that instructions must be uniquely identified by a name (the file name), the running process must have permission to move the files, and there must be some guarantee that the files won’t move while the process is running (otherwise a failure may occur when the process looks for them to finish up). All in all, keeping instructions on the file system imposes business logic and creates unneeded complexity.
Overall Design
Goals
→ Decouple the webservice from the file system by having the webservice talk to a scheduler
→ Making the scheduler and database aware of the ‘job’ being requested allows for more dynamic processing
→ Dynamic processing and Postgres storage allow more real-time status updates, facilitating updates to the Job Status redesign
Steps
Phase 1: Proof of Concept of Direct Instruction Injection
In this phase, we investigate how difficult it is to modify the webservice layer so that it immediately kicks off a processing thread, much as BERT currently processes jobs directly. This includes passing the job data directly instead of through a JSON file.
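As a rough illustration of direct instruction injection, the sketch below hands the job data to a worker thread as an in-memory object rather than writing it to a file. Every name here (`JobRequest`, `process_job`, `handle_request`) is hypothetical and not taken from the webservice or from BERT, and the processing step is only a placeholder.

```python
import threading
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class JobRequest:
    """Hypothetical in-memory stand-in for the contents of an instruction file."""
    job_id: str
    instructions: Dict[str, Any] = field(default_factory=dict)


def process_job(job: JobRequest, results: Dict[str, str]) -> None:
    """Placeholder for the real processing step; it only records a status."""
    results[job.job_id] = f"processed {len(job.instructions)} instruction(s)"


def handle_request(payload: Dict[str, Any], results: Dict[str, str]) -> threading.Thread:
    """Hypothetical webservice handler: builds the job and starts a thread.

    The job data is passed to the thread as an argument, so nothing is
    written to or read from the file system.
    """
    job = JobRequest(job_id=payload["job_id"],
                     instructions=payload.get("instructions", {}))
    worker = threading.Thread(target=process_job, args=(job, results),
                              name=f"job-{job.job_id}")
    worker.start()
    return worker
```

Because the job is identified by `job_id` rather than by a file name, the uniqueness and permission constraints of the file system no longer apply to it.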
Phase 2: Scheduler/Load Balancer
Once the Phase 1 prototype is complete, the difficulties of adding a scheduler can be researched relatively easily. During this phase, we hope to identify a scheduler/load balancer that works well with spot instances and other cloud improvements.
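To make the webservice-to-scheduler boundary concrete, here is a minimal in-process sketch that uses a queue and worker threads. It is not a proposal for the scheduler itself, which this phase is meant to identify. The class `InProcessScheduler` and its methods are hypothetical, and a real scheduler/load balancer would distribute work across instances rather than threads.

```python
import queue
import threading
from typing import Any, Dict


class InProcessScheduler:
    """Hypothetical stand-in for whichever scheduler is eventually chosen."""

    def __init__(self, worker_count: int = 2) -> None:
        self._jobs: queue.Queue = queue.Queue()
        self._lock = threading.Lock()
        self.statuses: Dict[str, str] = {}
        # Daemon threads so the sketch does not block interpreter exit.
        for _ in range(worker_count):
            threading.Thread(target=self._run, daemon=True).start()

    def submit(self, job_id: str, instructions: Dict[str, Any]) -> None:
        """Called by the webservice; the job never touches the file system."""
        with self._lock:
            self.statuses[job_id] = "queued"
        self._jobs.put((job_id, instructions))

    def _run(self) -> None:
        while True:
            job_id, instructions = self._jobs.get()
            try:
                with self._lock:
                    self.statuses[job_id] = "running"
                # Real job processing would happen here.
                with self._lock:
                    self.statuses[job_id] = "done"
            finally:
                self._jobs.task_done()

    def wait(self) -> None:
        """Block until every submitted job has been handled."""
        self._jobs.join()
```

Since the scheduler holds the status of each job, a status query can be answered without inspecting files on disk.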
Phase 3: Instruction database table and interface
At this point, we can investigate efficient storage of the instruction data structures, live updates from the scheduled/balanced jobs onto those structures, and how the structures relate to the upcoming job table redesign.
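One possible shape for such a table and its interface is sketched below. It uses Python's built-in `sqlite3` with an in-memory database purely so the example is self-contained; the actual store is expected to be Postgres. The table name `instruction`, its columns, and the helper functions are all assumptions for illustration.

```python
import json
import sqlite3
from typing import Any, Dict, Optional

# Hypothetical schema: one row per job, with the instructions stored as JSON text.
SCHEMA = """
CREATE TABLE IF NOT EXISTS instruction (
    job_id  TEXT PRIMARY KEY,
    payload TEXT NOT NULL,
    status  TEXT NOT NULL DEFAULT 'queued'
)
"""


def open_store() -> sqlite3.Connection:
    """Open an in-memory database and create the table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    return conn


def add_instruction(conn: sqlite3.Connection, job_id: str,
                    payload: Dict[str, Any]) -> None:
    """Store the instructions for a job; the status defaults to 'queued'."""
    conn.execute("INSERT INTO instruction (job_id, payload) VALUES (?, ?)",
                 (job_id, json.dumps(payload)))
    conn.commit()


def set_status(conn: sqlite3.Connection, job_id: str, status: str) -> None:
    """Live update written by a running job."""
    conn.execute("UPDATE instruction SET status = ? WHERE job_id = ?",
                 (status, job_id))
    conn.commit()


def get_status(conn: sqlite3.Connection, job_id: str) -> Optional[str]:
    """State query; returns None when the job is unknown."""
    row = conn.execute("SELECT status FROM instruction WHERE job_id = ?",
                       (job_id,)).fetchone()
    return row[0] if row else None
```

The primary key on `job_id` takes over the uniqueness role that the file name plays today.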
Post-Phase 3: Finish
→ Interfacing with the new Job Status redesign.
→ Testing/fixing issues found in prototype.
→ Integration with main codebase.
...