Kicking around ideas for relational-based code management tools.
table: Routines // subroutines or methods ---------------- RoutineID // auto-gen Name Description ClassName // if applicable ModuleName table: Versions ------------------- VersionID // auto-gen Description IsProduction Notes table: RoutineVersions ---------------------- RoutineRef // "ref" means foreign key VersionRef DeveloperRef SourceCode UnitNotes(Some entities, such as developer and modules, not shown.)
If auto-gen numbers are used, then the issue of what the developer sees in the code comes up. Until divorcing of presentation from meaning is truly achieved, then auto-gen may not be ripe for usage. Either that, some kind of translation layer is needed to work in a file-centric world (QwertySyndrome).
Somewhere there is already a similar topic floating around IIRC, but I could not find it.
"Code management"? What's that? Yet another? The only real way to access (extract) attributes from the code is for the tools to penetrate its structure all the way - as in CASE tools, or 4GLs! Abolish "files", translation units, compiled "objects", binaries, and other such non-code, not-programmer's-concern mess. Code = source! Everything else - messing around with "plain" text, syntactic transformations, "filing" systems - is partial solutions, which is no solution at all but yet more problems.
One tool to rule them all, and transparently bind them! It must do everything: edit, run, translate ("compile") automatically and transparently, "manage" versions and projects, (unit-) test and debug, assert and prove... And HyperProgramming?, while at it. (Because LanguageIsAnOs.)
Relational-model? Whatever for? Tables stink. Foreign keys stink. Humans should stick to what they can do a reasonably good job with: design your data (entity-relationship) model, and let the machine handle the rest.
Maybe try the UML's meta-model: I think it already does that.
Relational stinks? I smelleth a HolyWar. Please clarify in ArgumentsAgainstRelational?. UML is too inconsistent in my opinion. And, they are not necessarily orthogonal: UmlSchema. One still needs a way to organize, codify, and transfer UML info between systems and tools. Nor do I think it is necessarily a case of code versus non-code. See SeparateMeaningFromPresentation.
Schema Alternative B
Here's an alternative schema that emphasizes open-ended categorization and the ability to search and query at the token level. This is a bit of a different goal than mere source code management, but also involves code analysis. Issues of version control and author-ship are not addressed here. This schema may need some minor adjustments for different languages.
functions --------- funcID funcName moduleRef funcText // optional, depending on editing technique modules ------- modID modName modType // ex: "class", "procedural" fileRef files --------- fileID fileName filePath surveyTimeStamp Etc... tokens // usually variables ------- tokenID tokName tokType // [1] Ex: full or partial (such as an object path) funcRef blockType // [2] Ex: loop, SQL (embedded) - optional categories -------- categID catName catNotes categAssoc // Maps categories to entities ----------- categRef entity // Ex: "modules" Ref // F.K. to entity IDNotes:
[1] I generally index both the elements of object paths and the entire path. Thus, "foo.bar.glob" would generate 4 token entries: one for the 3-part path, and 3 for each part. There's probably fancier ways to do such, but I'm trying to keep it fairly simple to serve as an example starter framework here.
[2] Depending on language and shop conventions, sometimes a parser can determine the kind of code-block a token was found in, such as a loop, embedded SQL, function declaration section ("private"), etc.
--top
Schema Alternative C
Here is a hierarchical approach to give a potentially more flexible granularity:
table: source_object -------------- objectID parentRef // F.K. to same table objectName objectType // file, module, class, function, struct, etc. categAssoc // Maps categories to code objects ----------- categRef // (see prior "categories" table) objectRef // F.K. to source_objectAn AttributeTable may be needed for specific object types, or we could have dedicated tables for those against AttributeTable's.
As far as how to syntactically mark programming code regions with aspects, one idea is to use a markup-based language, similar to ColdFusionMarkupLanguage?, for one's imperative programming. Although markup can be wordy, it does offer the ability to easily mark aspects. Example:
<if flag==7 aspects="foo,bar"> <!-- block marking --> <set x=4> <set y=77 aspects="glob,smerg,grib"> <call myRoutine(x,y) aspects="smiff"> <call name="myRoutine" param1=x param2=y aspects="smiff"> <!-- alternative to above --> </if>A dedicated "aspect" tag can be used to group statements that don't naturally fall into another block.
See also: RunTimeEngineSchema, SeparationAndGroupingAreArchaicConcepts, TableOrientedCodeManagementDiscussion